Making a Robust and Useful Earth Science Ontology Repository: Creating a test suite, automating ontology uploading, and standardizing RESTful API
Booma is a PhD student in Computer Science at the University of Illinois at Chicago. She holds an M.S. in Software Systems from Birla Institute of Technology and Science, India. Before stepping into her PhD, she worked as a Software Engineering Senior Analyst for a leading software service provider. Her research interests include Semantic Web, Information Retrieval and Data Mining. She is an avid reader and enjoys science fiction.
The Earth Science Ontology Repository (ESOR) portal contains many vocabularies that are important and useful for sharing earth science data. It serves a function similar to that of BioPortal, which contains a collection of ontologies that support biology, health, and life sciences research, but with a focus on the Earth Science domain. DataONE will benefit from a collection of ontologies with well-defined terms used in earth science data, so that such data can be integrated correctly and consistently and search services can be enhanced. Search over the Earth Science Ontology Repository is “smart” in that its implementation is not based only on keyword search; semantic techniques are also involved so that the search functionality can actually “understand” the meaning of terms. ESOR can be used as the backend knowledge base for multiple applications, for example semi-automatic or automatic entity matching.
To turn the Earth Science Ontology Repository into a product, we need to create unit tests: a separate, standalone testing capability built with JUnit, so that we can be confident the repository handles its different use cases in different situations. This test suite will allow updates to be tested automatically, so that the repository can grow with minimal human effort while a level of consistency is guaranteed.
For this repository to be sustainable, we also need simple and automatic (or at least semi-automatic) methods for enhancing the content. Right now, deploying a new ontology to the Earth Science Ontology Repository involves 14 manual steps (see the details of the 14 steps here). To speed up the process, it is necessary to explore possibilities for automation. In this project, robust automatic upload processes will be designed, tested, and deployed.
Meanwhile, for the Earth Science Ontology Repository to be more broadly reused, it needs to conform to the principles of RESTful services, and its API needs documentation that developers can easily understand. The summer intern will work with the postdoctoral fellow who has created the ontology and a professor who is a leading expert in ontology environments to complete the project.