Enabling Data Annotation: Integrating User Management into the DataONE Metadata Environment

Kate Chastain
Katie just finished her second year in the Information Technology and Web Science Master's program at Rensselaer Polytechnic Institute in upstate New York. Having grown up in northern Virginia, just outside Washington, DC, she had seen snow before this winter, but New York was still quite a change in weather from St. Petersburg, Florida, where she completed her undergraduate work. There, at Eckerd College, she earned a degree in Computer Science, as well as in East Asian Studies and Modern Languages. In her free time, she enjoys good science fiction, bad puns, watching hockey and baseball, and embarking on cooking adventures.

Project Description: 

This project aims to make data ingest and annotation easy for a wide range of users. A Semantic Annotator was prototyped last summer to show how this could work in the DataONE environment. That tool enables earth and environmental scientists to annotate their data and link it to relevant ontology concepts. However, users have to complete the whole process in one continuous session, and there are no security policies to protect their privacy. One way to address both problems is to integrate user management into the Semantic Annotator so that annotation results are preserved for re-use between sessions. This will let users annotate data whenever it suits them and allow multiple people to collaborate on the task. It also preserves provenance, so that the system can maintain a record of who made updates and when. The scope of this task includes:
<li>enabling user account functionality, so that users must register before they are permitted to use the application. Usage history data will be stored for later reuse.
<li>enabling the loading of user-specified enhancement parameters that can be applied to existing datasets, reducing repetitive annotation. This will be especially useful when processing datasets with the same or similar metadata.
<li>enabling "semantic palette"/"my favorites" facets that preserve each user's frequently used classes or properties for re-use between sessions, making annotation more efficient.
<li>implementing data access/security, so that users can control who sees their data.
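The scope items above can be pictured with a small data model. The following is only an illustrative sketch and not the Semantic Annotator's actual design: the class names (`UserAccount`, `Project`, `Annotation`), their fields, and the example concept identifiers are all hypothetical, and a real implementation would add authentication and persistent storage.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Annotation:
    column: str        # dataset column being annotated
    concept: str       # ontology concept it is linked to
    author: str        # provenance: who made the update
    made_at: datetime  # provenance: when it was made


@dataclass
class UserAccount:
    """Hypothetical registered user with a stored usage history."""
    name: str
    usage: Counter = field(default_factory=Counter)

    def record_use(self, concept):
        self.usage[concept] += 1

    def favorites(self, n=5):
        # "My favorites": the n most frequently used concepts.
        return [concept for concept, _ in self.usage.most_common(n)]


@dataclass
class Project:
    """Hypothetical saved annotation session that collaborators can share."""
    owner: str
    collaborators: set = field(default_factory=set)
    annotations: list = field(default_factory=list)

    def can_access(self, user_name):
        return user_name == self.owner or user_name in self.collaborators

    def annotate(self, user, column, concept):
        # Access control: only the owner and invited collaborators may edit.
        if not self.can_access(user.name):
            raise PermissionError(f"{user.name} may not edit this project")
        record = Annotation(column, concept, user.name,
                            datetime.now(timezone.utc))
        self.annotations.append(record)
        user.record_use(concept)
        return record


# Example use with made-up column and concept names.
alice = UserAccount("alice")
project = Project(owner="alice", collaborators={"bob"})
project.annotate(alice, "temp_c", "ex:Temperature")
```

Because each `Annotation` carries its author and timestamp, and the project outlives any single session, work could be resumed later or picked up by a collaborator while the record of who changed what is kept.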

Primary Mentor: 
Deborah McGuinness
Secondary Mentor: 
Xixi Luo