|Title||Type of Publication||Year of Publication||Authors||Journal Title||Abstract||DOI||Issue||Pagination||Volume|
|Data Discovery||Book Chapter||2018||W.K. Michener||Ecological Informatics. Data Management and Knowledge Discovery||10.1007/978-3-319-59928-1||115-128|
|Quality assurance and quality control (QA/QC)||Book Chapter||2018||W.K. Michener||Ecological Informatics. Data Management and Knowledge Discovery||10.1007/978-3-319-59928-1||55-70|
|Creating and managing metadata||Book Chapter||2018||W.K. Michener||Ecological Informatics. Data Management and Knowledge Discovery||10.1007/978-3-319-59928-1||71-88|
|Communicating and disseminating research findings||Book Chapter||2018||A.E. Budden; W.K. Michener||Ecological Informatics. Data Management and Knowledge Discovery||10.1007/978-3-319-59928-1||289-317|
|Data integration: principles and practice||Book Chapter||2018||M. Schildhauer||Ecological Informatics. Data Management and Knowledge Discovery||10.1007/978-3-319-59928-1||129-157|
|A Science Products Inventory for Citizen-Science Planning and Evaluation||Journal Article||2018||A. Wiggins; R. Bonney; G. LeBuhn; J.K. Parrish; J.F. Weltzin||BioScience||10.1093/bioscience/biy028||biy028|
|Eleven quick tips for finding research data||Journal Article||2018||K. Gregory; S.J. Khalsa; W.K. Michener; F.E. Psomopoulos; A. de Waard; M. Wu||PLoS Comput Biol||10.1371/journal.pcbi.1006038||4||14|
|The Bari Manifesto: An interoperability framework for essential biodiversity variables||Journal Article||2018||A.R. Hardisty; W.K. Michener; D. Agosti; E.Alonso García; L. Bastin; L. Belbin; A. Bowser; P.Luigi Buttigieg; D.A.L. Canhos; W. Egloff; R. De Giovanni; R. Figueira; Q. Groom; R.P. Guralnick; D. Hobern; W. Hugo; D. Koureas; J. Liqiang; W. Los; J. Manuel; D. Manset; J. Poelen; H. Saarenmaa; D. Schigel; P.F. Uhlir; D. Kissling||Ecological Informatics|
Essential Biodiversity Variables (EBV) are fundamental variables that can be used for assessing biodiversity change over time, for determining adherence to biodiversity policy, for monitoring progress towards sustainable development goals, and for tracking biodiversity responses to disturbances and management interventions. Data from observations or models that provide measured or estimated EBV values, which we refer to as EBV data products, can help to capture the above processes and trends and can serve as a coherent framework for documenting trends in biodiversity. Using primary biodiversity records and other raw data as sources to produce EBV data products depends on cooperation and interoperability among multiple stakeholders, including those collecting and mobilising data for EBVs and those producing, publishing and preserving EBV data products. Here, we encapsulate ten principles for the current best practice in EBV-focused biodiversity informatics as ‘The Bari Manifesto’, serving as implementation guidelines for data and research infrastructure providers to support the emerging EBV operational framework based on trans-national and cross-infrastructure scientific workflows. The principles provide guidance on how to contribute towards the production of EBV data products that are globally oriented, while remaining appropriate to the producer's own mission, vision and goals. These ten principles cover: data management planning; data structure; metadata; services; data quality; workflows; provenance; ontologies/vocabularies; data preservation; and accessibility. For each principle, desired outcomes and goals have been formulated. Some specific actions related to fulfilling the Bari Manifesto principles are highlighted in the context of each of four groups of organizations contributing to enabling data interoperability - data standards bodies, research data infrastructures, the pertinent research communities, and funders. 
The Bari Manifesto provides a roadmap enabling support for routine generation of EBV data products, and increases the likelihood of success for a global EBV framework.
|Facilitating and Improving Environmental Research Data Repository Interoperability||Journal Article||2018||C. Gries; A. Budden; C. Laney; M. O'Brien; M. Servilla; W. Sheldon; K. Vanderbilt; D. Vieglais||Data Science Journal||10.5334/dsj-2018-022||17|
|Research Data Sharing: Practices and Attitudes of Geophysicists||Journal Article||2018||C. Tenopir; L. Christian; S. Allard; J. Borycz||Earth and Space Science|
Open data policies have been introduced by governments, funders, and publishers over the past decade. Previous research showed a growing recognition by scientists of the benefits of data-sharing and reuse, but actual practices lag and are not always compliant with new regulations. The goal of this study is to investigate motives, attitudes, and data practices of the community of earth and planetary geophysicists, a discipline believed to have accepting attitudes towards data sharing and reuse. A better understanding of the attitudes and current data-sharing practices of this scientific community could enable funders, publishers, data managers, and librarians to design systems and services that help scientists understand and adhere to mandates and to create practices, tools, and services that are scientist-focused. An online survey was distributed to the members of the American Geophysical Union (AGU), producing 1372 responses from 116 countries. The attitudes of researchers to data sharing and reuse were generally positive, but in practice scientists had concerns about sharing their own research data. These concerns include the possibility of potential data misuse and the need for assurance of proper citation and acknowledgement. Training and assistance in good data management practices are lacking in many scientific fields and might help to alleviate these doubts.
|Using Peer Review to Support Development of Community Resources for Research Data Management||Journal Article||2017||H. Soyka; A. Budden; V. Hutchison; D. Bloom; J. Duckles; A. Hodge; M.S. Mayernik; T. Poisot; S. Rauch; G. Steinhart; L. Wasser; A.L. Whitmire; S. Wright||Journal of eScience Librarianship||https://doi.org/10.7191/jeslib.2017.1114||2||6|
|The influence of community recommendations on metadata completeness||Journal Article||2017||S. Gordon; T. Habermann||Ecological Informatics|
Many communities use standard, structured documentation that is machine-readable, i.e. metadata, to make discovery, access, use, and understanding of scientific datasets possible. Organizations and communities have also developed recommendations for metadata content that is required or suggested for their data developers and users. These recommendations are typically specific to metadata representations (dialects) used by the community. By considering the conceptual content of the recommendations, quantitative analysis and comparison of the completeness of multiple metadata dialects becomes possible. This is a study of completeness of EML and CSDGM metadata records from DataONE in terms of the LTER recommendation for Completeness. The goal of the study is to quantitatively measure completeness of metadata records and to determine if metadata developed by LTER is more complete with respect to the recommendation than other collections in EML and in CSDGM. We conclude that the LTER records are broadly more complete than the other EML collections, but similar in completeness to the CSDGM collections.
|Attitudes and norms affecting scientists’ data reuse||Journal Article||2017||R.Gonçalves Curty; K. Crowston; A. Specht; B.W. Grant; E.D. Dalton||PLOS ONE|
The value of sharing scientific research data is widely appreciated, but factors that hinder or prompt the reuse of data remain poorly understood. Using the Theory of Reasoned Action, we test the relationship between the beliefs and attitudes of scientists towards data reuse, and their self-reported data reuse behaviour. To do so, we used existing responses to selected questions from a worldwide survey of scientists developed and administered by the DataONE Usability and Assessment Working Group (thus practicing data reuse ourselves). Results show that the perceived efficacy and efficiency of data reuse are strong predictors of reuse behaviour, and that the perceived importance of data reuse corresponds to greater reuse. Expressed lack of trust in existing data and perceived norms against data reuse were not found to be major impediments for reuse contrary to our expectations. We found that reported use of models and remotely-sensed data was associated with greater reuse. The results suggest that data reuse would be encouraged and normalized by demonstration of its value. We offer some theoretical and practical suggestions that could help to legitimize investment and policies in favor of data sharing.
|DataONE: A Data Federation with Provenance Support||Book Chapter||2016||Y. Cao; C. Jones; V. Cuevas-Vicenttín; M.B. Jones; B. Ludäscher; T. McPhillips; P. Missier; C. Schwalm; P. Slaughter; D. Vieglais; L. Walker; Y. Wei||Provenance and Annotation of Data and Processes: 6th International Provenance and Annotation Workshop, IPAW 2016, McLean, VA, USA, June 7-8, 2016, Proceedings||230 - 234|
|Climate and Sustainability: Dominant Visual Frames in Climate Change News Stories: Implications for Formative Evaluation in Climate Change Campaigns||Journal Article||2016||S. Rebich-Hespanha; R.E. Rice||International Journal of Communication||10|
|Understanding Scientific Data Sharing Outside of the Academy||Conference Paper||2016||D. Pollock||Proceedings of the 79th ASIS&T Annual Meeting: Creating Knowledge, Enhancing Lives Through Information & Technology|
|Research Data Services in European and North American Libraries: Current Offerings and Plans for the Future||Conference Paper||2016||C. Tenopir; D. Pollock; S. Allard; D. Hughes||Proceedings of the 79th ASIS&T Annual Meeting: Creating Knowledge, Enhancing Lives Through Information & Technology|
|Computational provenance: DataONE and implications for cultural heritage institutions||Conference Paper||2016||R.J. Sandusky||2016 IEEE International Conference on Big Data (Big Data)||10.1109/BigData.2016.7840984|
|Provenance Storage, Querying, and Visualization in PBase||Book Chapter||2015||V. Cuevas-Vicenttín; P. Kianmajd; B. Ludäscher; P. Missier; F. Chirigati; Y. Wei; D. Koop; S. Dey||Provenance and Annotation of Data and Processes||10.1007/978-3-319-16462-5||239-241|
|Provenance-Based Searching and Ranking for Scientific Workflows||Book Chapter||2015||V. Cuevas-Vicenttín; B. Ludäscher; P. Missier||Provenance and Annotation of Data and Processes||10.1007/978-3-319-16462-5_17||209-214|
|Make Data Count - Unit 1 Final Report||Journal Article||2015||P.L.O.S. ALM; C. Strasser; J. Kratz; J. Lin|
|Perceived discontinuities and continuities in transdisciplinary scientific working groups||Journal Article||2015||K. Crowston; A. Specht; C. Hoover; K.M. Chudoba; M.Beth Watson-Manheim||Science of The Total Environment||10.1016/j.scitotenv.2015.04.121|
|Ecological data sharing||Journal Article||2015||W.K. Michener||Journal of Ecological Informatics||doi:10.1016/j.ecoinf.2015.06.010||29|
|Changes in Data Sharing and Data Reuse Practices and Perceptions among Scientists Worldwide||Journal Article||2015||C. Tenopir; E.D. Dalton; S. Allard; M. Frame; I. Pjesivac; B. Birch; D. Pollock; K. Dorsett||PLoS ONE||10.1371/journal.pone.0134826||8||10|
|CitSci.org: A New Model for Managing, Documenting, and Sharing Citizen Science Data||Journal Article||2015||Y. Wang; N. Kaplan; G. Newman; R. Scarpino||PLoS Biol|
This Community Page proposes a platform to support effective metadata documentation with a view to improving the discoverability and reusability of citizen-gathered science data.
|“Personas” to Support Development of Cyberinfrastructure for Scientific Data Sharing.||Journal Article||2015||K. Crowston||Journal of eScience Librarianship||10.7191/jeslib.2015.1082||2||4|
|Ten Simple Rules for Creating a Good Data Management Plan||Journal Article||2015||W.K. Michener||PLoS Comput Biol||10.1371/journal.pcbi.1004525||10||11|
|Research Data Services in Academic Libraries: Data Intensive Roles for the Future?||Journal Article||2015||C. Tenopir; D. Hughes; S. Allard; M. Frame; B. Birch; L. Baird; R. Sandusky; M. Langseth; A. Lundeen||Journal of eScience Librarianship||10.7191/jeslib.2015.1085||2||4|
|YesWorkflow: A User-Oriented, Language-Independent Tool for Recovering Workflow Information from Scripts||Journal Article||2015||T. McPhillips; T. Song; T. Kolisnik; S. Aulenbach; K. Belhajjame; K. Bocinsky; Y. Cao; J. Cheney; F. Chirigati; S. Dey; J. Freire; C. Jones; J. Hanken; K.W. Kintigh; T.A. Kohler; D. Koop; J.A. Macklin; P. Missier; M. Schildhauer; C. Schwalm; Y. Wei; M. Bieda; B. Ludäscher||International Journal of Digital Curation||298–313||10|
|Correction: CitSci.org: A New Model for Managing, Documenting, and Sharing Citizen Science Data||Journal Article||2015||Y. Wang; N. Kaplan; G. Newman; R. Scarpino||PLoS Biology||10.1371/journal.pbio.1002343||13|
|Making data count||Journal Article||2015||J.E. Kratz; C. Strasser||Scientific Data||10.1038/sdata.2015.39||150039 -||2|
|The Tao of Open Science for Ecology||Journal Article||2015||S.E. Hampton; S. Anderson; S.C. Bagby; C. Gries; X. Han; E. Hart; M.B. Jones; C. Lenhardt; A. MacDonald; W. Michener; J.F. Mudge; P. A; M. Schildhauer; K.H. Woo; N. Zimmerman||Ecosphere||http://dx.doi.org/10.1890/ES14-00402.1||7||6|
|Computing Location-Based Lineage from Workflow Specifications to Optimize Provenance Queries||Book Chapter||2015||S. Dey; S. Köhler; S. Bowers; B. Ludäscher||Provenance and Annotation of Data and Processes||10.1007/978-3-319-16462-5_14||180-193||8628|
|The Backstage Work of Data Sharing||Conference Paper||2014||K. Kervin; R.B. Cook; W.K. Michener||Proceedings of the 18th International Conference on Supporting Group Work||10.1145/2660398.2660406|
|Realizing the Value of a National Asset: Scientific Data||Journal Article||2014||A. Wilson; R.R. Downs; C. Lenhardt; C. Meyer; W. Michener; H. Ramapriyan; E. Robinson||10.1002/2014EO500006||477–478||95|
|Managing scientific data as public assets: Data sharing practices and policies among full-time government employees||Journal Article||2014||K. Douglass; S. Allard; C. Tenopir; L. Wu; M. Frame||10.1002/asi.22988||251–262||65|
|SWEET ontology coverage for earth system sciences||Journal Article||2014||N. DiGiuseppe; L.C. Pouchard; N.F. Noy||10.1007/s12145-013-0143-1||1-16|
|Research data management services in academic research libraries and perceptions of librarians||Journal Article||2014||C. Tenopir; R.J. Sandusky; S. Allard; B. Birch||Library & Information Science Research||doi:10.1016/j.lisr.2013.11.003||2||36|
|Evaluating a Complex Project||Book Chapter||2014||S. Allard||Research Data Management: Practical Strategies for Information Professionals||255|
|Examining data sharing and data reuse in the DataONE environment||Journal Article||2014||A.P. Murillo||Proceedings of the American Society for Information Science and Technology||10.1002/meet.2014.14505101155||1–5||51|
|Data narratives: Increasing scholarly value||Journal Article||2014||L. Pouchard; A. Barton; L. Zilinski||Proceedings of the American Society for Information Science and Technology||10.1002/meet.2014.14505101088||1–4||51|
|The PBase scientific workflow provenance repository||Journal Article||2014||V. Cuevas-Vicenttín; P. Kianmajd; B. Ludäscher; P. Missier; F. Chirigati; Y. Wei; D. Koop; S. Dey||International Journal of Digital Curation||10.2218/ijdc.v9i2.332||28–38||9|
|DMPTool 2: Expanding Functionality for Better Data Management Planning||Journal Article||2014||C. Strasser; S. Abrams; P. Cruse||International Journal of Digital Curation||10.2218/ijdc.v9i1.319||324–330||9|
|Next Steps for Citizen Science||Journal Article||2014||R. Bonney; J.L. Shirk; T.B. Phillips; A. Wiggins; H.L. Ballard; A.J. Miller-Rushing; J.K. Parrish||Science||10.1126/science.1251554||1436–1437||343|
|DMPTool: Guidance and Resources for Your Data Management Plan; https://dmp.cdlib.org/||Journal Article||2014||M. Mallery||Technical Services Quarterly||10.1080/07317131.2014.875394||197-199||31|
|Constructing the Role of School Librarians in the 21st Century Workforce: Implications of NSF-Funded DataONE for K-12 Librarianship||Journal Article||2014||K. Douglass; D. Bilal||iConference 2014 Proceedings|
|DataUp: A tool to help researchers describe and share tabular data||Journal Article||2014||C. Strasser; J. Kunze; S. Abrams; P. Cruse||F1000Research|
Scientific datasets have immeasurable value, but they lose their value over time without proper documentation, long-term storage, and easy discovery and access. Across disciplines as diverse as astronomy, demography, archeology, and ecology, large numbers of small heterogeneous datasets (i.e., the long tail of data) are especially at risk unless they are properly documented, saved, and shared. One unifying factor for many of these at-risk datasets is that they reside in spreadsheets. In response to this need, the California Digital Library (CDL) partnered with Microsoft Research Connections and the Gordon and Betty Moore Foundation to create the DataUp data management tool for Microsoft Excel. Many researchers creating these small, heterogeneous datasets use Excel at some point in their data collection and analysis workflow, so we were interested in developing a data management tool that fits easily into those workflows and minimizes the learning curve for researchers. The DataUp project began in August 2011. We first formally assessed the needs of researchers by conducting surveys and interviews of our target research groups: earth, environmental, and ecological scientists. We found that, on average, researchers had very poor data management practices, were not aware of data centers or metadata standards, and did not understand the benefits of data management or sharing. Based on our survey results, we composed a list of desirable components and requirements and solicited feedback from the community to prioritize potential features of the DataUp tool. These requirements were then relayed to the software developers, and DataUp was successfully launched in October 2012.
|SemantEco: A semantically powered modular architecture for integrating distributed environmental and ecological data||Journal Article||2014||E.W. Patton; P. Seyed; P. Wang; L. Fu; J. Dein; S. Bristol; D.L. McGuinness||Future Generation Computer Systems||http://dx.doi.org/10.1016/j.future.2013.09.017||430 - 440||36|
We aim to inform the development of decision support tools for resource managers who need to examine large complex ecosystems and make recommendations in the face of many tradeoffs and conflicting drivers. We take a semantic technology approach, leveraging background ontologies and the growing body of linked open data. In previous work, we designed and implemented a semantically enabled environmental monitoring framework called SemantEco and used it to build a water quality portal named SemantAqua. Our previous system included foundational ontologies to support environmental regulation violations and relevant human health effects. In this work, we discuss SemantEco’s new architecture that supports modular extensions and makes it easier to support additional domains. Our enhanced framework includes foundational ontologies to support modeling of wildlife observation and wildlife health impacts, thereby enabling deeper and broader support for more holistically examining the effects of environmental pollution on ecosystems. We conclude with a discussion of how, through the application of semantic technologies, modular designs will make it easier for resource managers to bring in new sources of data to support more complex use cases.
|UV-CDAT: Analyzing Climate Datasets from a User's Perspective||Magazine Article||2013||E. Santos||Computing in Science and Engineering||1||94 - 103||15|
|Ontological Empowerment: Sustainability via Ownership||Conference Paper||2013||J. Greenberg; A. Murillo; J.K. Kunze|
Positive impacts associated with urban housing/home ownership programs motivate us to study this topic in relation to ontologies. This paper reviews ontological dependence and presents early work underway in the DataONE Preservation and Metadata Working Group (PAMWG) to collectively leverage existing metadata schemes and ontologies. The paper introduces a high-level set of functional requirements and the Stack Overflow model that may be used to detect highly rated metadata or ontological properties to form a loose canon for describing scientific data. The long-term goal is to establish community identity and rhythm supporting a sustainable ontology/metadata-driven workflow.
|Big data and the future of ecology||Journal Article||2013||S.E. Hampton; C.A. Strasser; J.J. Tewksbury; W.K. Gram; A.E. Budden; A.L. Batcheller; C.S. Duke; J.H. Porter||10.1890/120103||3||156 - 162||11|
|A Linked Science investigation: enhancing climate change data discovery with semantic technologies||Journal||2013||L.C. Pouchard; M.L. Branstetter; R. Cook; R. Devarakonda; J. Green; G. Palanisamy; P. Alexander; N.F. Noy||Earth Science Informatics|
Linked Science is the practice of inter-connecting scientific assets by publishing, sharing and linking scientific data and processes in end-to-end loosely coupled workflows that allow the sharing and re-use of scientific data. Much of this data does not live in the cloud or on the Web, but rather in multi-institutional data centers that provide tools and add value through quality assurance, validation, curation, dissemination, and analysis of the data. In this paper, we make the case for the use of scientific scenarios in Linked Science. We propose a scenario in river-channel transport that requires biogeochemical experimental data and global climate-simulation model data from many sources. We focus on the use of ontologies—formal machine-readable descriptions of the domain—to facilitate search and discovery of this data. Mercury, developed at Oak Ridge National Laboratory, is a tool for distributed metadata harvesting, search and retrieval. Mercury currently provides uniform access to more than 100,000 metadata records; 30,000 scientists use it each month. We augmented search in Mercury with ontologies, such as the ontologies in the Semantic Web for Earth and Environmental Terminology (SWEET) collection by prototyping a component that provides access to the ontology terms from Mercury. We evaluate the coverage of SWEET for the ORNL Distributed Active Archive Center (ORNL DAAC).
|Academic librarians and research data services: Preparation and attitudes||Journal Article||2013||C. Tenopir; R.J. Sandusky; S. Allard; B. Birch||IFLA Journal||http://dx.doi.org/10.1177/0340035212473089||1||39|
|NSF DataNet: Curating Scientific Data||Journal Article||2013||J. Kunze; S. Choudhury|
|Automatic Tag Recommendation for Metadata Annotation Using Probabilistic Topic Modeling||Conference Paper||2013||S. Tuarob; L.C. Pouchard; L. Giles||Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries||10.1145/2467696.2467706|
|Data reuse and scholarly reward: understanding practice and building infrastructure||Journal Article||2013||T.J. Vision; H.A. Piwowar||PeerJ PrePrints|
|D-PROV: extending the PROV provenance model with workflow structure||Conference Paper||2013||P. Missier; S.C. Dey; K. Belhajjame; V. Cuevas-Vicenttín; B. Ludäscher||TaPP|
|The DMPTool and DataUp: Helping Researchers Manage, Archive, and Share their Data||Conference Paper||2013||C. Strasser; P. Cruse||Research Data Management Implementations Workshop|
|Participatory design of DataONE—Enabling cyberinfrastructure for the biological and environmental sciences||Journal Article||2012||W.K. Michener; S. Allard; A.E. Budden; R. Cook; K. Douglass; M. Frame; S. Kelling; R.J. Koskela; C. Tenopir; D.A. Vieglais||Ecological Informatics||10.1016/j.ecoinf.2011.08.007||Sep 2012||11|
|Data-intensive science applied to broad-scale citizen science||Journal Article||2012||W.M. Hochachka; D. Fink; R.A. Hutchinson; D. Sheldon; W.K. Wong; S. Kelling||Trends in Ecology & Evolution||10.1016/j.tree.2011.11.006|
|Ecoinformatics: supporting ecology as a data-intensive science||Journal||2012||W.K. Michener; M.B. Jones||Trends in Ecology & Evolution|
Ecology is evolving rapidly and increasingly changing into a more open, accountable, interdisciplinary, collaborative and data-intensive science. Discovering, integrating and analyzing massive amounts of heterogeneous data are central to ecology as researchers address complex questions at scales from the gene to the biosphere. Ecoinformatics offers tools and approaches for managing ecological data and transforming the data into information and knowledge. Here, we review the state-of-the-art and recent advances in ecoinformatics that can benefit ecologists and environmental scientists as they tackle increasingly challenging questions that require voluminous amounts of data across disciplines and scales of space and time. We also highlight the challenges and opportunities that remain.
|DataONE: Facilitating eScience through Collaboration||Journal Article||2012||S. Allard||Journal of eScience Librarianship|
Objective: To introduce DataONE, a multiinstitutional, multinational, and interdisciplinary collaboration that is developing the cyberinfrastructure and organizational structure to support the full information lifecycle of biological, ecological,
|Ecological data in the Information Age||Journal Article||2012||S.E. Hampton; J.J. Tewksbury; C.A. Strasser||Frontiers in Ecology and the Environment||10.1890/1540-9295-10.2.59||2||59 - 59||10|
|Golden Trail: Retrieving the Data History that Matters from a Comprehensive Provenance Repository||Journal Article||2012||P. Missier; B. Ludäscher; S. Dey; M. Wang; T. McPhillips; S. Bowers; M. Agun; I. Altintas||International Journal of Digital Curation||10.2218/ijdc.v7i1.221||1||7|
|DataONE: A Distributed Environmental and Earth Science Data Network Supporting the Full Data Life Cycle||Conference Paper||2012||R. Cook; W.K. Michener; D.A. Vieglais; A.E. Budden; R.J. Koskela||EGU General Assembly Conference Abstracts|
|The future of citizen science: emerging technologies and shifting paradigms||Journal Article||2012||G. Newman; A. Wiggins; A. Crall; E. Graham; S. Newman; K. Crowston||Frontiers in Ecology and the Environment||6||298 - 304||10|
|Exploring the Motive for Data Publication in Open Data Initiative: Linking Intention to Action||Conference Paper||2012||D.S. Sayogo; T.A. Pardo||2012 45th Hawaii International Conference on System Sciences (HICSS)||10.1109/HICSS.2012.271|
|Exploring the determinants of scientific data sharing: Understanding the motivation to publish research data||Journal Article||2012||D.S. Sayogo; T.A. Pardo||Government Information Quarterly||10.1016/j.giq.2012.06.011|
|Trends in Use of Scientific Workflows: Insights from a Public Repository and Recommendations for Best Practice||Journal Article||2012||R. Littauer; K. Ram; B. Ludäscher; W.K. Michener; R.J. Koskela||International Journal of Digital Curation||10.2218/ijdc.v7i2.232||2||7|
|The fractured lab notebook: undergraduates and ecological data management training in the United States||Journal Article||2012||C.A. Strasser; S.E. Hampton||Ecosphere||10.1890/ES12-00139.1||12||art116||3|
|Citizen science comes of age||Journal Article||2012||S. Henderson||Frontiers in Ecology and the Environment||10.1890/1540-9295-10.6.283||6||283 - 283||10|
|The history of public participation in ecological research||Journal Article||2012||A. Miller-Rushing; R. Primack; R. Bonney||Frontiers in Ecology and the Environment||10.1890/110278||6||285 - 290||10|
|The current state of citizen science as a tool for ecological research and public engagement||Journal Article||2012||J.L. Dickinson; J. Shirk; D. Bonter; R. Bonney; R.L. Crain; J. Martin; T. Phillips; K. Purcell||Frontiers in Ecology and the Environment||10.1890/110236||6||291 - 297||10|
|Insects and plants: engaging undergraduates in authentic research through citizen science||Journal Article||2012||K. Oberhauser; G. LeBuhn||Frontiers in Ecology and the Environment||10.1890/110274||6||318 - 320||10|
|From Caprio's lilacs to the USA National Phenology Network||Journal Article||2012||M.D. Schwartz; J.L. Betancourt; J.F. Weltzin||Frontiers in Ecology and the Environment||10.1890/110281||6||324 - 327||10|
|Academic libraries and research data services: Current practices and plans for the future||Report||2012||C. Tenopir; B. Birch; S. Allard||Academic libraries and research data services: Current practices and plans for the future|
|Documenting and Sharing Scientific Research over the Semantic Web||Conference Paper||2012||A. Gándara; N. Villanueva-Rosales||Proceedings of the 12th International Conference on Knowledge Management and Knowledge Technologies||10.1145/2362456.2362480|
|DataUp: Further Development and Community Building||Web Article||2012||P. Cruse; C. Strasser; W. Michener; J. Kunze; D. Vieglais||eScholarship CDL Staff Publications||2012|
|DataONE Member Node Pilot Integration with TeraGrid?||Conference Paper||2011||N.C. Dexter; J.W. Cobb; D.A. Vieglais; M.B. Jones; M. Lowe||Conference Proceedings of the 2011 TeraGrid Conference on Extreme Digital Discovery - TG '11||10.1145/2016741.2016756||1|
|Challenges and Opportunities of Open Data in Ecology||Journal Article||2011||O.J. Reichman; M.B. Jones; M.P. Schildhauer||Science||10.1126/science.1197962||6018||703 - 705||331|
|Data archiving is a good investment||Journal Article||2011||H.A. Piwowar; T.J. Vision; M.C. Whitlock||Nature||10.1038/473285a||7347||285 - 285||473|
|Data Sharing by Scientists: Practices and Perceptions||Journal Article||2011||C. Tenopir; S. Allard; K. Douglass; A.U. Aydinoglu; L. Wu; E. Read; M. Manoff; M. Frame||PLoS ONE||10.1371/journal.pone.0021101||6||6|
|Emergent Filters: Automated Data Verification in a Large-Scale Citizen Science Project||Conference Paper||2011||S. Kelling; J. Yu; J. Gerbracht; W.K. Wong||2011 IEEE Seventh International Conference on e-Science Workshops (eScienceW)||10.1109/eScienceW.2011.13||20 - 27|
|Understanding the Capabilities and Critical Success Factors for Scientific Data Sharing in DataONE Collaborative Network||Conference Paper||2011||D.S. Sayogo; T.A. Pardo||Proceedings of the 12th Annual International Digital Government Research Conference: Digital Government Innovation in Challenging Times||10.1145/2037556.2037568|
|Exploring the determinants of publication of scientific data in open data initiative||Conference Paper||2011||T.A. Pardo; D.S. Sayogo||Proceedings of the 5th International Conference on Theory and Practice of Electronic Governance|
This research provides a preliminary analysis of determinants of the likelihood of researchers to publish their research datasets online. The data is derived from a preliminary survey conducted as part of the DataONE project, an international federated data repository of ecological data. The survey of 1,329 researchers was conducted by the Usability and Assessment Working Group of DataONE from October 2009 to July 2010. The analysis of the data is threefold, namely: visualization of a 2-mode network, descriptive statistics, and ordered logistic regression. The visualization of the affiliated network shows a disconnected access pattern, with the majority of researchers accessing one database and only a few connecting or accessing more than one database. The results of the survey using descriptive and inferential statistics pointed at two key determinants of publishing research datasets online, namely: data management and attribution to the dataset's owner. The importance of data management manifests in two ways: the significance of data management skills and organizational support for data management.
|DataONE: Data Observation Network for Earth - Preserving Data and Enabling Innovation in the Biological and Environmental Sciences||Journal Article||2011||W. Michener; D. Vieglais; T. Vision; J. Kunze; P. Cruse; G. Janée||D-Lib Magazine||3||17|
|A method to track dataset reuse in biomedicine: Filtered GEO accession numbers in PubMed Central||Journal Article||2010||H.A. Piwowar||Proceedings of the American Society for Information Science and Technology|
Reusing research data has important potential benefits: generative science and efficient resource use. Tracking the reuse of research datasets would allow us to understand whether the potential benefits are indeed realized, enable recognition of investigators who produce, annotate, and share useful data, and inform data sharing and reuse initiatives, tools, and policies.
Unfortunately, the lack of clear attribution practices for data makes automated tracking of data reuse difficult. I present a method for tracking research data reuse that takes advantage of the community norms around gene expression microarray data sharing and the rich NCBI Entrez resources. Specifically, the full text of papers stored in PubMed Central is queried for accession numbers of datasets archived in NCBI's Gene Expression Omnibus (GEO) repository. Studies known to have created microarray data are excluded through automated filters and guided manual curation. MeSH terms attached to the data creation and data reuse studies provide additional information for analysis. Finally, I extrapolate the findings to all of PubMed.
Automated portions of this method have been implemented in Python and are openly available. Although imperfect, this dataset is a valuable initial resource for research into patterns of data reuse.
|Quantitatively Evaluating Data Citation and Sharing Policies in the Earth Sciences||Presentation||2010||N. Weber|