Needs and expectations of DataONE tools:
Jean is wary of going exclusively digital with his phenotype data because of the horror stories he’s heard from other colleagues who have lost lots of work. However, he does transcribe data from paper to an Excel sheet. He keeps the paper copy and sometimes refers back to it to jog his memory. He uses basic visualizations within Excel to verify the accuracy of data transfer and to correct (or verify) any outliers; if these functions were easier to perform using DataONE tools, he might be convinced that digitizing his data for deposition at a DataONE member node is worthwhile.
He does have concerns about the long-term preservation of his data, as there is currently no formal process in place for long-term data management. He might be interested in depositing data at a member node for preservation (e.g., migration of data formats), though only if doing so were as easy as (or at least, not much harder than) maintaining local backups.
Intellectual and physical skills that can be applied:
Jean is quite focused on his own research and has not historically involved many colleagues in collaborative work outside of his particular area of specialization. As such, he does not see the rationale for common data management protocols and believes that his data are only likely to be of interest to a very select number of researchers, most of whom he knows personally. That said, he is interested in being able to perform longitudinal and synthetic analyses of his own work, something which is currently impossible due to the shifting standards applied to genomic data. This issue is interesting enough to Jean that he would be likely to contribute his expertise and sample data for the purpose of developing ontologies that actually meet his needs and could be supported for subsequent use in DataONE.
Technical support available:
Jean funds good technical support within his research group. He knows that data management and archiving are becoming more important issues for his field, and he is willing to devote resources to doing a better job of them, despite his concerns about their ultimate utility to his own work.
Personal biases about data sharing and reuse (and data management more generally):
Jean does not normally share his pedigree book because it would not make sense to others, but he freely distributes seeds to colleagues who ask for them. He considers these seeds to be data. When he receives seeds from others, he “vets” the data by germinating the seeds and confirming the phenotype. He has hired a web developer to help visualize some of the collected data.
He is willing to share the assembled genomes immediately and thinks others should do the same. The transcriptome data are used to answer a biological question and thus are more sensitive. He would be willing to share the raw transcriptome data after publication, but does not want to be scooped in publications or proposals.
Repositories exist for genome data (e.g., GenBank), but not for raw phenotypic data or raw sequence reads. Jean uses standard gene nomenclature to describe mutants, but feels unqualified to handle metadata.