As an outcome of the Careers in Biocuration Workshop at the Biocuration 2018 conference, we drafted a generic position description for the biocuration profession. We welcome feedback and comments from the community.
Have you ever been called pedantic or precise? Did you take that as a compliment, not a criticism? If so, we have the job for you!
If you have a background in the biological sciences and are looking for a different career path, consider biocuration.
What is a Biocurator?
A biocurator is a scientist who extracts, interprets, synthesizes and archives data from a variety of sources, including literature, genomes, online materials and author-provided biological data, standardizing those data to make them machine-readable, more discoverable and accessible to the public.
What makes a good Biocurator?
The best biocurators are detail oriented, conscientious, and good communicators. They are adaptable to the needs of the community and/or to the needs of the software systems.
Typical job requirements
- Subject matter expertise – typically a PhD, although not a requirement
- Ability to collaborate and work in a team
- Ability to define and refine rules and standards as data types and user requirements evolve
- Ability to communicate well with computer programmers, bioinformaticians and biologists alike
- Ability to liaise with all stakeholders of the data, from producers/submitters to consumers
- Demonstrated exposure to, understanding of, or experience with natural organisms and how they are related, described and named in the context of the advertised job (e.g. botanical nomenclature for work at a herbarium, cultivar naming for work in a patent office)
- Understanding of the latest technologies for the storage of biological data
- Demonstrated experience in self training and exploring new technologies
- Demonstrated experience in the collection, storage, transformation, standardization, harmonization and analysis of legacy data stored in a variety of formats (CSV, JSON, relational databases such as MySQL, PostgreSQL, SQL Server and Oracle, unstructured text, geospatial databases)
- Demonstrated skills in the use of common data wrangling or Extract-Transform-Load software, programming or scripting languages (e.g. Python, pandas, R)
- Demonstrated experience in the creation of metadata for curated datasets
- An understanding of the life cycle of data and in particular the storage of data for later reuse
- An understanding of various data license types and how they affect the use and reuse of curated data made available to the public and/or researchers
- An understanding of the wide variety of data types (text, binary, genomic, image) and the nuances of handling and storing each
- An understanding of privacy laws and policies relevant to the state and country where the job is situated and international laws relevant to the work, funding source or final repository
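To make the standardization and harmonization work above concrete, here is a minimal sketch of a typical curation step: mapping free-text species names from a legacy CSV export onto a controlled vocabulary and emitting machine-readable JSON. The column names, vocabulary entries and the `UNRESOLVED:` convention are all hypothetical, chosen only for illustration.

```python
# Hypothetical example: harmonize free-text species names from a legacy
# CSV file into a controlled vocabulary, producing machine-readable JSON.
import csv
import io
import json

# Hypothetical controlled vocabulary mapping common name variants
# to a canonical scientific name.
SPECIES_VOCAB = {
    "human": "Homo sapiens",
    "h. sapiens": "Homo sapiens",
    "mouse": "Mus musculus",
    "m. musculus": "Mus musculus",
}

def standardize_records(csv_text):
    """Read legacy CSV rows and map the species column onto the vocabulary.

    Unmapped values are flagged rather than dropped, so a curator can
    review them later instead of silently losing data.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = row["species"].strip().lower()
        records.append({
            "gene": row["gene"].strip(),
            "species": SPECIES_VOCAB.get(raw, "UNRESOLVED:" + row["species"].strip()),
        })
    return records

legacy = "gene,species\nTP53,human\nTrp53,Mouse\nBRCA1,H. sapiens\n"
print(json.dumps(standardize_records(legacy), indent=2))
```

Real curation pipelines operate at a much larger scale and against community ontologies rather than a hand-written dictionary, but the core loop of extract, normalize, flag the unresolvable, and emit a standard format is the same.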
Perks of the job?
(These may vary depending on the position)
- Remote working opportunities
- Positions often have flexible working hours
- Great team opportunities
- Learn new skills
- Stakeholder in developing community standards