Excellence in curation – Advanced Career Award Nominees 2022

Voting will be from 26th July to 25th August 2022

Five biocurators have been nominated for this award. As an ISB member you are invited to vote for one of the nominees described below. If you are an ISB member and did not receive an invite, please send an email to: isb@biocurator.org.

The winner of the Advanced Career Award will be awarded a prize of 500 CHF and will give a 15-minute talk at a virtual ISB conference. In addition, they will have agreed to have their name, bio and photograph included on the ISB website, newsletter (circulated via the ISB distribution list) and Twitter account.

Antonia Lock, European Bioinformatics Institute, Hinxton, Cambridge, UK.

Antonia worked at PomBase between 2011 and 2020, producing a staggering volume of high-quality curation (30,000 annotations / 1500 publications), and still holds the ‘curation record’. At PomBase Antonia took the lead on a wide range of projects, including i) hosting and standardisation of browser dataset metadata; ii) mapping disease annotations to MONDO and extending this dataset; iii) evaluating the human “unknown” protein component for PMID:30938578; iv) lead authorship of a publication highlighting incentives and nudges to increase community participation (PMID:32353878); v) 2020 Database paper PMID:32353878; vi) a popular review PMID:29761456 (3350 accesses).

She did a substantial amount of training, including numerous PomBase and fungal workshops (international), graduate workshops, and project supervisions (at UofC and UCL). Most of this was achieved whilst working part-time (alongside another curation post at Healx).

Antonia moved to UniProt full-time in 2020 and is a UniProt curator and GOA coordinator. In this role, she is responsible for quality control of UniProt GO annotation and has produced useful training materials for Gene Ontology use: https://www.ebi.ac.uk/training/events/exploring-gene-ontology-go-annotations-tools-and-resources/

Antonia has no ego and is not interested in recognition. She simply plans her work around what is most useful to enable people to do science effectively. She does not take the well-trodden (easy) path but seeks out literature to curate emerging models and pathogens, novel genes, and novel data, ideally for knowledge that will be informative for other species and that would often lie under the radar.

  • Lock A, Harris MA, Rutherford K, Hayles J, Wood V. Community curation in PomBase: enabling fission yeast experts to provide detailed, standardized, sharable annotation from research publications. Database (Oxford). 2020 Jan 1;2020:baaa028. doi: 10.1093/database/baaa028. PMID: 32353878
  • Wood V, Lock A, Harris MA, Rutherford K, Bähler J, Oliver SG. Hidden in plain sight: what remains to be discovered in the eukaryotic proteome? Open Biol. 2019 Feb 28;9(2):180241. doi: 10.1098/rsob.180241. PMID: 30938578
  • Lock A, Rutherford K, Harris MA, Hayles J, Oliver SG, Bähler J, Wood V. PomBase 2018: user-driven reimplementation of the fission yeast database provides rapid and intuitive access to diverse, interconnected information. Nucleic Acids Res. 2019 Jan 8;47(D1):D821-D827. doi: 10.1093/nar/gky961. PMID: 30321395
  • Lock A, Rutherford K, Harris MA, Wood V. PomBase: The Scientific Resource for Fission Yeast. Methods Mol Biol. 2018;1757:49-68. doi: 10.1007/978-1-4939-7737-6_4. PMID: 29761456
  • Wood V, Carbon S, Harris MA, Lock A, Engel SR, Hill DP, Van Auken K, Attrill H, Feuermann M, Gaudet P, Lovering RC, Poux S, Rutherford KM, Mungall CJ. Term Matrix: a novel Gene Ontology annotation quality control system based on ontology term co-annotation patterns. Open Biol. 2020 Sep;10(9):200149. doi: 10.1098/rsob.200149. Epub 2020 Sep 2. PMID: 32875947

Lisa Harper, USDA-ARS, Corn Insects and Crop Genetics Research, Ames, IA, USA.

We are nominating Dr. Lisa Harper for raising the biocuration of agricultural genetics, genomics, and breeding data to a higher standard, through practice and outreach.

Dr. Harper is a skilled geneticist, curator, and outreach coordinator for the Maize Genetics and Genomics Database. Dr. Harper understands the importance of collaboration and has made it a large part of her career, bridging scientists from diverse disciplines and species. Dr. Harper’s work includes outreach and training to encourage community-level willingness to share data and ideas, and support for tools and resources to facilitate this spirit of collaboration. In 2014, Dr. Harper founded the AgBioData consortium to provide a platform for collaboration, communication, and promotion of best practices for agricultural genomics, genetics, and breeding databases. Before Dr. Harper’s efforts, the 30+ databases in this field operated largely in isolation, with little communication or collaboration between them. Dr. Harper’s vision, her deep knowledge of the area of biocuration, and her effective and engaging interpersonal skills were instrumental in motivating key players of this community, spanning plants and animals, to enthusiastically collaborate via AgBioData. As a result, the AgBioData community identified key challenges and opportunities in multiple biocuration-related fields (Harper et al. 2018). Thanks to her efforts, AgBioData is now an NSF-funded, expanded network identifying and addressing these key challenges in the field of agricultural biocuration.

In summary, Dr. Lisa Harper’s diverse contributions have made a lasting impact on the field of biocuration, particularly in the agricultural domain, and deserve recognition.

  • Lisa Harper, Jacqueline Campbell, Ethalinda K S Cannon, Sook Jung, Monica Poelchau, Ramona Walls, Carson Andorf, Elizabeth Arnaud, Tanya Z Berardini, Clayton Birkett, Steve Cannon, James Carson, Bradford Condon, Laurel Cooper, Nathan Dunn, Christine G Elsik, Andrew Farmer, Stephen P Ficklin, David Grant, Emily Grau, Nic Herndon, Zhi-Liang Hu, Jodi Humann, Pankaj Jaiswal, Clement Jonquet, Marie-Angélique Laporte, Pierre Larmande, Gerard Lazo, Fiona McCarthy, Naama Menda, Christopher J Mungall, Monica C Munoz-Torres, Sushma Naithani, Rex Nelson, Daureen Nesdill, Carissa Park, James Reecy, Leonore Reiser, Lacey-Anne Sanderson, Taner Z Sen, Margaret Staton, Sabarinath Subramaniam, Marcela Karey Tello-Ruiz, Victor Unda, Deepak Unni, Liya Wang, Doreen Ware, Jill Wegrzyn, Jason Williams, Margaret Woodhouse, Jing Yu, Doreen Main, AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture, Database, Volume 2018, 2018, bay088, https://doi.org/10.1093/database/bay088

Sabrina Toro, Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.

Sabrina Toro is a developmental biologist with nearly 10 years of experience as a biocurator. As a curator for the Zebrafish Information Network (ZFIN) from 2013 to 2021, she curated zebrafish data in approximately 3,000 publications, including allele and variant, gene expression, phenotype, and disease information. Sabrina participated in and led projects to ensure data curation quality and harmonization for ZFIN, the Gene Ontology Consortium, and the Alliance of Genome Resources, where she led the variant harmonization efforts.

Sabrina joined the Translational and Integrative Sciences Lab (TISLab) at the University of Colorado in 2021. She is a primary curator of the Mondo Disease Ontology, and co-created and leads the development of the Vertebrate Breed Ontology, both of which are used for standardizing cross-species disease and phenotype information for the Monarch Initiative.

Sabrina is an active member of the biocuration community and currently serves on several committees, including the ISB nominating, ISB2023 Scientific Program, ICBO2022 program, and OBO Governance Task committees. She is dedicated to knowledge sharing and dissemination: she develops shared training materials for the OBO Academy, trains colleagues in data curation workflows, leads workshops on the Mondo ontology and best practices for data curation using ontologies, and is currently mentoring a post-doc in data curation and ontology development.

Sabrina’s inquisitive nature and aptitude for new knowledge and skills make her an ideal collaborator and valuable contributor to the biocurator field, especially for ensuring data consistency and standardization. Her contributions have greatly benefited the biomedical field and the advancement of precision medicine.

  • Nicole A Vasilevsky, Nicolas A Matentzoglu, Sabrina Toro, et al., Mondo: Unifying diseases for the world, by the world, doi: https://doi.org/10.1101/2022.04.13.22273750.
  • Sabrina Toro, Nicole Vasilevsky, Nico Matentzoglu, Using the Mondo Disease Ontology for Disease Data Curation, [Curating the Clinical Genome 2022- Workshop, June 2022, Virtual], https://mondo.monarchinitiative.org/pages/workshop/#june-2022.
  • Howe DG, Ramachandran S, Bradford YM, Fashena D, Toro S, Eagle A, Frazer K, Kalita P, Mani P, Martin R, Moxon ST, Paddock H, Pich C, Ruzicka L, Schaper K, Shao X, Singer A, Van Slyke CE, Westerfield M. The Zebrafish Information Network: major gene page and home page updates. Nucleic Acids Res. 2021 Jan 8;49(D1):D1058-D1064. doi: 10.1093/nar/gkaa1010. PMID: 33170210; PMCID: PMC7778988.
  • Alliance of Genome Resources Consortium. Harmonizing model organism data in the Alliance of Genome Resources. Genetics. 2022 Apr 4;220(4):iyac022. doi: 10.1093/genetics/iyac022. PMID: 35380658.
  • Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2021 Jan 8;49(D1):D325-D334. doi: 10.1093/nar/gkaa1113. PMID: 33290552.

Shur-Jen Wang, Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA.

Dr. Shur-Jen Wang has made considerable contributions to the Rat Genome Database (RGD) and to the model organism database community for a significant portion of her scientific career. Dr. Wang joined RGD in 2009, began in curation, and currently is involved in many facets of the ever-growing RGD resource. One primary focus continues to be rat strain curation and nomenclature.

In the years that Dr. Wang has been a curator at RGD, the database has grown from <2400 rat strains to >4100, with report pages for collected information. Strains are curated from literature or submitted by researchers, and curated for source, derivation, and other data. Through collaborative efforts with other global biodata resources (including the Alliance of Genome Resources and the Gene Ontology Consortium), Dr. Wang has been integral to establishing standardized nomenclature, ontologies and data harmonization.

Dr. Wang was closely involved in the development of RGD’s curator training process and comprehensive curation manual to ensure that all curators follow rigorous standards. She mentors newer curators and takes part in monthly sessions in which papers are discussed to ensure curation consistency within RGD.

Additional efforts include development and testing of innovative data interactivity and analysis tools, such as RGD’s curated phenotype data warehouse, PhenoMiner. Dr. Wang has presented data and tools on behalf of RGD at many domestic and international meetings.

Simply put, RGD would not be a premier model organism knowledgebase without the considerable work of Dr. Shur-Jen Wang, and she is an ideal candidate for this award.

  • Alliance of Genome Resources Consortium. Harmonizing model organism data in the Alliance of Genome Resources. Genetics. 2022 Apr 4;220(4):iyac022. PMID: 35380658.
  • Kaldunski ML, Smith JR, Hayman GT, Brodie K, De Pons JL, Demos WM, Gibson AC, Hill ML, Hoffman MJ, Lamers L, Laulederkind SJF, Nalabolu HS, Thorat K, Thota J, Tutaj M, Tutaj MA, Vedi M, Wang SJ, Zacher S, Dwinell MR, Kwitek AE. The Rat Genome Database (RGD) facilitates genomic and phenotypic data integration across multiple species for biomedical research. Mamm Genome. 2022 Mar;33(1):66-80. PMID: 34741192.
  • Smith JR, Hayman GT, Wang SJ, Laulederkind SJF, Hoffman MJ, Kaldunski ML, Tutaj M, Thota J, Nalabolu HS, Ellanki SLR, Tutaj MA, De Pons JL, Kwitek AE, Dwinell MR, Shimoyama ME. The Year of the Rat: The Rat Genome Database at 20: a multi-species knowledgebase and analysis platform. Nucleic Acids Res. 2020 Jan 8;48(D1):D731-D742. PMID: 31713623.
  • Wang SJ, Laulederkind SJF, Zhao Y, Hayman GT, Smith JR, Tutaj M, Thota J, Tutaj MA, Hoffman MJ, Bolton ER, De Pons J, Dwinell MR, Shimoyama M. Integrated curation and data mining for disease and phenotype models at the Rat Genome Database. Database (Oxford). 2019 Jan 1;2019:baz014. PMID: 30753478.
  • Shimoyama M, Hayman GT, Laulederkind SJ, Nigam R, Lowry TF, Petri V, Smith JR, Wang SJ, Munzenmaier DH, Dwinell MR, Twigger SN, Jacob HJ; RGD Team. The rat genome database curators: who, what, where, why. PLoS Comput Biol. 2009 Nov;5(11):e1000582. PMID: 19956751.

Gloria I. Giraldo-Calderón, University of Notre Dame, IN, USA.

Since 2012, Dr. Giraldo-Calderón has been one of several VectorBase.org staff members facilitating the biocuration of worldwide arthropod vector data. She selects the data sets, with input from the users when provided, and curates the data and metadata. Once the data are loaded by other team members, she runs the initial quality assurance and contacts the data generators to gather feedback about data representation or any issues.

She also assists and mentors data donors, and instructs other end users on new features and datasets in VectorBase (both in English and Spanish), in workshops, scientific meetings, webinars, and by email. This year she also created and coordinated the Twitter #biocuration week 2022, with support from the curation team, to promote biocuration awareness among VectorBase and other VEuPathDB users. As in the bioinformatics undergrad and grad courses that she teaches, VectorBase users also learn from her that datasets can be (re-)used to generate new research hypotheses, and that open access and well-curated data are important for achieving this goal.

Dr. Giraldo-Calderón has also authored papers in which she performed gene model curation work (Giraldo-Calderón et al. 2017) and provided guidelines and standards to members of the scientific community also performing biocuration work (Giraldo-Calderón et al. 2022, Amos et al. 2022, Rund et al. 2019, Dugan et al. 2014). In 2019 VectorBase merged with EuPathDB to create VEuPathDB (a resource for fungi, parasites, and arthropods), and since then Dr. Giraldo-Calderón has carried out the above-mentioned VectorBase activities while working part-time.

  • Giraldo-Calderón, G. I., Harb, O. S., Kelly, S. A., Rund, S. S., Roos, D. S., & McDowell, M. A. (2022). VectorBase.org updates: bioinformatic resources for invertebrate vectors of human pathogens and related organisms. Current opinion in insect science, 50, 100860. https://doi.org/10.1016/j.cois.2021.11.008
  • Amos, B., Aurrecoechea, C., Barba, M., Barreto, A., Basenko, E. Y., Bażant, W., Belnap, R., Blevins, A. S., Böhme, U., Brestelli, J., Brunk, B. P., Caddick, M., Callan, D., Campbell, L., Christensen, M. B., Christophides, G. K., Crouch, K., Davis, K., DeBarry, J., Doherty, R., … Zheng, J. (2022). VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center. Nucleic acids research, 50(D1), D898–D911. https://doi.org/10.1093/nar/gkab929
  • Rund, S., Braak, K., Cator, L., Copas, K., Emrich, S. J., Giraldo-Calderón, G. I., Johansson, M. A., Heydari, N., Hobern, D., Kelly, S. A., Lawson, D., Lord, C., MacCallum, R. M., Roche, D. G., Ryan, S. J., Schigel, D., Vandegrift, K., Watts, M., Zaspel, J. M., & Pawar, S. (2019). MIReAD, a minimum information standard for reporting arthropod abundance data. Scientific data, 6(1), 40. https://doi.org/10.1038/s41597-019-0042-5
  • Giraldo-Calderón, G. I., Zanis, M. J., & Hill, C. A. (2017). Retention of duplicated long-wavelength opsins in mosquito lineages by positive selection and differential expression. BMC evolutionary biology, 17(1), 84. https://doi.org/10.1186/s12862-017-0910-6
  • Dugan, V. G., Emrich, S. J., Giraldo-Calderón, G. I., Harb, O. S., Newman, R. M., Pickett, B. E., Schriml, L. M., Stockwell, T. B., Stoeckert, C. J., Jr, Sullivan, D. E., Singh, I., Ward, D. V., Yao, A., Zheng, J., Barrett, T., Birren, B., Brinkac, L., Bruno, V. M., Caler, E., Chapman, S., … Scheuermann, R. H. (2014). Standardized metadata for human pathogen/vector genomic sequences. PloS one, 9(6), e99979. https://doi.org/10.1371/journal.pone.0099979