EXCELLENCE IN BIOCURATION EARLY CAREER AWARD 2023

The Early Career Award recognizes biocurators who have been involved in a biocuration-relevant field for fewer than 7 years. The nominees are in non-leadership positions and have made sustained contributions to the field of biocuration. The recipient will be required to present a 15-minute talk at a virtual Biocuration seminar and will be sent a prize of 500 CHF. The nominee does not have to be an active ISB member, as the award will include ISB membership for 1 year.

Voting will be open from 26 May to 6 June 2023.

NOMINEES

The list of nominees is below. Scroll down for detailed descriptions.

  • Valerio Arnaboldi, California Institute of Technology, CA, USA
  • J. Allen Baron, Institute for Genome Sciences, University of Maryland School of Medicine, MD, USA
  • Charles Tapley Hoyt, Laboratory of Systems Pharmacology, Harvard Medical School, MA, USA
  • Samuel Rund, Center for Research Computing, University of Notre Dame, IN, USA
  • Nan Zhou, Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China

Detailed descriptions

Valerio Arnaboldi, California Institute of Technology, CA, USA

Valerio Arnaboldi joined the WormBase team in May 2017, and since then has made outstanding contributions to several areas of biocuration, benefiting both curators and end users. Valerio’s expertise in computer science, combined with a willingness to understand deeply the needs of user communities and a marked professionalism and measured manner, has already made him a leader in our field.

Highlights of Valerio’s work include an algorithm for automated gene summaries, used initially by WormBase and now more widely adopted by the Alliance of Genome Resources; advanced text mining applications for community curation (ACKnowledge), literature exploration (WormiCloud), fact extraction (e.g. variant curation), and the nascent Alliance literature ecosystem; a graphical curation tool for microPublication Biology; and interactive tools (scdefg and wormcells-viz) for interrogating single-cell RNA sequencing data.

Valerio has shown great leadership within WormBase and the Alliance by setting an example for best practices in software development, prioritizing user needs, and demonstrating the value in adapting strategies, learning new methods, and productively collaborating with other groups for the benefit of all.

Valerio actively participates in ISB meetings and workshops, including the recent Biocuration and Machine Learning workshop at the 16th Annual International Biocuration Conference, and the 12th International Biocuration Conference where he shared a poster award for the ACKnowledge project.

The biocuration community is fortunate to have in Valerio such an impactful contributor whose work consistently demonstrates the value of expert biocuration for the biomedical sciences.

  • Publications
    • Mallick R, Arnaboldi V, Davis P, Diamantakis S, Zarowiecki M, Howe K. Accelerated variant curation from scientific literature using biomedical text mining. MicroPubl Biol. 2022 Jun 1;2022:10. PMID: 35663412
    • da Veiga Beltrame E, Arnaboldi V, Sternberg PW. WormBase single-cell tools. Bioinform Adv. 2022 Mar 28;2(1):vbac018. doi: 10.1093/bioadv/vbac018. PMID: 35814290
    • Arnaboldi V, Cho J, Sternberg PW. Wormicloud: a new text summarization tool based on word clouds to explore the C. elegans literature. Database (Oxford). 2021 Mar 31;2021:baab015. doi: 10.1093/database/baab015. PMID: 33787871
    • Kishore R, Arnaboldi V, Van Slyke CE, Chan J, Nash RS, Urbano JM, Dolan ME, Engel SR, Shimoyama M, Sternberg PW, Genome Resources TAO. Automated generation of gene summaries at the Alliance of Genome Resources. Database (Oxford). 2020 Jan 1;2020:baaa037. doi: 10.1093/database/baaa037. PMID: 32559296
    • Arnaboldi V, Raciti D, Van Auken K, Chan JN, Müller HM, Sternberg PW. Text mining meets community curation: a newly designed curation platform to improve author experience and participation at WormBase. Database (Oxford). 2020 Jan 1;2020:baaa006. doi: 10.1093/database/baaa006. PMID: 32185395

J. Allen Baron, Institute for Genome Sciences, University of Maryland School of Medicine, MD, USA

Allen started his biocuration career in 2021 at the Institute for Genome Sciences (IGS), University of Maryland School of Medicine, as a Human Disease Ontology (DO) biocurator. At IGS, Allen has focused on the integration of disease knowledge into the DO, contributing significantly to the project. Allen has added over 500 new disease terms, definitions, and cross-references to the Disease Ontology. In this work, Allen has developed novel approaches for semi-automated data collection and established productive collaborations with AI/ML developers to improve data rigor and completeness. In 2022, Allen developed new code, called DO.utils, to enable the DO project to programmatically identify and assess project users. Allen wrote this code (and an accompanying tutorial) so that other resources can reuse it to programmatically identify their own usage. Allen presented this work at the 2023 Biocuration meeting in Padova, Italy, where he spoke to a number of other biocurators, sharing how they too could use this code to gain a broader understanding of their user communities. Since joining the DO team in 2021, Allen has contributed to the DO’s outreach through his engagement with DO users and his participation in the OBO Foundry community and in the virtual ISB Biocuration, NIH Rare Disease Day, and Semantic MediaWiki Ontology Summit meetings. Allen mentors the bioinformatics analyst hired by the DO in January 2023, sharing programmatic and biocuration approaches for social media outreach, analysis of the geographic distribution of the DO’s user base, and graphical representation of project reporting statistics.

  • Publications
    • Baron J. A., Schriml L. M. (2023). Assessing resource use: a case study with the Human Disease Ontology. Database (Oxford). 2023 Feb 28. doi: 10.1093/database/baad007; PMID:36856688
    • Schriml, L. M., Munro, J. B., Schor, M., Olley, D., McCracken, C., Felix, V., Baron, J. A., Jackson, R., Bello, S. M., Bearer, C., Lichenstein, R., Bisordi, K., Dialo, N. C., Giglio, M., & Greene, C. (2022). The Human Disease Ontology 2022 update. Nucleic Acids Research, 50(D1), D1255–D1261. doi:10.1093/nar/gkab1063; PMID:34755882
    • Schriml L. M., Lichenstein R., Bisordi K., Bearer C., Baron J. A., Greene C. (2023). Modeling the enigma of complex disease etiology. J Transl Med. 2023 Feb 25; 21(1):148. PMID:36829165

Charles Tapley Hoyt, Laboratory of Systems Pharmacology, Harvard Medical School, MA, USA

Charlie has accomplished a remarkable amount in a short time. He is a Research Fellow at Harvard Medical School and the primary curator, developer, and maintainer of several community datasets and databases. These include Bioregistry, which promotes standardization of prefixes, CURIEs, and URIs when used to reference entities and concepts in the life sciences. He contributes to Biomappings, which provides mappings between named biological entities. He created Chemical Roles Graph, which curates mechanistic relations between small molecules and biological processes, pathways, and diseases in an ontological framework. He is a frequent contributor to other curated datasets, and promotes the concept of drive-by curation and of progressive governance models to enable community curation and strengthen project sustainability.

Charlie actively contributes to community efforts. He is an active member of the OBO Foundry ontology community, where he focuses on promoting standardization of semantics and better curation and coding practices through continuous integration/continuous development and social workflows. He also promotes more granular attribution and explicit, transparent licensing to better enable reuse.

He contributes to standards development, including the SSSOM standard, a simple standard for sharing ontology mappings that includes explicit semantics and provenance. He is also a member of the Biological Expression Language (BEL) community; BEL is a domain-specific language for representing causal, correlative, and associative relationships between biomedical entities as well as their associated contextual and provenance annotations.

He has mentored a large number of students at the Institute for Algorithms and Scientific Computing (SCAI) and is frequently available to lend support to public projects such as PyBEL and PyKEEN.

Charlie is actively engaged in the Biocuration community: he co-chaired the most recent conference, Biocuration 2023 in Padova, Italy, and served on the organizing committee of the Virtual Biocuration 2022 Conference. His charisma, enthusiasm, and energy are rare and invaluable to our community, and he deserves to be celebrated.

  • Publications
    • Matentzoglu N, Goutte-Gattat D, Tan SZK, Balhoff JP, Carbon S, Caron AR, Duncan WD, Flack JE, Haendel M, Harris NL, et al. Ontology Development Kit: a toolkit for building, maintaining and standardizing biomedical ontologies. Database (Oxford). doi: 10.1093/database/baac087

Samuel Rund, Center for Research Computing, University of Notre Dame, IN, USA

VectorBase.org is one of 14 database repositories that make up the VEuPathDB.org suite of bioinformatics dataset repositories. It accepts sequencing and population abundance datasets from researchers worldwide who study vector-eukaryote host-pathogen interactions. Since VEuPathDB’s inception, these databases have grown substantially, and their graphical user interfaces have evolved to allow not only submission of and access to the stored datasets, but also refined searches and data analysis through an icon-based workflow. Furthermore, VectorBase has a unique feature that lets users display a map of population counts of selected arbovectors via a geographical information system interface. Since 2018, Dr. Rund has facilitated the storage of and access to vector abundance data, participated in VEuPathDB’s public outreach efforts (webinars and workshops), and supported its global user base online. Thanks to Dr. Rund’s efforts and dataset-related publications, the VectorBase repository continues to grow not only in size, but also in its importance as an online bioinformatics resource.

  • Publications
    • Amos B, Aurrecoechea C, Barba M, Barreto A, Basenko EY, Bażant W, Belnap R, Blevins AS, Böhme U, Brestelli J, Brunk BP, Caddick M, Callan D, Campbell L, Christensen MB, Christophides GK, Crouch K, Davis K, DeBarry J, Doherty R, Duan Y, Dunn M, Falke D, Fisher S, Flicek P, Fox B, Gajria B, Giraldo-Calderón GI, Harb OS, Harper E, Hertz-Fowler C, Hickman MJ, Howington C, Hu S, Humphrey J, Iodice J, Jones A, Judkins J, Kelly SA, Kissinger JC, Kwon DK, Lamoureux K, Lawson D, Li W, Lies K, Lodha D, Long J, MacCallum RM, Maslen G, McDowell MA, Nabrzyski J, Roos DS, Rund SSC, Schulman SW, Shanmugasundram A, Sitnik V, Spruill D, Starns D, Stoeckert CJ, Tomko SS, Wang H, Warrenfeltz S, Wieck R, Wilkinson PA, Xu L, Zheng J. VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center. Nucleic Acids Res. 2022 Jan 7;50(D1):D898-D911 PMID: 34718728
    • Giraldo-Calderón GI, Harb OS, Kelly SA, Rund SS, Roos DS, McDowell MA. VectorBase.org updates: bioinformatic resources for invertebrate vectors of human pathogens and related organisms. Curr Opin Insect Sci. 2022 Apr;50:100860. doi: 10.1016/j.cois.2021.11.008. Epub 2021 Dec 3. PMID:34864248
    • Catherine A Lippi, Samuel S C Rund, Sadie J Ryan, Characterizing the Vector Data Ecosystem, Journal of Medical Entomology, Volume 60, Issue 2, March 2023, Pages 247–254 https://doi.org/10.1093/jme/tjad009
    • Rund, Samuel S. C., Braak, Kyle, Cator, Lauren, Copas, Kyle, Emrich, Scott J., Giraldo-Calderón, Gloria I., Johansson, Michael A., Heydari, Naveed, Hobern, Donald, Kelly, Sarah A., Lawson, Daniel, Lord, Cynthia, MacCallum, Robert M., Roche, Dominique G., Ryan, Sadie J., Schigel, Dmitry, Vandegrift, Kurt, Watts, Matthew, Zaspel, Jennifer M., & Pawar, Samraat. MIReAD, a minimum information standard for reporting arthropod abundance data. Scientific Data, 6 (1). https://doi.org/10.1038/s41597-019-0042-5
    • Rund, S. S. C., Moise, I. K., Beier, J. C. & Martinez, M. E. Rescuing troves of data to tackle emerging mosquito-borne diseases. J. Am. Mosq. Control Assoc. 35, 75–83 (2019). PMID:31442186

Nan Zhou, Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China

Nan Zhou started his biocuration career with the FerrDb database in 2019 at the Affiliated Brain Hospital of Guangzhou Medical University. As the first database dedicated to ferroptosis, FerrDb has been cited more than 300 times and has been selected as an ESI Highly Cited Paper and Hot Paper. In the FerrDb project, Nan searched the literature and collected genes and substances regulating ferroptosis, as well as diseases associated with ferroptosis. His key contributions to the FerrDb database are that he categorized ferroptosis regulators into four functional groups and proposed a quality control pipeline that assigns a confidence level to each curation item. These strategies greatly improved the usefulness and credibility of the database. The first version of FerrDb was published in the journal Database (The Journal of Biological Databases and Curation) in 2020. After that, he continued curating ferroptosis knowledge to upgrade the database and published the second version of FerrDb in Nucleic Acids Research in 2023. In 2020, Nan also split his time to work with the eccDNAdb team, where his role was to collect and annotate extrachromosomal circular DNA elements in cancer from journal articles. He helped to build the eccDNAdb database, which is the first resource for eccDNAs in cancer and was published in the journal Oncogene in 2022. He continues to maintain the FerrDb and eccDNAdb databases. Nan enjoys working as a biocurator and would like to thank the ISB community for this recognition.

  • Publications
    • Nan Zhou#; Xiaoqing Yuan#; Qingsong Du#; Zhiyu Zhang; Xiaolei Shi; Jinku Bao*; Yuping Ning*; Li Peng*; FerrDb V2: update of the manually curated database of ferroptosis regulators and ferroptosis-disease associations, Nucleic Acids Research, 2023, 51(0): D571-D582 PMID:36305834
    • Li Peng#*; Nan Zhou#; Chao-Yang Zhang; Guan-Cheng Li; Xiao-Qing Yuan*; eccDNAdb: a database of extrachromosomal circular DNA profiles in human cancers, Oncogene, 2022, 41(0): 2696-2705 PMID:35388171
    • Nan Zhou; Jinku Bao*; FerrDb: a manually curated resource for regulators and markers of ferroptosis and ferroptosis-disease associations, Database, 2020 PMID:32219413
