Equity, Diversity, Inclusion and Accessibility Officer

The International Society for Biocuration (ISB) is committed to building an inclusive and diverse network of biocurators, ontologists, data stewards and others who work to improve the quality of data wherever they may work. The EDI subcommittee has worked hard to establish a set of guidelines to promote equity, diversity, inclusion, and accessibility for the society. With these guidelines in place, and given the difficulty of maintaining an active committee over the past year, the executive committee has decided to establish an Equity, Diversity, Inclusion and Accessibility Officer.

This officer will be charged with:

  1. Acting as a point person for ISB members to communicate EDIA concerns.
  2. Reviewing applications for Biocuration conference organizers for any EDIA concerns.
  3. Working with the Biocuration conference liaison to ensure the annual conference is following EDIA guidelines.
  4. Acting as a point person who thinks ahead to identify potential EDIA blind spots.

The past few years have seen the first Biocuration conference in India (2024), the first fully hybrid Biocuration conference (2025), and plans for the first Biocuration conference in Africa (2026). We fund travel fellowships to enable curators from low-income countries to attend Biocuration conferences. This year we have increased the number of microgrants and inclusivity grants available to members to two of each type. We have also revised and updated our guidelines for conference organizers.

We thank Mary Ann Tuli and the members of the EDI committee for their tireless work over the years to guide the society's policies to where they are now.

We thank Luana Licata for volunteering to be the inaugural EDIA Officer!

Archived Data Sets

Last week saw a flurry of messages about how to find archived data sets. This is the list of resources and links from those messages. The bulk of this list came from the Data Rescue Project (@datarescue2025.bsky.social) and was shared by Melissa Haendel. Please check the Data Rescue Project page for new updates. The Data Rescue Project now has a homepage: https://www.datarescueproject.org/about-data-rescue-project/

Larger and Established Data / Website Efforts

End of Term Crawl 

  • The main coordinated effort to archive websites
  • Datasets have been more of a challenge, especially data embedded in databases.

EDGI

Public Environmental Data Project 

Harvard’s Library Innovation Lab Team

ICPSR

  • Overview of ICPSR’s data rescue activities to date:
    • Downloaded ~2800 files from various sources requested by researchers; all the files ICPSR collected will soon be available via a dropbox link.
    • Examining CDC data dump from archive.org to assess what might be missing.
      • Ideally will also be a resource for those looking for data to see what is/isn’t available.
    • ICPSR staff and allies are generating metadata for each of the datasets we have so that we can make them available through an existing archive at ICPSR (DataLumos, openICPSR, or the Resource Center for Minority Data, depending on our timeline and some technical issues we’re working out)
  • ICPSR DataLumos – They have older versions of a lot of major data, including a recent addition from the CDC.

IPUMS

  • They have data and have been working on cataloging efforts
  • Notification went out yesterday that they will share more soon.

Dryad

  • Generalist repository available to help with data publication, storage, and preservation.

Synapse

  • Generalist biology and biomedical data repository available to help with data publication, storage, and preservation.

Silencing Science Tracker

  • Joint initiative of the Sabin Center for Climate Change Law and the Climate Science Legal Defense Fund.
  • Tracks government attempts to restrict or prohibit scientific research, education or discussion, or the publication or use of scientific information.

OSF

  • Generalist repository for archiving, sharing, and storing all types of research outputs, not limited to preprints or only data.
  • OSF is available as an option for pre-prints of articles if, for some reason, they cannot be posted on official sources.
  • Many universities also have institutional repositories where research (articles, data, dissertations, etc) from that institution can be posted. They also have preservation mandates. An example is Penn’s ScholarlyCommons.

The Climate Mirror Project

  • Has NOAA data pulled during the 2017 data rescue.

Open Energy Data Initiative

  • A volunteer has pointed out that “key equity data” is missing from the Dept of Energy and says they were able to find it on this site. Includes additional data from DOE.

Wayback Machine

Data Rescue Events

Smaller/Ad Hoc Rescue Efforts/Data Archiving Activists

  • UCSB LSIT Data Mirroring
    • Mirrored and archived public data on locally hosted git server
    • Includes retrieved data sets from CDC, NIH, and NOAA
  • CDC Page on Internet Archive
    • A special archive created on IA of all CDC datasets publicly available as of January 28, 2025
    • Uploaded by DataHoarders (we think)
  • Datasets in Dataverse
    • Data uploaded by the Climate Change and Health Research Coordinating Center (CAFE)
      • CAFE is looking for a location, potentially outside the US, to duplicate the contents of their collection
    • Includes CDC’s Social Vulnerability Index data.  
    • Most of what’s being placed here is data focusing on health and the environment.
    • DataRefuge from 2017 DataRefuge initiative can be opened for more deposits 
  • Safeguarding Research
    • Organizer is Henrik Schönemann; https://fedihum.org/@lavaeolus
    • There is a forum: https://safeguarding-research.discourse.group/ (admin = Henrik)
      • Based in EU, USA and global – Update: got access to 1-2 PB (and more on the way) of storage & people willing to seed
      • Currently, we’ve got around 1TB of data backed up
        • Including >100.000 PDFs from academia.edu (“transgender”, “Queer Studies”, “intersex”, “nonbinary” etc. – see the forum for the full list)
        • 350GB web archive of CDC, including all 30.000 files from archive.cdc.gov, and much more
        • “We’re working on providing a central index of archives, with metadata about who archived what, when, to be disseminated widely alongside torrent files and act as both a central point of coordination for archivers to assess what new work is needed, and a mass distribution channel.”
      • Possible contact to CERN, will update asap
  • Data Hoarder
    • A reddit community that is coordinating efforts to rescue data. 
  • Data Hoarding 
    • Index of resources and archives related to data hoarding, web archival and self hosting.
  • ArchiveTeam Warriors
    • They run a distributed crawler. Anyone can install it to help contribute.
    • US Federal Data page
    • Data is uploaded to Archive.org by volunteers
  • Data Liberation Project
    • Note: It looks like the project may have stalled in September 2024. Send info if you know more about them.
    • Run by BigLocalNews and MuckRock, which are good groups to follow.

Tools for Data Rescues

Library Guides to Data Rescues

Articles on current efforts

Articles for context

Existing Alternative Data Sources

Thanks to Brianne Dosch for suggesting the section and some of the bullets.

  • PolicyMap – offers a free tier that can be used to view basic information down to the tract-level, but more detailed data and functionality requires a subscription; available at some universities
  • FRED – They have some demographic data as well; free and open source
  • Census Reporter – is a free, open-source platform focused on making American Community Survey (ACS) data more accessible, including the recent upload of the 2022 1-Year ACS data
  • Esri – for mapping users, the GIS vendor publishes several U.S. Census Bureau data sets, including the ACS, through its ArcGIS Online Platform
  • IPUMS – Even when the government operates normally, many analysts turn to Minnesota Population Center products to access ACS, Current Population Survey microdata and Decennial Census data
  • Social Explorer – historical Census data and more; available at some universities
  • SimplyAnalytics – has internally processed American Community Surveys; available at some universities
  • American College of Obstetricians and Gynecologists – Hosting copies of immunization schedules and contraceptive use guidance from the CDC
  • https://www.ebi.ac.uk/ena/browser/home – The European Nucleotide Archive (ENA) provides a comprehensive record of the world’s nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation. Mirrors SRA public data

Economic Indicators 

  • National League of Cities: Federal Grant Navigation Equity Dashboard 
    • This tool aggregated data from many sources – it seems to still be able to categorize disadvantaged communities (by environmental and economic standards), as well as other critical data denotations that are increasingly hard to access 
  • ALICE Economic Vitality Dashboard and Report (2022 w/ 2024 update)
    • This resource specifically provides data on work, housing, and community resources for households below the ALICE threshold (Asset Limited, Income Constrained, Employed). The data is provided by the U.S. Census Bureau’s Public Use Microdata Sample (PUMS, 202!) 
  • National Equity Atlas Dashboards
    • A data and policy tool that provides a detailed report card on racial and economic equity – this tool can provide a holistic Racial Equity Index snapshot of communities. The Atlas draws its data from a unique regional equity indicators database developed and maintained by two private institutions: PolicyLink and the USC Equity Research Institute (ERI).

Public Health 

  • County Health Rankings & Roadmaps (CHR&R)
    • A program of University of Wisconsin’s Population Health Institute, this data tool aims to highlight the symbiotic nature of health and equity by factoring in physical environment, social and economic indicators, clinical care, and health behaviors to health outcomes. 
      • They also recommend these additional health data platforms: 
  • City Health Dashboard
    • From NYU Langone Health, this platform provides 40+ measures of health and factors affecting health across five areas (Health Behaviors, Social and Economic Factors, Physical Environment, Health Outcomes, and Clinical Care) for 970+ cities across the U.S.

Biocuration 2025 Preliminary Schedule of Talks

Schedule of talks for April 7-9

DAY 1

Keynote: Tanya Berger-Wolf

Director of the Translational Data Analytics Institute, Director of Imageomics Institute, PI of AI and Biodiversity Change (ABC) Global Climate Center, Ohio State University.

Day 1, Session 1: Data Standards & Ontologies

  • Encouraging authors to use and cite data in public repositories; a publisher perspective.
    • Bastien Molcrette
    • Data Publications, Data Standards, Fair Data Principles, Public Data Resources
  • DO Spanish: enhancing DEI via a standardized workflow for translating ontology and website content
    • Lynn Schriml
    • Curation, Data Sharing, Disease, Ontologies
  • Harnessing Community Power for Long-Term Success of the Mondo Disease Ontology
    • Sabrina Toro
    • Curation, Data Standards, Disease, Ontologies
  • The Earth Metabolome Initiative Ontology  
    • Tarcisio Mendes de Farias
    • Data Modeling, Knowledge Graphs, Omics Data, Ontologies

Keynote: Nirav Merchant

Director of the Data Science Institute at University of Arizona, PI CyVerse

Day 1, Session 2: Artificial Intelligence

  • From Lab Bench to Web: A Strategy for Making Biomedical Data Findable and Accessible
    • Christina Parry
    • Data Standards, Fair Data Principles, Graph Databases, Repositories
  • Extending Ontology for Biomarkers of Aging using OLIVE
    • Hande Kucuk McGinty
    • Artificial Intelligence, Knowledge Graphs, Large Language Models, Ontologies
  • Building the Lighthouse: Guiding LLM-Powered Biocuration with Domain Knowledge and Context
    • Harry Caufield
    • Generative Artificial Intelligence, Large Language Models, Literature Mining, Ontologies
  • AI Curation Methods for NASA Scientific Data
    • Walter Alvarado
    • Artificial Intelligence, Curation, Large Language Models, Metadata
  • Plant Reactome: A plant pathways Knowledgebase and discovery platform
    • Sushma Naithani
    • Artificial Intelligence, Curation, Functional Gene Annotations, Knowledge Graphs

Day 1, Session 3: Data Sharing, Databases & Knowledgebases

  • Single-cell comparative transcriptomics for hundreds of species?
    • Frederic Bastian
    • Comparative Data, Curation, Data Standards, Gene Expression
  • Epitope-Driven Annotations in Protein Resources
    • Randi Vita
    • Databases, Ontologies, Proteins, Epitopes
  • Towards FAIR Phenome: Indian Crop Phenome Database at Indian Biological Data Centre (IBDC)
    • Sonia Balyan
    • Data Sharing, Data Standards, Databases, Phenotypes
  • Making Rare Disease Data Available in the Rare Disease Cures Accelerator-Data and Analytics Platform 
    • Nicole Vasilevsky
    • Curation, Data Sharing, Disease, Fair Data Principles
  • Project ‘Shail’: Curating a mountain
    • Saurabh Raghuvanshi
    • Curation, Databases, Drug Discovery, Genomics 
  • Import of Human GWAS Data and Mapping of EFO to multiple ontologies at the Rat Genome Database
    • Stan Laulederkind
    • Curation, Disease, Genomics, Ontologies

DAY 2

Keynote: Paul Thomas

Director, Division of Bioinformatics, Director of the Gene Sequence, Function, and Health Laboratory Initiative, University of Southern California, PI Gene Ontology, PI PANTHER

Day 2, Session 1: Gene/Protein Functional Prediction 

  • DisProt: The Manually Curated Resource for Intrinsically Disordered Proteins
    • M. Victoria Nugnes
    • Curation, Databases, Ontologies, Proteins
  • A Large Scale Crowdsourcing of the Fifth Critical Assessment of Protein Function Annotation
    • Iddo Friedberg
    • Annotations, Artificial Intelligence, Functional Protein Annotations, Public Data Resources
  • New Synteny visualizations on Xenbase
    • Malcolm Fisher
    • Annotations, Comparative Data, Genomes, Synteny

Day 2, Session 2: Gene/Protein Functional Prediction

  • Cross-species quantification of function annotations provides insights into disease-associated uncharacterized human genes
    • Parnal Joshi
    • Annotations, Comparative Analysis, Data Analysis, Functional Protein Annotations
  • Leveraging the AlphaFold Database for enhanced protein function annotation
    • Paulyna Magaña
    • Annotations, Functional Protein Annotations, Protein Structure Prediction, Proteins
  • Leveraging Large Language Models for Gene Summary Generation at the Alliance of Genome Resources
    • Valerio Arnaboldi
    • Large Language Models, Literature Mining, Automated Gene Summaries, Text Summarization
  • Life Cycle Events for Protein Family Models: Birth, Maturation, Cloning, Retirement
    • Daniel Haft
    • Bacteria, Data Sharing, Functional Protein Annotations

Keynote: Andy Hickl

Chief Technology Officer, Allen Institute

Day 2, Session 3: Natural Language Processing 

  • Semi-automated curation of post-translational modification relationships using automated knowledge extraction and assembly
    • Benjamin Gyori
    • Artificial Intelligence, Curation, Databases, Literature Mining
  • Characterization and automated classification of sentences in the biomedical literature: a case study for biocuration of gene expression and protein kinase activity
    • Daniela Raciti
    • Curation, Machine Learning, Community Curation, Sentence Classification
  • Enhancing the SIB Literature Services with annotations to support biocuration
    • Deborah Caucheteur
    • Annotations, Curation, Data Analysis, Literature Mining
  • Protein structure enrichment through text mining
    • Melanie Vollmar
    • Annotations, Literature Mining, Natural Language Processing, Protein structures
  • Enhancing data annotation in ChEMBL for robust analyses
    • Sybilla Corbett
    • Annotations, Curation, Fair Data Principles, Natural Language Processing

Day 2, Session 4: Glycans 

  • Glycan Archetypes: definitions, implementations and applications for standardizing glycan structure data
    • Kiyoko Aoki-Kinoshita
    • Data Standards, Databases, Glycans, Ontologies
  • BiomarkerKB: Biomarker-centric data modeling and knowledge integration for translational research
    • Raja Mazumder
    • Curation, Databases, Glycans, Knowledge Graphs
  • Inferring Tissue and Cell-type Glycosyltransferase Specificity from Single-Cell Gene Expression Data
    • Nathan Edwards
    • Annotations, Glycans, Machine Learning

DAY 3

Keynote: Shannon Farrell

Data Curation Network/Univ. Minnesota

Day 3, Session 1: Data Curation 

  • It’s Now or Never: Delays in Biocuration Disproportionately Affect Understudied Proteins
    • An Phan
    • Curation, Data Analysis, Functional Gene Annotations, Literature Mining
  • How have standards in genomics evolved since the first microbial genome was published 3 decades ago?
    • Chris Hunter
    • Data Standards, Metadata, Ontologies, Repositories

Keynote: Sandra Orchard

ISB 2023 Exceptional Contribution to Biocuration Awardee, EMBL-European Bioinformatics Institute – UK

Day 3, Session 2: Data Curation

  • The global biodata infrastructure: how, where, who, and what?
    • Chuck Cook
    • Databases, Infrastructure, Literature Mining, Public Data Resources

Executive Committee Candidates – 2024

The election of three members of the International Society for Biocuration Executive Committee (ISB EC) will be held from September 25th to October 2nd, 2024.

Emails will be sent to current members on September 26th. Only current members, as of September 24th, 2024, who receive this email will be allowed to vote. If you are an ISB member and do not receive the email, please contact us at isb@biocurator.org.

We thank all of the following seven candidates for agreeing to stand for election to the Executive Committee (EC). Information about the candidates is available below:


Sonia Balyan

Position: Scientist

Affiliation: Indian Biological Data Centre, Regional Centre for Biotechnology, Faridabad, India

Biosketch: As a dedicated scientist specializing in plant molecular biology, biotechnology, and bioinformatics, my career is rooted in cutting-edge research and the crucial field of biocuration. With over 7 years of post-Ph.D. experience, I have developed deep expertise in managing and curating complex biological data, particularly in plant genomics and phenomics. My work is driven by a commitment to transforming raw data into accessible, well-annotated resources that are indispensable for advancing agricultural research and promoting environmental sustainability.

At the Indian Biological Data Centre (IBDC) in Faridabad, where I currently serve as a Scientist, I have played a pivotal role in pioneering initiatives that enhance the utility of big data. A cornerstone of my career has been the development of the Indian Crop Phenome Database, a unique resource designed to archive crop phenome data and ensure it adheres to FAIR principles (Findable, Accessible, Interoperable, Reusable). I am currently spearheading the Indian Functional Genomics DataBank (IFGDb) project, which aims to create a comprehensive repository for functional genomics data, as well as the Indian Research Data Archive (IRDA), a repository for the archival of diverse research data. Additionally, I lead the genomics module of the Multiomics Analysis Toolbox at IBDC, a groundbreaking platform that facilitates the analysis of multidimensional biological data to unravel the complexities of life sciences research. These initiatives not only advance biological sciences but also highlight the essential role of biocuration in ensuring that data-driven research is both meaningful and impactful.

My contributions have been recognized within the scientific community through several publications in good-impact journals, as well as through training and outreach activities. My passion for science communication is further evident in my role as the host of the Beyond Shodh podcast, where I engage with leading scientists and researchers to share their research, vision and journeys with a broader audience. Through my work in biocuration, I strive to bridge the gap between data and discovery, empowering researchers to unlock new insights that drive innovation in biological sciences and beyond.

Motivation: As a scientist at India’s first Biological Data Centre, I have been actively involved in structuring and archiving our nation’s invaluable biological data. This work has deepened my understanding of the essential role biocuration plays in not just preserving data, but in making it accessible and impactful for the global scientific community. My experience has fueled a strong commitment to upholding the highest standards of data integrity and usability—values I will bring to the ISB Executive Committee.

In 2024, I led the local organizing committee for the 17th Annual International Biocuration Conference in India, where I gained firsthand experience in fostering global collaboration and advancing biocuration practices. This role provided me with a comprehensive understanding of the field’s challenges and opportunities, equipping me to contribute effectively to ISB’s strategic initiatives.

As the host of Beyond Shodh, I have actively engaged with scientists across India, highlighting the journeys of pioneers in STEM. I also manage outreach for the Indian Biological Data Centre, focusing on engaging and motivating researchers about the importance of FAIR data and data stewardship. I am passionate about expanding ISB’s outreach efforts, ensuring biocuration’s significance is effectively communicated to emerging scientists and policymakers. My goal is to strengthen the global integration of diverse datasets, particularly from underrepresented regions, to build a more inclusive and comprehensive biocuration community.


Marija Milacic

Position: Scientific Associate/Biocurator

Affiliation: Computational Biology, Ontario Institute for Cancer Research, Toronto, Ontario, Canada

Biosketch: My training and wet lab experience involve an undergraduate degree in Molecular Biology and Physiology from the University of Belgrade, where I took part in human papillomavirus research, a doctoral degree in Molecular and Medical Genetics from the University of Toronto, where I studied the childhood cancer retinoblastoma, and postdoctoral training at the Centre for Addiction and Mental Health in Toronto, where I studied the genetics of autism spectrum disorders. My career in biocuration started in 2011, when I joined the Ontario Institute for Cancer Research and became part of the Reactome curation team. Becoming a biocurator enabled me to apply my skillset to building and maintaining a repository of biological pathways used by researchers globally. It also provided me with an insight into the importance of standardized representation of knowledge in biology and medical research, where open and continuous communication between researchers, publishers, and biocurators is key. I would like to continue my career in the field of biocuration because I believe that systematization and synthesis of knowledge is crucial for the ethical advancement of basic and applied biological sciences.

Motivation: As an ISB member for ten years now and a biocurator for more than thirteen years, I greatly appreciate the ISB’s efforts in bringing together biocurators from many different areas of biology, providing professional guidance, and improving the visibility of this profession. It would be an honor to contribute to the ISB as a member of its Executive Committee. As a biocurator of a peer-reviewed pathway database, I would bring to the Executive Committee experience in the areas of data visualization and community curation. The areas within the ISB that I would like to see developed are Training, Outreach, and Communication, and the IT Infrastructure.


Maria Victoria Nugnes

Position: Senior biocurator and trainer of the DisProt database (https://disprot.org/)

Affiliation: BioComputing UP Lab (https://protein.bio.unipd.it/) at University of Padua, Italy

Biosketch: I am a senior biocurator for DisProt (https://disprot.org/), a database of manually curated intrinsically disordered proteins (IDPs) from the literature, working both remotely from my country, Argentina, and in person at Silvio Tosatto’s lab at the University of Padova, in Italy. In this role, I curate and revise contributions from over 40 community curators from countries all over the world. I first joined DisProt as a curator in 2021 and have been focused on biocuration as my primary research interest since then. Over the course of the last two years, I have also assumed leadership responsibilities for the database's thematic datasets and I have had the opportunity to manually curate over 1,200 publications. In addition to my curation duties, I design and deliver training materials and sessions for DisProt biocurators. My training sessions are conducted both remotely and in person, with recent sessions held in Argentina and Chile. My work is driven by a passion for enhancing our understanding of IDPs and supporting the global community of biocurators.

Motivation: My career in biocuration is dedicated to improving the quality and accessibility of biological data, particularly concerning intrinsically disordered proteins (IDPs). I am from Argentina, from where I mainly work remotely, and I am deeply committed to helping people from lower-middle-income countries gain access to the tools, knowledge, and support needed to excel in biocuration.

If given the opportunity to serve on the ISB Executive Committee, I will bring:

Expertise in IDP Curation: My extensive experience with the DisProt database, best practices, ontologies and standards will help me to provide valuable insights into the curation processes.

Commitment to Education and Training: I am passionate about supporting and mentoring fellow curators. I have conducted numerous training sessions in English and in Spanish, both virtual and in-person, including a Spanish-language DisProt Biocuration course on the ELIXIR-SI eLearning platform. I aim to expand these educational initiatives to benefit curators worldwide.

Dedication to Collaboration and Community Building: My involvement in international projects, such as the HUPO Proteomics Standards Initiative and Gene Ontology, has fostered strong collaborative skills, and I am dedicated to building a supportive and connected biocuration community.

Within ISB, I would like to focus on the following areas:

  • Enhanced Training and Support: Develop comprehensive and accessible training programs, including multilingual options, to support curators from diverse backgrounds and regions in their career progression, especially those from lower-middle-income countries.
  • Standardization and Best Practices: Advocate for the adoption of standardized curation practices across databases, facilitating data integration and interoperability.
  • Community Engagement and Collaboration: Strengthen the global biocuration community by encouraging collaborative projects and networks, fostering a sense of common goal and mutual support among curators.

I am deeply honored to be the recipient of the ISB Excellence in Biocuration Early Career Award 2024. This recognition further strengthens my dedication to supporting and uplifting the global biocuration community. I am eager to contribute to the ISB Executive Committee, bringing enthusiasm, dedication, and a collaborative spirit to support and elevate our biocuration community even further.


Santhi Ramachandran

Position: Curator

Affiliation: GWAS Catalogue Curator, Cambridge, UK

Biosketch: I have been working in the field of biocuration since 2012, with a primary focus on variant curation. Over the years, I have gained extensive experience by working with prominent organizations and now at EMBL-EBI. My work has provided me with a strong foundation in biocuration practices, and I have also had the opportunity to engage with users of the curated data, gaining a deeper understanding of its practical applications in research and healthcare. This interaction has shaped my approach to curation, ensuring that the data we provide is accurate, reliable, and impactful for both scientific and societal advancement.

Motivation: I am motivated to run for a position on the ISB Executive Committee due to my decade-long experience in biocuration and my desire to contribute to the field’s growth. Over the past decade, I have worked with various organizations, including my current role at EMBL-EBI, where I have developed expertise in curating high-quality, impactful data. This experience has deepened my understanding of the complexities of biocuration and its vital role in supporting research and healthcare. I believe that the success of biocuration depends not only on data accuracy but also on how effectively this data is applied in real-world settings.

As a member of the ISB Executive Committee, I aim to contribute to the broader advancement of biocuration by leveraging my strong background in curation and my insights into the evolving needs of data users. I am particularly eager to promote stronger connections between curators and end-users, enhancing the practical impact of curated data. Additionally, I would advocate for expanding training programs and resources to support the next generation of biocurators, ensuring that ISB continues to lead the way in this critical field. Through these efforts, I hope to enhance the impact of biocuration within the scientific community.


Umasri Sankarlal

Position: Freelance Biocurator

Affiliation: Freelancer

Biosketch: My career started as a biocurator participating in various projects building comprehensive databases for biomarkers, biological pathways and chemical compounds. Since 2004, my passion for biocuration has been growing, and I have volunteered wherever I got an opportunity. During my career break, I worked as a freelance ontology mentor to developers, sharing ideas on developing ontologies for online databases and supporting their projects with the necessary datasets. I worked for a short time as a Consulting Analyst with Thomson Reuters, Chennai, on their Drug Forecasting database. Although I work as a full-time employee in an IT firm, I volunteer as a member of the ClinGen Intellectual Disability and Autism Gene Curation working group panel, publishing the curated genes after my office hours.

Motivation: I am nominating myself for a position on the Equity, Diversity and Inclusion Committee. In my 20 years of work experience as a Biocurator, Patent Analyst, IT Operation Analyst, Autosys Specialist and currently as a Software Quality Analyst, I have acquired technical skills with the help of the diverse, expert teams I worked with in various companies. I have always been treated equally, without any gender bias, and have been motivated at my workplace. My teams have included people from various cultures and countries, who always supported me by sharing their expertise to improve my career. Now it is my turn to return their kindness to others who need encouragement and support. Beyond doing so at my current workplace, serving as a member of the EDI committee in ISB will be more meaningful and have an impact in the biocuration field.


Peter Uetz

Position: Associate Professor

Affiliation: Center for Biological Data Science, School of Life Sciences, Richmond, VA, USA

Biosketch: I have started the Reptile Database in 1996 when I was a graduate student at EMBL, Heidelberg, Germany. Since then, the database has become one of the most comprehensive biodiversity databases worldwide (being one of 160 databases in the Catalogue of Life consortium). I have also founded the Microbial Protein Interaction database, now a part of IntAct. After being an investigator at The Institute for Genomic Research (TIGR) and the Venter Institute (JCVI) for about 5 years I was hired as associate professor at VCU in Richmond, Va, in 2011. I am teaching courses in functional genomics and bioinformatics there, training students in biodiversity data-related curation and processing. The database and various papers about it have been cited 4000-5000 times.

Motivation: Biocuration is an undervalued part of biomedical sciences and I feel strongly that its role needs to be better promoted and students trained in this field. While all biomedical scientists are using various databases, they rarely appreciate the huge amount of work required to establish and maintain these data sources. My long career experience in database curation and promotion will help the ISB to promote and advance the field. I am currently focusing on biodiversity data, an area that has been underrepresented at ISB; hence, one of my goals is to strengthen the links to biodiversity and other biological sub-disciplines that are not well covered by ISB.


Huajin Wang

Position: Senior Librarian

Affiliation: Carnegie Mellon University

Biosketch: I am a Cell Biologist turned information professional and open science advocate. As a Senior Librarian at Carnegie Mellon University Libraries, I provide consultation and develop programs that help research communities make their research data and outputs more open, reproducible, and reusable, foster collaboration across disciplinary boundaries, and engage stakeholders to build a healthy data ecosystem. I have been in consulting roles for data curation and management on many research projects. As a life sciences researcher, I have collaborated with biologists, clinicians, and data scientists on many successful research projects in areas spanning membrane trafficking, lipid metabolism, bioinformatics, and management and analysis of multimodal datasets. I completed my PhD in Cell Biology at University of Alberta, postdoctoral research at Yale/Harvard, and independent research at Carnegie Mellon University. I was a co-founder and Director of the Open Science & Data Collaborations Program at CMU. I served in a senior leadership role at the Center for Open Science to drive culture change in research communities with technology, community building, services, and thought leadership. I am a member of the Data Curation Network, FORCE11, and other organizations that contribute to the scholarly data ecosystem. I am also on the Advisory Committee of the UK Reproducibility Network (UKRN).

Motivation: The biggest asset that makes me an excellent fit for the Executive Committee is my rich and multifaceted experience with data in the many roles I play – a scientist, a data steward, a community builder, and a strategic leader. During my career as a scientist, I worked with large varieties of research data and data formats, and deeply appreciated the value of well curated, open datasets. Since I moved into a librarian and open science role, it has been my mission to help researchers make data more open and reusable so that the results are more reproducible. I co-founded the Open Science & Data Collaborations program at CMU, and served as the Director of Program at the Center for Open Science (the nonprofit running OSF). In these roles, I helped researchers and communities navigate data sharing, stewardship and curation. I oversaw the strategy and execution of training and consulting services, launched community engagement and outreach initiatives, and supported communities and stakeholders to adopt data sharing infrastructure and best practices. The communities I worked with span a wide range of life sciences fields including neuroscience, cancer biology, genomics, and virology, with an emphasis on early career researchers and underrepresented groups.

2024 Nominations for the Excellence in Biocuration Advanced Career Award

The Advanced Career Award recognizes biocurators who have been involved in a biocuration-relevant field for 7 years or more. The nominees will have made sustained contributions to the field of biocuration. The recipient will be required to present a 15 minute talk at a virtual Biocuration seminar and will be sent a prize of 500 CHF. The nominee does not have to be an active ISB member, as the award will include ISB membership for 1 year.

Voting will be open from 27 June – 25 July 2024

NOMINEES

The list of nominees is below. Scroll down for detailed descriptions.

  • Peter D’Eustachio, NYU Grossman School of Medicine, New York, USA
  • Steven Marygold, FlyBase, Cambridge, UK
  • Sushma Naithani, Plant Reactome Knowledgebase, Oregon, USA
  • Achchuthan Shanmugasundram, Genomics England Limited, London, UK
  • Peter Uetz, Center for Biological Data Science, Virginia, USA

Detailed Descriptions

Peter D’Eustachio, NYU Grossman School of Medicine, New York, USA

Peter has been involved with the Reactome Knowledgebase since 2002, developing standards and procedures for curation of pathway events and tight integration of Reactome annotation with material from community resources such as the Gene Ontology, ChEBI, UniProt, and Rhea, and more recently the Alliance of Genome Resources to support efficient annotation of biologically complex processes in formats that are widely accessible and interoperable. This interoperability extends to the use of Reactome annotations of human processes as starting points for efficiently annotating model organism versions of these processes in formats that themselves support easy and reliable data re-use. Peter’s direct contributions have centered on the development of Reactome curation standards, development of tools to align Reactome annotations with corresponding instances in resources such as Rhea, and the development of a strategy to export Reactome content into the GO-CAM format that is rapidly becoming a bioinformatics community standard. Peter’s knowledge of a wide range of biological processes and his understanding of the underlying biochemistry enables him to clearly explain the most logical solutions to many challenging problems. Many gold standard biological knowledgebases have benefited from his insightful and considered suggestions.

Publications

  • D’Eustachio P. (2013) Pathway databases: making chemical and biological sense of the genomic data flood. Chem Biol. 20:629-635. PMID:23706629
  • Hill DP, D’Eustachio P, Berardini TZ, Mungall CJ, Renedo N, Blake JA (2016) Modeling biochemical pathways in the gene ontology. Database baw126. PMID:27589964
  • Good BM, Van Auken K, Hill DP, Mi H, Carbon S, Balhoff JP, Albou LP, Thomas PD, Mungall CJ, Blake JA, D’Eustachio P. (2021) Reactome and the Gene Ontology: Digital convergence of data resources. Bioinformatics. 37:3343–3348. PMID:33964129
  • Gillespie M, Jassal B, Stephan R, Milacic M, Rothfels K, Senff-Ribeiro A, Griss J, Sevilla C, Matthews L, Gong C, Deng C, Varusai T, Ragueneau E, Haider Y, May B, Shamovsky V, Weiser J, Brunson T, Sanati N, Beckman L, Shao X, Fabregat A, Sidiropoulos K, Murillo J, Viteri G, Cook J, Shorser S, Bader G, Demir E, Sander C, Haw R, Wu G, Stein L, Hermjakob H, D’Eustachio P (2022) The Reactome Pathway Knowledgebase 2022. Nucleic Acids Res. 50:D687-D692. PMID:34788843

Steven Marygold, FlyBase, Cambridge, UK

Steven is a curator, nomenclature advisor and the group coordinator at FlyBase (Cambridge, UK) with a special interest in enzymes, metabolism, ncRNAs and gene groups. Steven’s team established the gene group resource (https://academic.oup.com/nar/article/44/D1/D786/2502590) and has worked systematically to improve the annotation of many classes of genes at FlyBase, most recently tRNAs and snoRNAs, and to review all ~3700 enzyme-encoding genes. He collaborates widely with many other data resources including the Alliance of Genome Resources member databases, UniProt, Rhea, Reactome, RNACentral and members of the fly community. Steven is also a major contributor to the Gene Ontology Consortium, and his work here is improving the representation of enzymes and metabolism across species in both the ontology and annotation.

  • Publications
    • The UDP-Glycosyltransferase Family in Drosophila melanogaster: Nomenclature Update, Gene Expression and Phylogenetic Analysis Ahn, S.-J., Marygold, S.J. Frontiers in Physiology, 2021, 12, 648481 PMID: 33815151
    • The DNA polymerases of Drosophila melanogaster Marygold, S.J., Attrill, H., Speretta, E., …Rong, Y., Yamaguchi, M. Fly, 2020, 14(1-4), pp. 49–61 PMID: 31933406
    • Towards comprehensive annotation of Drosophila melanogaster enzymes in FlyBase Garapati, P.V., Zhang, J., Rey, A.J., Marygold, S.J. Database, 2019, 2019 PMID: 30689844
    • Using FlyBase to find functionally related drosophila genes Rey, A.J., Attrill, H., Marygold, S.J. Methods in Molecular Biology, 2018, 1757, pp. 493–512 PMID: 29761468
    • The translation factors of Drosophila melanogaster Marygold, S.J., Attrill, H., Lasko, P. Fly, 2017, 11(1), pp. 65–74 PMID: 27494710

Sushma Naithani, Plant Reactome Knowledgebase, Oregon, USA

Dr. Sushma Naithani is an Associate Professor Senior Research at Oregon State University. Naithani is the Lead biocurator of the Plant Reactome knowledgebase (https://plantreactome.gramene.org), a pathway portal of the Gramene database. Since 2012, she has led the biocuration of plant pathways and served as outreach and training coordinator for database users. In addition, Naithani has led the development and biocuration of species-specific metabolic networks for strawberry (FragariaCyc) and grapevine (VitisCyc) and contributed to the biocuration of the maize-specific metabolic network MaizeCyc; all Cyc databases are available at http://pathways.cgrb.oregonstate.edu/metabolic.html. Naithani has trained 20 undergraduate students at Oregon State University on gene and pathway biocuration, and has conducted two in-person community curation workshops and two online biocuration training workshops for graduate students. Naithani has also made significant contributions to Gene Ontology and the Grapevine Information system. From 2022 to the present, Naithani has also served as a Steering Committee Member for the AgBiodata consortium (https://www.agbiodata.org), a consortium of agricultural biological databases and associated resources. Naithani co-organized the ‘Ontologies and Systems Biology workshop’ at the annual Plant and Animal Genome conference (2014-2024). Naithani’s involvement in promoting equity and diversity within the scientific community is evident through her service on the ISB Equity, Diversity, and Inclusion (EDI) committee from 2021 to 2022. Notably, she chaired the ‘Addressing Implicit or Unconscious Bias Workshop’ at the 14th Annual Biocuration Conference in 2021. In subsequent years, she co-chaired the 15th Annual Biocuration Conference in 2022 (virtual) and served on the Scientific Committee for the 17th Annual Biocuration Conference held in India in 2023.

  • Publications
    • Gupta P., J. Elser, E. Hooks, P. D’Eustachio, P. Jaiswal and S. Naithani (2024). Plant Reactome Knowledgebase: empowering plant pathway exploration and OMICS data analysis. Nucleic Acids Res., 52 (D1): D1538-D1547. doi:10.1093/nar/gkad1052. PMID: 37986220
    • Gene Ontology Consortium (2023). The Gene Ontology Knowledgebase in 2023. Genetics, doi: 10.1093/genetics/iyad031 PMID: 36866529
    • Tello-Ruiz M.K., S. Naithani, P. Gupta, A. Olson, S. Wei, J. Preece, Y. Jiao, B. Wang, K. Chougule, P. Garg, J. Elser, S. Kumari, V. Kumar, B. Contreras-Moreira, G. Naamati, N. George, J. Cook, D. Bolser, P. D’Eustachio, L.D. Stein, A. Gupta, W. Xu, J. Regala, I. Papatheodorou, P.J Kersey, P. Flicek, C. Taylor, P. Jaiswal, and D. Ware (2021). Gramene 2021: harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Res. 49 (D1): D1452-D1463. doi: 10.1093/nar/gkaa979 PMID: 33170273
    • Harper L., J. Campbell, E.K.S. Cannon, S. Jung, M. Poelchau, R. Walls, C. Andorf, E. Arnaud, T. Berardini, C. Birkett, S. Cannon, J. Carson, B. Condon, L. Cooper, N. Dunn, C. G. Elsik, A. Farmer, S.P. Ficklin, D. Grant, E. Grau, N. Herndon, Z.-L. Hu, J. Humann, P. Jaiswal, C. Jonquet, M.-A. Laporte, P. Larmande, G. Lazo, F. McCarthy, N. Menda, C. Mungall, M. Munoz-Torres, S. Naithani, et al. (2018). AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture. Database, 2018:1-32, doi: 10.1093/database/bay088 PMID: 30239679
    • Naithani S., R. Raja, E.N. Waddell, J. Elser, S. Gouthu, L.G. Deluc and P. Jaiswal (2014). VitisCyc, a metabolic pathways knowledgebase for grapevine (Vitis vinifera). Frontiers in Plant Science, 5: 1, doi: 10.3389/fpls.2014.00644. PMID: 25538713

Achchuthan Shanmugasundram, Genomics England Limited, London, UK

Achchuthan began his biocuration journey as a PhD student at the University of Liverpool, where he manually curated metabolic pathways for Apicomplexan parasites and developed a web database. Following a brief postdoctoral period focusing on genome annotation and curation, he returned to Liverpool as a biocurator, working on the genomes of neglected tropical parasites and fungal pathogens in the non-model organism database VEuPathDB. During this time, he played a pivotal role in developing VEuPathDB’s framework for phenotypic curation of fungal pathogens by leveraging existing OBO Foundry ontologies.

A strong proponent for data standardisation and sharing, Achchuthan often worked as the sole biocurator at VEuPathDB. He actively collaborated with other resources to better represent the taxon-specific biology of neglected species in ontologies such as Gene Ontology (GO). His contributions included submitting VEuPathDB’s annotations to GO and coordinating the integration of curated data from other resources into VEuPathDB.

Achchuthan contributed to outreach efforts, educating users on the importance of curated datasets, working on standardisation and integration of user curated datasets, and developing SOPs and frameworks for community curation of gene functions and phenotypic datasets in VEuPathDB. He also embedded curation and ontologies into his genomics teaching at Liverpool.

Achchuthan is currently working with Genomics England’s PanelApp, curating gene-disease associations within gene panels. His focus has always been on the complete data cycle, ensuring appropriate representation of curated data by contributing to the development and testing of new features. Despite transitioning through various roles, curation has remained at the heart of his career.

  • Publications
    • Shanmugasundram, A. Increasing gene coverage for developmental disorders in PanelApp using the Gene2Phenotype database. Genomics England bioinformatics blog series. 2 May 2024; https://www.genomicsengland.co.uk/blog/increasing-gene-coverage-for-developmental-disorders-in-panelapp-using-the-gene2phenotype-database.
    • Basenko, EY., Shanmugasundram, A., Boehme, U., Starns, D., Wilkinson, PA., Davison, HR., et al. What is new in FungiDB: a web-based bioinformatics platform for omics-scale data analysis for fungal and oomycete species. Genetics. 2024; 227(1): iyae035. PMID: 38529759.
    • Alvarez-Jarreta, J., Amos, B., Aurrecoechea, C., Bah, S., Barba, M., Barreto, A., et al. VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center in 2023. Nucleic Acids Research. 2024; 52(D1): D808-D816. PMID: 37953350.
    • Shanmugasundram, A., Starns, D., Boehme, U., Amos, B., Wilkinson, PA., Harb, OS., et al. TriTrypDB: An integrated functional genomics resource for kinetoplastida. PLoS Neglected Tropical Diseases. 2023; 17(1): e0011058. PMID: 36656904.
    • Shanmugasundram, A., Gonzalez-Galarza, FF., Wastling, JM., Vasieva, O., Jones, AR. Library of Apicomplexan Metabolic Pathways: a manually curated database for metabolic pathways of apicomplexan parasites. Nucleic Acids Research. 2013; 41(D1): D706-D713. PMID: 23193253.

Peter Uetz, Center for Biological Data Science, Virginia, USA

The nominee founded the EMBL Reptile Database as a graduate student in 1995, making it one of the first biodiversity databases worldwide. Uetz has been its lead curator since its inception, for almost 30 years. After the last contributor to the database, Ramu Chenna, left EMBL, the database moved to TIGR in 2008 and then to an independent site in 2010. Uetz has curated about 50,000 papers over the past 30 years for a database that receives more than 2000 queries per day and has been cited more than 4,000 times in the scientific literature. He has also trained numerous student curators, programmers, and other volunteers who have helped to maintain the database. The Reptile Database is now a key contributor and collaborator of many other data sources, such as the NCBI Taxonomy, iNaturalist (a citizen science project), IUCN (Red List of threatened species), the Catalogue of Life, and others. Uetz has published about a dozen papers about the database and its history. The database contains descriptions of ~12,000 species of reptiles, >20,000 images, geographic and taxonomic information, as well as links to about 55,000 references online. The nominee also maintains a mailing list of about 5,000 people who receive regular updates on reptile taxonomy and topics related to the Reptile Database. More recently, work on the database has included the establishment of standardized vocabularies (ontologies), the use of AI such as ChatGPT for data mining, and collaborations on image analysis and automated species recognition.

  • Publications
    • Uetz, P. & Etzold, T. 1996 The EMBL/EBI Reptile Database. Herpetological Review 27: 174-175
    • Uetz, P. & Stylianou, A. 2018 The original descriptions of reptiles and their subspecies. Zootaxa 4375 (2): 257-264, https://doi.org/10.11646/zootaxa.4375.2.5
    • Uetz P, Cherikh S, Shea G, Ineich I, Campbell PD, Doronin IV, Rosado J, Wynn A, Tighe KA, Mcdiarmid R, Lee JL, Köhler G, Ellis R, Doughty P, Raxworthy CJ, Scheinberg L, Resetar A, Sabaj M, Schneider G, Franzen M, Glaw F, Böhme W, Schweiger S, Gemel R, Couper P, Amey A, Dondorp E, Ofer G, Meiri S, Wallach V. 2019 A global catalog of primary reptile type specimens. Zootaxa 4695 (5): 438–450, https://doi.org/10.11646/zootaxa.4695.5.2
    • Uetz P, Koo MS, Aguilar R, Brings E, Catenazzi A, Chang AT, Chaitanya R, Freed P, Gross J, Hammermann M, Hosek J, Lambert M, Sergi Z, Spencer CL, Summers K, Tarvin R, Vredenburg VT, Wake DB 2021 A Quarter Century of Reptile and Amphibian Databases. Herpetological Review 52 (2): 246-255
    • Uetz, P.; Darko, Y.A.; Voss, O. 2023 Towards digital descriptions of all extant reptile species. Megataxa 010 (1): 027–042, https://doi.org/10.11646/megataxa.10.1.6

Excellence in Biocuration Early Career Award 2024 Nominations

The Early Career Award recognizes biocurators who have been involved in a biocuration-relevant field for less than 7 years. The nominees are in a non-leadership position and have made sustained contributions to the field of biocuration. The recipient will be required to present a 15 minute talk at a virtual Biocuration seminar and will be sent a prize of 500 CHF. The nominee does not have to be an active ISB member, as the award will include ISB membership for 1 year.

Voting will be open from 27 June – 25 July 2024

NOMINEES

The list of nominees is below. Scroll down for detailed descriptions.

  • Robert Giessmann, Institute for Globally Distributed Open Research and Education (IGDORE), Technical University Berlin, Germany
  • Scott V. Nguyen, American Type Culture Collection, University Blvd, Manassas, Virginia, USA
  • Maria Victoria Nugnes, University of Padova, Italy
  • Umasri Sankarlal, Freelance Biocurator, India

Detailed descriptions

Robert Giessmann, Institute for Globally Distributed Open Research and Education (IGDORE), Germany

Robert is the community facilitator and creator of the openTECR database (https://github.com/opentecr/opentecr). This community is creating an open database that simplifies access to and sharing of thermodynamic data for biochemical reactions, built by and for modellers and experimentalists alike.

Robert earns his living with other things, but voluntarily took up the effort to reach out to scientists interested in biothermodynamics. While everyone agreed that a database like openTECR would make sense, there was no momentum to make it happen as a collective, and Robert worked alone on the project from 2021 onwards. He collected >1000 primary publications, and visited many libraries to digitize old scientific articles himself. To grow a community, in late 2023 Robert received mentorship in Open Life Science’s “Open Seeds” program. He presented the project at several conferences, ran a globe-spanning online hackathon and finally, in 2024, invited the global biocuration community to community-curate the data. This invitation attracted 17 curators who contributed a total of 100 working hours over 4 months, all voluntarily; now 40 people are on openTECR’s mailing list. From time to time, Robert works in the lab to generate new data and publishes his lab notes immediately online under an open license.

Robert believes that the next wave of life science databases should be hosted in the open, with open data, open infrastructure, and open code. He is currently exploring how git and GitHub/Lab can serve as a replacement to traditional databases, including quality control and linking to other databases by CI/CD actions.


Scott V. Nguyen, American Type Culture Collection, University Blvd, Manassas, Virginia, USA

Scott is a senior biocurator for the American Type Culture Collection Genome Portal (AGP). He ensures that the pedigree of genome assemblies in the AGP is directly traceable to physical materials in the repository. Sequencing data is also paired with historical metadata within the nearly century-old collection, which spans handwritten notes and correspondence from depositors through to modern digital records.
Scott is also a volunteer biocurator for the Yersinia section of EnteroBase (https://enterobase.warwick.ac.uk/species/index/yersinia), a genomic database for enteric pathogens that provides core genome multilocus sequencing typing (cgMLST) to help researchers identify population structures for important pathogens.
Prior to joining the ATCC as a biocurator, Scott curated epidemiological metadata and submitted nearly 6500 SARS-CoV-2 genomes to the GISAID database on behalf of the District of Columbia Public Health Laboratory. These genomes and associated metadata helped inform epidemiologists of current SARS-CoV-2 trends within the Washington, DC metropolitan region. In addition to contributing to GISAID, he also actively researched emerging coronavirus lineages and discovered three new variants for the SARS-CoV-2 Pango lineages.
One Pango variant, the XD Delta-Omicron recombinant, was monitored by the WHO as a variant under monitoring (VUM) in 2022 (https://www.nytimes.com/2022/03/11/science/deltacron-coronavirus-variant.html). Through this work, he also informally works with other SARS-CoV-2 variant hunters across the globe and helps volunteer scientists gain access to the GISAID database to help track emerging variants (https://x.com/LongDesertTrain/status/1783670135103926697). Additionally, he mentored undergraduate interns as a volunteer with the Metropolitan Washington Council of Governments to monitor SARS-CoV-2 trends within the Washington, DC area with GISAID data.

  • Publications
    • The ATCC genome portal: 3,938 authenticated microbial reference genomes. Genomics and Proteomics. Scott V Nguyen, Nikhita P Puthuveetil, Joseph R Petrone, Jade L Kirkland, Kaitlyn Gaffney, Corina L Tabron, Noah Wax, James Duncan, Stephen King, Robert Marlow, Amy L Reese, David A Yarmosh, Hannah H McConnell, Ana S Fernandes, John Bagnoli, Briana Benton, Jonathan L Jacobs. Microbiol Resour Announc. PMID:38289057
    • Rapid characterization of a Delta-Omicron SARS-CoV-2 recombinant detected in Europe. Preprint. https://doi.org/10.21203/rs.3.rs-1502293/v1

Maria Victoria Nugnes, University of Padova, Italy

Maria Victoria is a Research Fellow at the University of Padova and the primary expert curator of the DisProt database. Throughout her career as a biocurator, she has demonstrated exceptional dedication, remarkable skill, and a profound impact on the quality and content growth of the DisProt database over the past three years. Her contributions include the manual curation of over 800 intrinsically disordered proteins (IDPs), more than 4,000 disordered regions, and 1,200 publications (https://apicuron.org/curators/0000-0001-8399-7907). She has also reviewed more than 1,500 disordered regions. Additionally, she contributes to constructing thematic datasets for the characterization of IDPs in biological processes and diseases, including the dataset for the Critical Assessment of IDPs prediction (CAID) – Round 2. She contributes to updating Gene Ontology and IDPs Ontology, with new terms for disorder states and functions. She is co-first author of the latest DisProt publication (Aspromonte MC, Nugnes MV et al., NAR 2023) and a collaborator in defining best practices (Quaglia F et al., Database 2023) for the curation of IDPs in DisProt. Victoria is the Minimal Reporting Requirements Coordinator of the HUPO-PSI IDP Working Group. In addition, she has also made significant contributions to the community of curators, both in their engagement and in their training. She is very careful with their education, constantly involved in virtual and in-person training sessions on curation activities. These include a recorded Spanish DisProt Biocuration course in ELIXIR-SI eLearning platform, Virtual training and Workshop in 4th REFRACT Annual Latin America Visit.

  • Publications
    • Best practices for the manual curation of intrinsically disordered proteins in DisProt. Federica Quaglia, Anastasia Chasapi, Maria Victoria Nugnes, Maria Cristina Aspromonte, Emanuela Leonardi, Damiano Piovesan, Silvio C E Tosatto. Database (Oxford). PMID:38507044
    • PED in 2024: improving the community deposition of structural ensembles for intrinsically disordered proteins. Hamidreza Ghafouri, Tamas Lazar, Alessio Del Conte, Luiggi G Tenorio Ku; PED Consortium; Peter Tompa, Silvio C E Tosatto, Alexander Miguel Monzon. Nucleic Acids Res. PMID:37904608
    • DisProt in 2024: improving function annotation of intrinsically disordered proteins. Maria Cristina Aspromonte, Maria Victoria Nugnes, Federica Quaglia, Adel Bouharoua; DisProt Consortium; Silvio C E Tosatto, Damiano Piovesan. Nucleic Acids Res. PMID:37904585

Umasri Sankarlal, Freelance Biocurator, India

Her career started in 2004, in a role entering data from the scientific literature to build comprehensive databases for biomarkers, biological pathways, and chemical compounds. They were not familiar with “Biocuration” or the role of “Biocurator” then, yet she has honed her skills in interpreting curated data and has been a major contributor to Fruiteomics, SNP, GeneSeq, Pathway, and neurology-specific databases.
During her career break between 2007 and 2010, she worked as a freelance mentor for a client, introducing the ontology concept to their developers, sharing ideas on developing an ontology for online databases, and supporting them with the necessary datasets for their project. She also worked as a consultant with Thomson Reuters, Chennai on their drug Forecasting database.
After her career break, she joined another Bio-IT firm and worked to develop drug target ontology at a higher level that was used in developing a platform for effective information retrieval, extraction, and visualisation from scientific literature. This method and platform were patented, and she is one of the authors. She was one of the contributors to developing “DrugMechDB,” a curated database of drug mechanisms. Presently, she is volunteering as a member of the “ClinGen Intellectual Disability and Autism Gene Curation Working Group panel” and publishing the curated genes on the ClinGen portal.
Even though she switched her career to the IT industry by 2015, she is proud to make significant contributions to Biocuration whenever she gets an opportunity.

  • Publications

Call for Proposals to host the 2026 International Biocuration Conference.

Dear Colleagues,

The Executive Committee of the International Society for Biocuration (ISB) would like to once again invite tenders to host the 19th International Biocuration Conference in Europe during the Northern Spring or Summer of 2026.

Individuals and organizations interested in applying may do so by sending a proposal to the ISB Executive Committee (intsocbio@gmail.com) on or before August 31st, 2024.

The successful bidder will be notified by October 1st, 2024. The ISB Executive Committee will publicly announce the selected organization or individuals during the 18th International Biocuration Conference, held in Kansas City, MO, USA in April, 2025.

Format:

Proposals should be short; length should not exceed one side of an A4 or US letter size sheet, using 11 point font. The proposal should contain:

  • The name and institution of the local organizer
  • Details of the proposed venue for at least 150 participants. If the venue has less space, please provide plans for hybrid attendance. Typical numbers have not exceeded 350 participants.
  • The range of dates available for the conference. Previous conferences typically have 3-4 days of main conference agenda and 1-2 days of workshops. Dates should not overlap with local holidays.
  • A brief outline of a strategic plan to attract a broad range of participants from the Biocuration community
  • As fair gender representation is positively encouraged by the ISB, we would also like to know how the applicant intends to accomplish this.

In a continued effort to bring our meeting to curators in all geographic regions, we strongly encourage ISB members in Europe and Africa to put forward proposals to bring the ISB meeting to your region once again, or for the first time!

REGIONS ROTATION: 

  • North and South America
  • Europe and Africa
  • Asia and Australasia

This Call for Applications is also available on the ISB website at https://www.biocuration.org/events-and-conferences. For more information about the ISB and our previous conferences, please visit http://www.biocuration.org.

We look forward to hearing from you!

Your colleagues at the ISB Executive Committee.

Announcement for 2023 winners of “Excellence in Biocuration Awards”

We are pleased to announce the winners of the “Excellence in Biocuration Awards” for the year 2023 in two categories:

Charlie is in 3/4 profile playing a guitar with his mouth open singing into a mic with an orange covering. He is wearing a black t-shirt and orange slacks.

Early Career Award – Charles Tapley Hoyt, Laboratory of Systems Pharmacology Harvard Medical School, MA, USA

Charlie has been so amazingly busy in such a short amount of time. He is a Research Fellow at Harvard Medical School and the primary curator, developer, and maintainer of several community datasets and databases. These include Bioregistry, which promotes standardization of prefixes, CURIEs, and URIs when used to reference entities/concepts in the life sciences. He contributes to Biomappings, which provides mappings between named biological entities. He created Chemical Roles Graph, which curates mechanistic relations between small molecules and biological processes, pathways, and diseases in an ontological framework. He is a frequent contributor to other curated datasets, and promotes the concept of Drive-by Curation and of progressive governance models to enable community curation and strengthen project sustainability.

Charlie actively contributes to community efforts. He is an active member of the OBO Foundry ontology community, where he focuses on promoting standardization of semantics and better curation and coding practices through continuous integration/continuous development and social workflows, and promotes more granular attribution and explicit/transparent licensing to better enable reuse.

He contributes to standards development including the SSSOM standard, a simple standard for sharing ontology mappings that includes explicit semantics and provenance, and is a member of Biological Expression Language (BEL), a domain-specific language for representing causal, correlative, and associative relationships between biomedical entities as well as their associated contextual and provenance annotations.

He has mentored a large number of students at the Institute for Algorithms and Scientific Computing (SCAI) and is frequently available to lend support to public projects such as PyBEL and PyKEEN.

Charlie is actively engaged in the Biocuration community: he co-chaired the Biocuration 2023 conference in Padova, Italy, and served on the organizing committee of the Virtual Biocuration 2022 Conference. His charisma, enthusiasm and energy are rare and invaluable to our community, and he deserves to be celebrated.

Advanced Career Award – Nicolas Matentzoglu, Monarch Initiative, Semanticly, Greece

The head and shoulders of Nico appear in front of a background of blue sky and green trees. Nico is wearing a dark blue shirt.

Nico is celebrated in the bio-ontology and biocuration community as a passionate promoter of open science and a champion of curators and ontology editors. He generously shares his extensive knowledge of semantic and ontology engineering, and works tirelessly to drive complex collaborations involving many different stakeholders.

Nico co-leads the OBO Academy, which brings together extensive yet highly accessible training material on ontologies and related topics through collaboratively authored online material as well as curated seminars, tutorials, and courses. This material has been used extensively by many curators to help them master everything from ontology development to writing queries to retrieve biological data.

Thanks to Nico’s vision and technical oversight, the Ontology Development Kit (ODK) has enabled the editors of dozens of bio-ontologies to utilize powerful automated workflows for maintaining, QC-ing, and releasing their products with ease. The ODK has had a huge positive impact on ontology standardization.

Nico leads the development of the widely used Simple Standard for Sharing Ontological Mappings (SSSOM), involving years of painstaking standards work, driving consensus on key design and modeling issues. He also led the efforts to unify multiple phenotype ontologies (Mammalian (MP), Zebrafish (ZP), Human (HPO), Ontology of Biological Attributes (OBA)) through common design patterns.

Nico has recruited and encouraged a diverse range of contributors (researchers, government officials, clinicians as well as ontology developers) to grow and unite our community, promote open science, and provide mentorship. He is the ultimate team player and demonstrates unwavering positive energy and dedication to our community.

Thank you to the Award subcommittee:

  • Nicole Vasilevsky
  • Parul Gupta
  • Susan Bello
  • Ruth Lovering

Many thanks to the ISB members for voting!

Pascale Gaudet and Sandra Orchard – Recipients of the 2023 Exceptional Contribution to Biocuration Award

It is our great pleasure to announce the recipients of the 2023 Exceptional Contribution to Biocuration Award. The voting this year resulted in a tie, and thus we have two recipients:

Pascale Gaudet, Swiss Institute of Bioinformatics, Switzerland

Pascale Gaudet has worked in the biocuration field for over 19 years, first at DictyBase and more recently at NextProt and the Gene Ontology Consortium. Pascale is currently the GO Project Manager and oversees all editorial content. She has worked continually not only to improve the Gene Ontology structure and formalization, but also to drive the project to produce high-quality phylogenetically inferred GO annotations using the PAINT annotation system. The PAINT annotations are much more specific than existing annotations from automated sources, because they can be refined on a family-by-family and even gene-by-gene basis. This system now provides over 3.5 million annotations in the GOA annotation database.

Pascale is working constantly to refine legacy and dormant annotations across the ontology, and with multiple collaborating groups to refine both the ontology and the annotations to ensure that both are fit for purpose. She is driving the coordination of overhauls in many areas of the GO ontology, including multi-species processes, transcription, and chromatin remodeling, tackling each area with insight and attention to detail but never failing to see the bigger picture. She has been key to the communication between different interested groups and manages the numerous discussions with efficiency. This is work that almost every bench biologist depends on to some degree, but it is largely unrecognized because it depends on thousands of incremental tasks that are not usually attributed or described in publications.

Sandra Orchard, EBI, Hinxton, Cambridge, UK

Sandra has worked tirelessly for the biocuration community for over 20 years. She is currently the Team Leader for Protein Function Content at UniProt (https://www.uniprot.org), and is therefore responsible for a major part of probably the most used biological database in the world. In this role, she also maintains two other key interfaces: the Complex Portal (https://www.ebi.ac.uk/complexportal) and the Enzyme Portal (https://www.ebi.ac.uk/enzymeportal/). Previously, Sandra led the IntAct molecular interaction database (https://www.ebi.ac.uk/intact) and managed the IMEx consortium of collaborating interaction databases. She has also been key in establishing standards within the proteomics community, and has made significant contributions to the InterPro database and the Gene Ontology. Sandra has always been a strong proponent of FAIR principles, education and the biocuration community: she has chaired and/or contributed to numerous biocuration-related committees; she established the first formal educational qualification in biocuration (PgCert at the University of Cambridge); and she has been a long-time supporter of the ISB, serving as treasurer from 2015 to 2018 and chair from 2018 to 2020. Sandra has published ~200 papers on biocuration methods, standards and databases, which serves as a measure of her impact and importance both to the biocuration community and to the researchers who depend on the many resources to which she has contributed.

Congratulations Pascale and Sandra!

Thank you to the Awards Committee:

  • Nicole Vasilevsky
  • Parul Gupta
  • Susan Bello
  • Ruth Lovering

Many thanks to ISB members for voting!
