Biocuration: On the Front Line of Structural Biology
As they work from home during the COVID-19 pandemic, the RCSB PDB Biocuration Team reflects on the privilege and responsibility of being a biocurator.
Our diverse RCSB PDB biocuration team represents advanced training in X-ray crystallography, NMR spectroscopy, and 3DEM. Collectively, we have expertise in biochemistry, biophysics, computational chemistry, enzymology, and small molecule crystallography.
We biocurators see ourselves at the very front line of structural biology. Our job is to ensure the quality of the data in the PDB archive. As partners in the Worldwide Protein Data Bank, we are responsible for reviewing structure quality, standardizing format, and adding the metadata annotations that enable research by our PDB users. Biocurators are the connection between the scientists who submit PDB structures and the public who appreciate them. Our ultimate goal is to support research, training, and education worldwide and facilitate public awareness and understanding of these scientific discoveries.
Together with our collaborators at PDBe and PDBj, the RCSB PDB Biocuration team has reviewed, annotated, and released many milestone structures.
In 2008, we celebrated the Nobel Prize for green fluorescent protein. Dozens of new GFP and GFP-like PDB structures are deposited and annotated each year.
In 2009, the Nobel Prize for ribosome structures reminded biocurators of the importance of reviewing each of the many chains of the ribosomal RNA and its associated ribosomal proteins.
When we received the first XFEL structure (3pcq) in 2010, we were astonished to see that more than 15,000 crystals were involved. This motivated the team to learn more about this technique and to improve our deposition system to capture statistics unique to the method.
In 2012, we celebrated the Nobel Prize for G Protein-Coupled Receptors (GPCR). GPCRs are a large family of membrane-embedded receptors, with structural features that have been preserved through the course of evolution. GPCRs are at the center of signaling pathways that control all manner of essential processes, ranging from vision to carcinogenesis, and thus are important targets for therapeutic intervention.
In 2013, we received the structure of the HIV-1 capsid (3j3q), the structure containing the most atoms (over 2.4 million) ever deposited. It was so large that the team needed days to process it.
In January 2020, just days after 10 million people in Wuhan City were quarantined, we received the structure deposition of the COVID-19 main protease, and worked rapidly to enable public access to this structure of a potential drug target. Since then, more than 150 SARS-CoV-2 structures have been deposited and carefully curated by biocurators around the world. Biocuration of these structures is a priority, as they are being used to develop new drugs and new approaches to creating vaccines to fight coronaviruses.
We are biocurators with both sense and sensibility: we have a passion for structural biology and honor the responsibility of our role. Macromolecular structures are exquisite in their utility and intricacy, and we appreciate the privilege of our early access to their biological stories. We admire both the artistry of nature and the creativity and persistence of the scientists who have observed and recorded these structures over the nearly fifty years of PDB archiving. Biocuration is a vocation that requires knowledge, propriety, thoughtfulness, and carefulness, but most importantly it requires respect for data. It is a pleasure to work with our depositor community, and to share their work with the world.
We don’t know how science will evolve in the years to come, but we are certain that it will be exciting, that there will be more scientific data to curate, and that we will do our best to serve the community.
For more on biocuration, see The rewards of working as a data wrangler by Maggie Kuo
Science Careers (2017) doi: 10.1126/science.caredit.aaq0481