Metrics Describing Progress of the Protein Structure Initiative
Updated December 01, 2011
I. Progress of the PSI
II. Number of Experimental Structures and Residues
III. Impact and Classification of PSI-2 Structures
IV. Novel Modeling Leverage
V. Biological Theme Targets & Structures
- Biological Theme Targets & Structures by PSI Center
- Biomedical Targets & Structures by PSI Center
- Metagenomic Targets & Structures by PSI Center
- Community Nominated Targets & Structures by PSI Center
VI. PSI PFAM Domain Family Coverage
Glossary of Terms
- BIG: BioInformatics Group. A team of bioinformaticians from PSI-2 large-scale production centers that coordinates target selection and progress evaluation
- ALL-PSI: All PSI centers (PSI-1, PSI-2, PSI-3)
- ALL PDB: Entire Protein Data Bank
- PSI-1: The pilot phase of the Protein Structure Initiative (PSI), which ran from 09-01-2000 to 06-30-2005
- PSI-2: The production phase of the Protein Structure Initiative (PSI) (07-01-2005 - ongoing). Statistics on this page reflect PSI-2 data deposited after 07-01-2005 and released before 12-01-2011
- PSI-3: The PSI:Biology program of the Protein Structure Initiative (PSI) (07-01-2010 - ongoing). Statistics on this page reflect PSI-3 data deposited after 07-01-2010 and released before 12-01-2011
- LSC: The four Protein Structure Initiative (PSI) large-scale production centers, namely MCSG, JCSG, NESG, and NYSGXRC/NYSGRC
- Total Structures: Total number of structures in the PDB at the time of deposition. This includes multiple structures of the same protein sequence determined by different methods (e.g., NMR versus X-ray crystallography), in different crystal forms, in different solution conditions, or bound to different ligands.
- Distinct Structures: Total number of structures with non-redundant sequences (less than 98% sequence identity)
- Distinct Residues: Total number of residues in structures with non-redundant sequences (less than 98% sequence identity)
- Novel Structures: Total number of novel structures with less than 30% sequence identity to an existing structure at the time of PDB deposition
- Novel Residues: Total number of residues in structures with less than 30% sequence identity to an existing structure at the time of PDB deposition
- X-Ray Structures: Total number of structures determined using X-Ray crystallography
- NMR Structures: Total number of structures determined using NMR methods
- Membrane Proteins: Total number of structures of membrane proteins
- Eukaryotic Proteins: Total number of structures from eukaryotic organisms
- Prokaryotic Proteins: Total number of structures from prokaryotic organisms
- Human Proteins: Total number of structures of human proteins
- Other Proteins: Total number of structures of viral and unknown source proteins
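The two identity thresholds in the glossary (98% for distinct, 30% for novel) can be illustrated with a minimal sketch. It assumes that pairwise sequence identities are already available from some alignment tool; the function names, the greedy selection strategy, and the toy identity measure are hypothetical and are not the procedure used to produce the numbers on this page.

```python
from typing import Callable, List, Sequence

DISTINCT_CUTOFF = 98.0  # percent identity, from the glossary
NOVEL_CUTOFF = 30.0     # percent identity, from the glossary


def is_novel(max_identity_to_existing: float) -> bool:
    """True if the best match among existing structures is below 30% identity."""
    return max_identity_to_existing < NOVEL_CUTOFF


def select_distinct(
    sequences: Sequence[str],
    identity: Callable[[str, str], float],
) -> List[str]:
    """Greedy non-redundant selection (an assumed strategy): keep a sequence
    only if it is below 98% identical to every sequence already kept."""
    kept: List[str] = []
    for seq in sequences:
        if all(identity(seq, other) < DISTINCT_CUTOFF for other in kept):
            kept.append(seq)
    return kept


def toy_identity(a: str, b: str) -> float:
    """Hypothetical stand-in for a real alignment: percent of matching
    positions over the longer sequence, with no gaps considered."""
    if not a and not b:
        return 100.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / max(len(a), len(b))
```

With a real identity function in place of `toy_identity`, the length of the list returned by `select_distinct` would correspond to a "Distinct Structures" style count, and summing the sequence lengths would give a "Distinct Residues" style count.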
I. Progress of the PSI
PDB deposition statistics
| PSI grant period | LSC | ALL-PSI |
| PSI-2 grant year 2005 | 390 | 427 |
| PSI-2 grant year 2006 | 620 | 700 |
| PSI-2 grant year 2007 | 682 | 718 |
| PSI-2 grant year 2008 | 796 | 818 |
| PSI-2 grant year 2009 | 822 | 879 |
| PSI-2 grant extension | 217 | 244 |
| PSI-2 total depositions | 3527 | 3786 |
| PSI-3 grant year 2010 | 280 | 283 |
| PSI-3 total depositions | 377 | 381 |
| PSI-1 + PSI-2 + PSI-3 | 4804 | 5583 |
PSI-2 grant year 2005:
PSI-2 structures deposited between July 1, 2005 and July 1, 2006
PSI-2 grant year 2006:
PSI-2 structures deposited between July 1, 2006 and July 1, 2007
PSI-2 grant year 2007:
PSI-2 structures deposited between July 1, 2007 and July 1, 2008
PSI-2 grant year 2008:
PSI-2 structures deposited between July 1, 2008 and July 1, 2009
PSI-2 grant year 2009:
PSI-2 structures deposited between July 1, 2009 and July 1, 2010
PSI-2 grant extension:
PSI-2 structures deposited between July 1, 2010 and December 01, 2011
PSI-2 total depositions:
PSI-2 structures deposited between July 1, 2005 and December 01, 2011
PSI-3 grant year 2010:
PSI-3 structures deposited between July 1, 2010 and July 1, 2011
PSI-3 total depositions:
PSI-3 structures deposited between July 1, 2010 and December 01, 2011
PSI-1 + PSI-2 + PSI-3:
All PSI structures released before December 01, 2011
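The PSI-2 deposition windows listed above can be expressed as a small lookup. This is a sketch only: the page does not say whether the boundary dates are inclusive, so the code assumes half-open windows (start included, end excluded), and the function name `psi2_period` is hypothetical. Because the PSI-3 windows overlap the PSI-2 grant extension in time, the function applies only to structures already known to come from PSI-2.

```python
from datetime import date
from typing import Optional

# (label, start, end) copied from the definitions above.
# Assumption: each window is half-open, i.e. start <= deposited < end.
PSI2_PERIODS = [
    ("PSI-2 grant year 2005", date(2005, 7, 1), date(2006, 7, 1)),
    ("PSI-2 grant year 2006", date(2006, 7, 1), date(2007, 7, 1)),
    ("PSI-2 grant year 2007", date(2007, 7, 1), date(2008, 7, 1)),
    ("PSI-2 grant year 2008", date(2008, 7, 1), date(2009, 7, 1)),
    ("PSI-2 grant year 2009", date(2009, 7, 1), date(2010, 7, 1)),
    ("PSI-2 grant extension", date(2010, 7, 1), date(2011, 12, 1)),
]


def psi2_period(deposited: date) -> Optional[str]:
    """Return the PSI-2 period label for a deposition date, or None if the
    date falls outside every window listed on this page."""
    for label, start, end in PSI2_PERIODS:
        if start <= deposited < end:
            return label
    return None
```

Counting the labels returned for a list of PSI-2 deposition dates would reproduce the layout of the deposition table, one row per period.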
Numbers of experimental structures from PSI
Start of PSI Deposition:
Sep 2000: Structures deposited before PSI grant initiation (2000-09-01)
Each subsequent bar shows the total number of PSI structures deposited as of the first day of the indicated month.
For example, there were 96 PSI-derived structures deposited as of September 1, 2001.
II. Number of Experimental Structures and Residues
Structures determined in PSI-1
| Center | Total Structures | Distinct Structures | Distinct Residues | Novel Structures | Novel Residues |
| LSC | 898 | 792 | 180109 | 453 | 99731 |
| ALL-PSI-1 | 1416 | 1157 | 265929 | 622 | 137025 |
| ALL PDB | 19865 | 7111 | 1704005 | 2735 | 645949 |
Calculations in this table are based on PDB data deposited before July 1, 2005
Structures determined in PSI-2
| Center | Total Structures | Distinct Structures | Distinct Residues | Novel Structures | Novel Residues |
| LSC | 3527 | 3166 | 719603 | 1957 | 441959 |
| ALL-PSI-2 | 3786 | 3271 | 744587 | 2018 | 456021 |
| ALL PDB | 43100 | 15604 | 3975602 | 5922 | 1435827 |
Calculations in this table are based on PDB data deposited after July 1, 2005 and released before December 01, 2011.
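One way to compare the rows of the PSI-2 table is the share of total structures that are novel. The sketch below copies the counts from the table; the choice of "Novel Structures" divided by "Total Structures" as the ratio is an assumption made for illustration, not a metric defined on this page.

```python
# (total structures, novel structures) copied from the PSI-2 table above.
PSI2_ROWS = {
    "LSC": (3527, 1957),
    "ALL-PSI-2": (3786, 2018),
    "ALL PDB": (43100, 5922),
}


def novel_share(total: int, novel: int) -> float:
    """Fraction of total structures counted as novel (assumed ratio)."""
    return novel / total


if __name__ == "__main__":
    for name, (total, novel) in PSI2_ROWS.items():
        print(f"{name}: {novel_share(total, novel):.1%} novel")
```

The same function can be applied to the PSI-1 table by swapping in its Total Structures and Novel Structures columns.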
III. Impact and Classification of PSI-2 Structures
Calculations in this section are based on PDB data deposited after July 1, 2005 and released before December 01, 2011.
Classification of PSI-2 structures
| Center | Total Structures | X-ray | NMR | Membrane Proteins | Eukaryotes | Human | Other | Prokaryotes |
| LSC | 3527 | 3159 | 368 | 24 | 185 | 100 | 46 | 3274 |
| ALL-PSI-2 | 3786 | 3380 | 406 | 81 | 345 | 147 | 58 | 3360 |
Classification of PSI-2 structures by organism
| Eukaryotes Total | 345 |
| Prokaryotes Total | 3360 |
| Other Organisms Total | 58 |
Detailed PSI-2 structure counts for eukaryotes
| Eukaryotes Total | 345 |
| Aequorea victoria | 1 |
| Anopheles gambiae | 2 |
| Antirrhinum majus | 1 |
| Arabidopsis thaliana | 59 |
| Ashbya gossypii | 2 |
| Aspergillus fumigatus | 1 |
| Aspergillus oryzae | 2 |
| Babesia bovis | 1 |
| Bos taurus | 3 |
| Brugia malayi | 1 |
| Caenorhabditis elegans | 11 |
| Candida albicans | 1 |
| Candida glabrata | 1 |
| Coccidioides immitis | 1 |
| Cyanidioschyzon merolae | 1 |
| Danio rerio | 7 |
| Drosophila melanogaster | 3 |
| Encephalitozoon cuniculi | 4 |
| Engyodontium album | 1 |
| Galdieria sulphuraria | 7 |
| Gibberella zeae | 2 |
| Homo sapiens | 147 |
| Mus musculus | 35 |
| Oncorhynchus mykiss | 1 |
| Oryza sativa | 1 |
| Pentadiplandra brazzeana | 1 |
| Pichia guilliermondii | 1 |
| Rana pipiens | 2 |
| Rattus norvegicus | 5 |
| Saccharomyces cerevisiae | 30 |
| Schizosaccharomyces pombe | 2 |
| Solanum lycopersicum | 1 |
| Spinacia oleracea | 1 |
| Sus scrofa | 1 |
| Toxoplasma gondii | 3 |
| Trypanosoma brucei | 1 |
| Xenopus laevis | 1 |
Detailed PSI-2 structure counts for prokaryotes
| Prokaryotes Total | 3360 |
Detailed PSI-2 structure counts for other organisms
| Other Organisms Total | 58 |
| Chimera | 1 |
| Rattus norvegicus, Saccharomyces cerevisiae | 1 |
| Unknown | 21 |
| Artificial gene | 4 |
| Uncultured marine organism | 6 |
| Unidentified | 11 |
| Virus | 36 |
| Bacteriophage P1 | 1 |
| Homo sapiens, Bacteriophage T4 | 1 |
| Homo sapiens, Enterobacteria phage T4 | 9 |
| Homo sapiens, Enterobacteria phage T4, Homo sapiens | 1 |
| Influenza A virus | 2 |
| Mengo virus | 1 |
| Moloney murine leukemia virus | 1 |
| Murid herpesvirus 4 | 2 |
| Pseudomonas phage LUZ7 | 1 |
| Pseudomonas phage phi12 | 1 |
| Pseudomonas phage YuA | 1 |
| SARS coronavirus | 13 |
| Staphylococcus phage | 1 |
| Vaccinia virus WR | 1 |
IV. Novel Modeling Leverage
Calculations in this section updated October 5, 2011
Modeling leverage calculations are provided by Dr. Lukasz Jaroszewski, affiliated with JCSG
Leverage provided by PSI-2 structures
| Center | Total Leverage | Total Residue Leverage | Novel Leverage | Novel Residue Leverage |
| LSC | 1233729 | 304119628 | 250016 | 64765245 |
| PSI-2 | 1462817 | 352173350 | 266758 | 69997509 |
| PDB excluding PSI-2 | 3828929 | 1115315109 | 635537 | 241783799 |
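The leverage table lends itself to a simple share calculation. The sketch below assumes that the "PSI-2" and "PDB excluding PSI-2" rows are disjoint, so that their sum represents the whole PDB; that reading follows from the row names but is not stated explicitly on this page, and the function name is hypothetical.

```python
# (total leverage, novel leverage) copied from the leverage table above.
PSI2 = (1462817, 266758)
PDB_EXCLUDING_PSI2 = (3828929, 635537)


def psi2_share(psi2_value: int, rest_value: int) -> float:
    """PSI-2 contribution as a fraction of PSI-2 plus the rest of the PDB.
    Assumes the two rows do not overlap."""
    return psi2_value / (psi2_value + rest_value)


total_share = psi2_share(PSI2[0], PDB_EXCLUDING_PSI2[0])
novel_share = psi2_share(PSI2[1], PDB_EXCLUDING_PSI2[1])

if __name__ == "__main__":
    print(f"PSI-2 share of total leverage: {total_share:.1%}")
    print(f"PSI-2 share of novel leverage: {novel_share:.1%}")
```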
V. Biological Theme Targets & Structures
Calculations in this section updated December 01, 2011
Biological Theme Targets by PSI Center
| Center | Biomedical Targets | Metagenomics Targets | Community Nominated Targets |
| ATCG3D | 7 | 0 | 0 |
| CESG | 983 | 0 | 442 |
| CHTSB | 129 | 0 | 0 |
| GPCR | 0 | 0 | 3 |
| ISFI | 0 | 0 | 428 |
| JCSG | 21843 | 2436 | 2078 |
| MCSG | 4318 | 1500 | 1945 |
| MPP | 232 | 0 | 284 |
| NESG | 8561 | 1847 | 811 |
| NYCOMPS | 0 | 0 | 532 |
| NYSGRC | 9593 | 0 | 302 |
| NYSGXRC | 1335 | 485 | 1489 |
| TMPC | 48 | 0 | 49 |
| Total | 54744 | 6268 | 11410 |
Biomedical Targets & Structures by PSI Center
| Center | Total Targets | Cloned Targets | Expressed Targets | Purified Targets | Crystallized Targets | NMR Targets | Targets In PDB | Structures In PDB |
| ATCG3D | 7 | 7 | 7 | 7 | 6 | 0 | 6 | 8 |
| CESG | 983 | 770 | 689 | 173 | 32 | 3 | 10 | 8 |
| CHTSB | 129 | 124 | 29 | 10 | 0 | 0 | 5 | 21 |
| JCSG | 21843 | 21108 | 12469 | 12467 | 1857 | 0 | 738 | 798 |
| MCSG | 4318 | 3515 | 2595 | 1265 | 287 | 0 | 276 | 304 |
| MPP | 232 | 51 | 47 | 5 | 0 | 0 | 0 | 0 |
| NESG | 8561 | 2578 | 2418 | 736 | 50 | 89 | 129 | 141 |
| NYSGRC | 9593 | 2089 | 1794 | 481 | 51 | 0 | 41 | 43 |
| NYSGXRC | 1335 | 1010 | 911 | 462 | 107 | 0 | 69 | 77 |
| TMPC | 48 | 32 | 26 | 10 | 0 | 0 | 0 | 0 |
| Total | 54744 | 35893 | 24306 | 16833 | 2390 | 92 | 1274 | 1711 |
Metagenomic Targets & Structures by PSI Center
| Center | Total Targets | Cloned Targets | Expressed Targets | Purified Targets | Crystallized Targets | NMR Targets | Targets In PDB | Structures In PDB |
| JCSG | 2436 | 2348 | 1625 | 1625 | 387 | 0 | 113 | 116 |
| MCSG | 1500 | 1430 | 860 | 323 | 70 | 0 | 64 | 71 |
| NESG | 1847 | 1127 | 1123 | 409 | 45 | 28 | 67 | 77 |
| NYSGXRC | 485 | 451 | 396 | 207 | 52 | 0 | 31 | 35 |
| Total | 6268 | 5356 | 4004 | 2564 | 554 | 28 | 275 | 299 |
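The target tables in this section can be read as a pipeline funnel. The sketch below uses the metagenomic counts above to compute the fraction of targets that survive each step and to check that the per-center rows add up to the Total row. It assumes the stages are cumulative (a purified target was also cloned and expressed) and leaves out the Crystallized and NMR columns, which are parallel routes rather than sequential stages; the function names are hypothetical.

```python
from typing import Dict, List, Optional

# Sequential stages used in this sketch.
STAGES = ["Total", "Cloned", "Expressed", "Purified", "Targets In PDB"]

# Counts copied from the metagenomic table above, in STAGES order.
METAGENOMIC: Dict[str, List[int]] = {
    "JCSG": [2436, 2348, 1625, 1625, 113],
    "MCSG": [1500, 1430, 860, 323, 64],
    "NESG": [1847, 1127, 1123, 409, 67],
    "NYSGXRC": [485, 451, 396, 207, 31],
}
TOTAL_ROW = [6268, 5356, 4004, 2564, 275]


def step_rates(counts: List[int]) -> List[Optional[float]]:
    """Fraction of targets surviving each stage-to-stage step.
    A step whose starting count is zero yields None."""
    return [
        (after / before) if before else None
        for before, after in zip(counts, counts[1:])
    ]


def column_totals(rows: Dict[str, List[int]]) -> List[int]:
    """Sum each stage column across all centers."""
    return [sum(column) for column in zip(*rows.values())]


if __name__ == "__main__":
    for center, counts in METAGENOMIC.items():
        rates = ", ".join(
            "n/a" if r is None else f"{r:.1%}" for r in step_rates(counts)
        )
        print(f"{center}: {rates}")
    print("Columns match Total row:", column_totals(METAGENOMIC) == TOTAL_ROW)
```

The zero guard in `step_rates` matters for the other two target tables, where some centers report zero targets at a stage.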
Community Nominated Targets & Structures by PSI Center
| Center | Total Targets | Cloned Targets | Expressed Targets | Purified Targets | Crystallized Targets | NMR Targets | Targets In PDB | Structures In PDB |
| CESG | 442 | 313 | 243 | 81 | 39 | 12 | 30 | 40 |
| GPCR | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 |
| ISFI | 428 | 330 | 244 | 210 | 175 | 0 | 129 | 13 |
| JCSG | 2078 | 1992 | 1573 | 1573 | 363 | 0 | 111 | 113 |
| MCSG | 1945 | 1614 | 1304 | 581 | 128 | 0 | 49 | 50 |
| MPP | 284 | 180 | 165 | 28 | 0 | 0 | 0 | 0 |
| NESG | 811 | 564 | 523 | 292 | 49 | 38 | 83 | 97 |
| NYCOMPS | 532 | 362 | 90 | 43 | 0 | 0 | 0 | 0 |
| NYSGRC | 302 | 302 | 297 | 56 | 17 | 0 | 11 | 14 |
| NYSGXRC | 1489 | 1288 | 1066 | 514 | 174 | 0 | 128 | 178 |
| TMPC | 49 | 41 | 27 | 8 | 0 | 0 | 0 | 0 |
| Total | 11410 | 8555 | 6572 | 3808 | 945 | 50 | 541 | 589 |
VI. PSI PFAM Domain Family Coverage
Calculations in this section updated October 5, 2011
Number of PFAM families for which PSI provided the first structure representative
| Total PFAM Families | Total PFAM Families in PDB | PSI-1 PFAM Families | PSI-2 PFAM Families |
| 12273 | 5564 | 355 | 561 |
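The coverage figures in this table reduce to two ratios. The sketch below assumes that the PSI-1 and PSI-2 counts do not overlap (a family can have only one first structure representative), so they can be added; the variable names are hypothetical.

```python
# Counts copied from the PFAM coverage table above.
TOTAL_PFAM_FAMILIES = 12273
PFAM_FAMILIES_IN_PDB = 5564
PSI1_FIRST = 355
PSI2_FIRST = 561

# Fraction of all PFAM families with at least one structure in the PDB.
pdb_coverage = PFAM_FAMILIES_IN_PDB / TOTAL_PFAM_FAMILIES

# Families whose first structure came from PSI-1 or PSI-2
# (assumes the two counts are disjoint).
psi_first_total = PSI1_FIRST + PSI2_FIRST

# PSI share of the structurally covered families.
psi_share_of_covered = psi_first_total / PFAM_FAMILIES_IN_PDB

if __name__ == "__main__":
    print(f"PFAM families with a PDB structure: {pdb_coverage:.1%}")
    print(f"Covered families first solved by PSI: {psi_share_of_covered:.1%}")
```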