Human Proteome Project
Bioinformatics Seminar Series
Dr. John Bergeron
Abstract: The objective of a human proteome project is to quantitatively characterize a representative protein for each protein-coding gene in its major organ and intracellular site of expression, in the context of its major protein partners. The concept of a single representative protein for each protein-coding gene enables a gene-centric approach to completing the human proteome. A recently annotated human genome suggests ca. 20,500 protein-coding genes, of which ca. 8,000 have no direct evidence of an expressed protein and a further 5,000 for which the evidence may be weak. Hence a human proteome project will provide the missing evidence of protein expression for predicted protein-coding genes. Efforts to define the proteome using antibodies monospecific to a representative protein for each protein-coding gene are ongoing, with almost a third of the predicted protein-coding genes mapped in this way via the Human Protein Atlas. Current mass spectrometry efforts have enabled community standards to be implemented via test samples. The extension to label-free methods to quantify proteins by tandem mass spectrometry, together with quantitation using a serial dilution of heavy-isotope-labelled peptides diagnostic of each representative protein, provides the framework to define the abundance of each representative protein in its major site of expression. Using tagged genes expressed in both mice and human cell lines, a protein interaction map may be defined for each representative protein and confirmed by co-immunoprecipitation with a polyclonal antibody resource, followed by tandem mass spectrometry to characterize protein partners. Computational biology then merges the data from antibody-directed mapping of the proteome with the quantitative estimates of protein abundance and the protein partners, both deduced by tandem mass spectrometry, to define the resource of a human proteome.