The organelle proteome

Spatial partitioning of biological processes is fundamental to life and enables multiple processes to occur in parallel. An organelle is a submodule of the eukaryotic cell with a specialized function. The name "organelle" stems from the analogy between the role of organelles in the cell and the role of organs in the human body. The precise definition of organelles varies, and these submodules are sometimes also referred to as compartments or structures of the cell. A distinction is often made between membrane-bound and non-membrane-bound organelles. Membrane-bound organelles, such as the nucleus and the Golgi apparatus, create a physical boundary that separates the intra- and extra-organelle space. In contrast, non-membrane-bound organelles, such as the cytoskeleton and nucleoli, provide a specialized surface or region. Membranous or not, this partitioning creates a specific environment at the site of the organelle, where the concentration of different molecules can be tailored to fit the purpose of the organelle.

At the cellular level, proteins catalyze, conduct and control most processes at specific times and locations. The subcellular localization of a protein helps to define its function, as different organelles offer distinct environments with different physiological conditions and interaction partners. Consequently, mislocalization of proteins has often been associated with cellular dysfunction and disease (Kau TR et al, 2004; Laurila K et al, 2009; Park S et al, 2011). Knowledge of the spatial distribution of proteins at a subcellular level is thus essential for understanding protein function, interactions and cellular mechanisms; studying how cells generate and maintain their spatial organization is central to understanding the mechanisms of the living cell.

Within the Cell Atlas, the subcellular localization of 12073 proteins has been mapped at a single-cell level to 33 subcellular structures, enabling the definition of 13 major organelle proteomes. The localization was performed in a panel of 26 human cell lines using transcriptomics data as a starting point. The analysis further reveals that approximately half of the proteins localize to multiple compartments and identifies many proteins with single-cell variation in protein abundance or spatial distribution. The expression pattern and spatial distribution of human proteins in all major cellular organelles can be explored in these interactive knowledge sections, including numerous catalogues of proteins with specific and similar patterns of expression, as well as examples of detailed images illustrating the subcellular spatial distribution patterns.

Subcellular localization of proteins

Several approaches for systematic analysis of protein localization have been described. Quantitative mass-spectrometric readouts allow identification of proteins with similar distribution profiles across fractionation gradients (Park S et al, 2011; Christoforou A et al, 2016; Itzhak DN et al, 2016) or of proteins labelled by enzyme-mediated proximity labelling in cells (Itzhak DN et al, 2016; Roux KJ et al, 2012; Lee SY et al, 2016). In contrast, imaging-based approaches enable the exploration of the subcellular distribution of proteins in situ in single cells and have the advantage of also effectively identifying single-cell variability and multi-organelle localization. Imaging-based approaches can be performed using tagged proteins (Huh WK et al, 2003; Simpson JC et al, 2000; Stadler C et al, 2013) or affinity reagents, as is done here in the Human Protein Atlas.

In the Cell Atlas, we employ an immunofluorescence (IF) based approach combined with confocal microscopy to enable high-resolution investigation of the spatial distribution of each protein (Thul PJ et al, 2017; Stadler C et al, 2013; Barbe L et al, 2008; Stadler C et al, 2010; Fagerberg L et al, 2011). With a diffraction-limited resolution of about 200 nm, an immunofluorescence image from the Cell Atlas gives detailed insight into the cellular organization. The spatial distribution of the protein is investigated using indirect IF in the U-2 OS cell line and up to two additional cell lines selected based on RNA-seq data. The protein of interest is visualized in green, while reference markers for microtubules (red), endoplasmic reticulum (yellow) and nucleus (blue) outline the cell. From small dots such as nuclear bodies to larger structures such as the nucleus, the distinct patterns in the images, together with the reference markers, make it possible to precisely determine the spatial distribution of a protein within the cell. This enables the assignment of the protein's location to one or more of the 33 structures and substructures currently annotated, as exemplified in Figure 1.
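The quoted resolution of about 200 nm is in line with the Abbe diffraction limit. As a rough worked example, assume green emission at a wavelength of about 520 nm and an oil-immersion objective with a numerical aperture of about 1.4. Both values are illustrative assumptions and are not stated in the text above.

```latex
d = \frac{\lambda}{2\,\mathrm{NA}} \approx \frac{520\ \mathrm{nm}}{2 \times 1.4} \approx 186\ \mathrm{nm}
```

Under these assumptions, the lateral resolution is somewhat below 200 nm, which matches the order of magnitude given for the Cell Atlas images.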

[Figure 1 image panels: Nucleus, Nucleoplasm, Nuclear speckles, Nuclear bodies, Nucleoli, Fibrillar center, Rim of nucleoli, Nuclear membrane, Cytosol, Cytoplasmic bodies, Rods and rings, Lipid droplets, Aggresome, Mitochondria, Microtubules, Microtubule end, MTOC, Centrosome, Mitotic spindle, Cytokinetic bridge, Midbody, Midbody ring, Intermediate filaments, Actin filaments, Focal adhesions, ER, Golgi apparatus, Vesicles, Plasma membrane, Cell junctions]

Figure 1. Example of confocal immunofluorescence images of different proteins (green) localized to each of the subcellular organelles and substructures currently annotated in the Cell Atlas in a representative set of cell lines. Microtubules are marked with an anti-tubulin antibody (red) and the nucleus is counterstained with DAPI (blue). The side of an image represents 64 μm. For more example images and details describing all the 33 patterns annotated in the Cell Atlas, see the Cell Dictionary.

Protein distribution in the human cell

Figure 2 shows the organelle distribution of all annotations for the 12073 proteins localized to at least one structure or substructure. The plot is sorted by meta-compartment: cytoplasm, nucleus, and secretory machinery. Most proteins are found in the nucleus, followed by the cytosol and vesicles, which consist of transport vesicles as well as small membrane-bound organelles like endosomes or peroxisomes. 52% (n=6282) of the proteins were detected at more than one location (multilocalizing proteins), and 15% (n=1861) displayed single-cell variation in expression level or spatial distribution. The organelle proteomes of the human cell can be explored in detail in the Cell Atlas.
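The percentages quoted above follow directly from the reported counts. A minimal sketch of the arithmetic, in which the constant and function names are illustrative and only the three counts come from the text:

```python
# Counts quoted in the text above; the names are illustrative.
TOTAL_LOCALIZED = 12073        # proteins localized to at least one structure
MULTILOCALIZING = 6282         # proteins detected at more than one location
SINGLE_CELL_VARIATION = 1861   # proteins showing single-cell variation


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


print(f"multilocalizing: {percent(MULTILOCALIZING, TOTAL_LOCALIZED):.1f}%")
print(f"single-cell variation: {percent(SINGLE_CELL_VARIATION, TOTAL_LOCALIZED):.1f}%")
```

Rounded to whole numbers, these fractions correspond to the 52% and 15% given in the text.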

Figure 2. Bar plot showing the distribution of proteins detected in every organelle, structure and substructure annotated in the Cell Atlas.

Validation of antibodies and location data for the Cell Atlas

Recently, the quality and use of antibodies in research have been frequently debated (Baker M. 2015). As antibody off-target binding can cause false positive results, we have manually annotated all results with regard to the reliability of the staining. In the Cell Atlas, a reliability score is provided for every annotated location on a four-tiered scale: Enhanced, Supported, Approved, and Uncertain, as described in detail in the assay & annotation section. Enhanced locations are obtained through antibody validation according to one of the validation "pillars" proposed by an international working group (Uhlen M et al, 2016): (i) genetic methods using siRNA silencing (Stadler C et al, 2012) or CRISPR/Cas9 knock-out, (ii) expression of a fluorescent protein-tagged protein at endogenous levels (Skogs M et al, 2017) or (iii) independent antibodies targeting different epitopes (Stadler C et al, 2010). A supported location is defined by agreement with external experimental data (UniProt database). An approved location score indicates that there is no external experimental information available to confirm the observed location. An uncertain location shows contradictory results compared to complementary information, such as literature or transcriptomics data. Uncertain locations are also shown, since it cannot be ruled out that the data are correct, and further experiments are needed to establish the reliability of the antibody staining. The distribution of reliability scores for the localized proteins is shown in Figure 3. Approximately 46% (n=5503) of the protein localizations provided are enhanced or supported. Table 1 details the organelle distribution of all localized proteins and the distribution of reliability scores for each individual organelle.
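The four reliability scores can be read as a simple decision rule. The sketch below is only an illustration of that reading: the annotation in the Cell Atlas is performed manually, the class and field names are hypothetical, and the order of precedence (pillar validation first, then external evidence) is an assumption rather than something stated above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Reliability(Enum):
    ENHANCED = "Enhanced"
    SUPPORTED = "Supported"
    APPROVED = "Approved"
    UNCERTAIN = "Uncertain"


@dataclass
class LocationEvidence:
    """Hypothetical summary of the evidence for one annotated location."""
    # True if the antibody passed one of the validation pillars
    # (genetic methods, tagged protein, or independent antibodies).
    pillar_validated: bool
    # None: no external or complementary data available;
    # True: the data agree with the observed location; False: they contradict it.
    external_agreement: Optional[bool] = None


def score_location(evidence: LocationEvidence) -> Reliability:
    """Map the evidence to a reliability score (assumed precedence)."""
    if evidence.pillar_validated:
        return Reliability.ENHANCED
    if evidence.external_agreement is None:
        return Reliability.APPROVED
    if evidence.external_agreement:
        return Reliability.SUPPORTED
    return Reliability.UNCERTAIN


print(score_location(LocationEvidence(pillar_validated=False, external_agreement=True)).value)
```

Collapsing external evidence into a single three-state field is a simplification; in practice, literature, UniProt and transcriptomics data may each point in different directions and require manual judgment.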

Figure 3. Pie chart showing the reliability of the localized proteins. Each slice represents the number of proteins with one of the four reliability scores: Enhanced, Supported, Approved, or Uncertain.

Table 1. Table showing the number of proteins localized to every organelle, structure, and substructure in the Cell Atlas, along with the distribution of reliability scores.

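| Location | Proteins (n) | Proteins (%) | Reliability: Enhanced | Reliability: Supported | Reliability: Approved | Reliability: Uncertain |
|---|---|---|---|---|---|---|
| Intermediate filaments | 186 | 1.5 | 13 | 23 | 127 | 23 |
| Actin filaments | 230 | 1.9 | 18 | 44 | 149 | 19 |
| Focal adhesion sites | 134 | 1.1 | 18 | 26 | 81 | 9 |
| Microtubules | 252 | 2.1 | 11 | 47 | 167 | 27 |
| Microtubule ends | 6 | 0 | 0 | 4 | 2 | 0 |
| Cytokinetic bridge | 107 | 0.9 | 8 | 17 | 75 | 7 |
| Midbody | 40 | 0.3 | 1 | 8 | 28 | 3 |
| Midbody ring | 15 | 0.1 | 0 | 1 | 12 | 2 |
| Cleavage furrow | 3 | 0 | 0 | 2 | 1 | 0 |
| Mitotic spindle | 35 | 0.3 | 5 | 12 | 14 | 4 |
| Microtubule organizing center | 145 | 1.2 | 9 | 41 | 79 | 16 |
| Centrosome | 338 | 2.8 | 18 | 75 | 214 | 31 |
| Mitochondria | 1074 | 8.9 | 163 | 330 | 517 | 64 |
| Aggresome | 18 | 0.1 | 0 | 0 | 16 | 2 |
| Cytosol | 4266 | 35.3 | 393 | 1190 | 2358 | 325 |
| Cytoplasmic bodies | 60 | 0.5 | 3 | 21 | 31 | 5 |
| Rods & Rings | 19 | 0.2 | 1 | 2 | 16 | 0 |
| Nucleus | 1930 | 16 | 178 | 563 | 1078 | 111 |
| Nucleoplasm | 3787 | 31.4 | 582 | 1083 | 1886 | 236 |
| Nuclear membrane | 276 | 2.3 | 20 | 54 | 186 | 16 |
| Nucleoli | 1031 | 8.5 | 107 | 245 | 605 | 74 |
| Nucleoli fibrillar center | 267 | 2.2 | 13 | 66 | 172 | 16 |
| Nuclear speckles | 452 | 3.7 | 61 | 126 | 233 | 32 |
| Nuclear bodies | 486 | 4 | 39 | 168 | 254 | 25 |
| Endoplasmic reticulum | 443 | 3.7 | 64 | 151 | 216 | 12 |
| Golgi apparatus | 984 | 8.2 | 75 | 208 | 623 | 78 |
| Vesicles | 1815 | 15 | 93 | 355 | 1215 | 152 |
| Peroxisomes | 21 | 0.2 | 10 | 5 | 5 | 1 |
| Endosomes | 16 | 0.1 | 10 | 4 | 2 | 0 |
| Lysosomes | 16 | 0.1 | 6 | 10 | 0 | 0 |
| Lipid droplets | 37 | 0.3 | 5 | 8 | 20 | 4 |
| Plasma membrane | 1536 | 12.7 | 119 | 435 | 851 | 131 |
| Cell Junctions | 297 | 2.5 | 26 | 93 | 159 | 19 |
| Number of proteins | 12073 | 100 | 1639 | 4022 | 6970 | 962 |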
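For each individual location in Table 1, the four reliability counts add up to the number of proteins at that location, and the percentage column is the share of all 12073 localized proteins. The short check below illustrates this for four rows as read from the table; the variable and function names are illustrative. The final "Number of proteins" row is left out because its reliability counts sum to more than 12073, presumably because a protein with several locations can carry different scores.

```python
TOTAL_PROTEINS = 12073

# (location, proteins, enhanced, supported, approved, uncertain), as read from Table 1
ROWS = [
    ("Intermediate filaments", 186, 13, 23, 127, 23),
    ("Mitochondria", 1074, 163, 330, 517, 64),
    ("Cytosol", 4266, 393, 1190, 2358, 325),
    ("Nucleoplasm", 3787, 582, 1083, 1886, 236),
]


def scores_match_total(row: tuple) -> bool:
    """Check that the four reliability counts sum to the protein count."""
    _, proteins, *scores = row
    return sum(scores) == proteins


def share_of_total(proteins: int) -> float:
    """Share of all localized proteins, in percent, rounded to one decimal."""
    return round(100 * proteins / TOTAL_PROTEINS, 1)


for row in ROWS:
    print(row[0], scores_match_total(row), share_of_total(row[1]))
```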

Relevant links and publications

Baker M. 2015. Reproducibility crisis: Blame it on the antibodies. Nature.
PubMed: 25993940 DOI: 10.1038/521274a

Barbe L et al, 2008. Toward a confocal subcellular atlas of the human proteome. Mol Cell Proteomics.
PubMed: 18029348 DOI: 10.1074/mcp.M700325-MCP200

Christoforou A et al, 2016. A draft map of the mouse pluripotent stem cell spatial proteome. Nat Commun.
PubMed: 26754106 DOI: 10.1038/ncomms9992

Fagerberg L et al, 2011. Mapping the subcellular protein distribution in three human cell lines. J Proteome Res.
PubMed: 21675716 DOI: 10.1021/pr200379a

Foster LJ et al, 2006. A mammalian organelle map by protein correlation profiling. Cell.
PubMed: 16615899 DOI: 10.1016/j.cell.2006.03.022

Huh WK et al, 2003. Global analysis of protein localization in budding yeast. Nature.
PubMed: 14562095 DOI: 10.1038/nature02026

Itzhak DN et al, 2016. Global, quantitative and dynamic mapping of protein subcellular localization. Elife.
PubMed: 27278775 DOI: 10.7554/eLife.16950

Kau TR et al, 2004. Nuclear transport and cancer: from mechanism to intervention. Nat Rev Cancer.
PubMed: 14732865 DOI: 10.1038/nrc1274

Laurila K et al, 2009. Prediction of disease-related mutations affecting protein localization. BMC Genomics.
PubMed: 19309509 DOI: 10.1186/1471-2164-10-122

Lee SY et al, 2016. APEX Fingerprinting Reveals the Subcellular Localization of Proteins of Interest. Cell Rep.
PubMed: 27184847 DOI: 10.1016/j.celrep.2016.04.064

Park S et al, 2011. Protein localization as a principal feature of the etiology and comorbidity of genetic diseases. Mol Syst Biol.
PubMed: 21613983 DOI: 10.1038/msb.2011.29

Roux KJ et al, 2012. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. J Cell Biol.
PubMed: 22412018 DOI: 10.1083/jcb.201112098

Simpson JC et al, 2000. Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing. EMBO Rep.
PubMed: 11256614 DOI: 10.1093/embo-reports/kvd058

Skogs M et al, 2017. Antibody Validation in Bioimaging Applications Based on Endogenous Expression of Tagged Proteins. J Proteome Res.
PubMed: 27723985 DOI: 10.1021/acs.jproteome.6b00821

Stadler C et al, 2012. Systematic validation of antibody binding and protein subcellular localization using siRNA and confocal microscopy. J Proteomics.
PubMed: 22361696 DOI: 10.1016/j.jprot.2012.01.030

Stadler C et al, 2013. Immunofluorescence and fluorescent-protein tagging show high correlation for protein localization in mammalian cells. Nat Methods.
PubMed: 23435261 DOI: 10.1038/nmeth.2377

Stadler C et al, 2010. A single fixation protocol for proteome-wide immunofluorescence localization studies. J Proteomics.
PubMed: 19896565 DOI: 10.1016/j.jprot.2009.10.012

Thul PJ et al, 2017. A subcellular map of the human proteome. Science.
PubMed: 28495876 DOI: 10.1126/science.aal3321

Uhlen M et al, 2016. A proposal for validation of antibodies. Nat Methods.
PubMed: 27595404 DOI: 10.1038/nmeth.3995