Cancer-related genes; FDA approved drug targets; Predicted intracellular proteins; Transcription factors
All transcripts of all genes have been analyzed regarding the location(s) of the corresponding protein, based on prediction methods for signal peptides and transmembrane regions.
Genes with at least one transcript predicted to encode a secreted protein, according to prediction methods or to UniProt location data, have been further annotated and classified to determine whether the corresponding protein(s) are secreted or are actually retained in intracellular locations or membrane-attached.
Remaining genes, with no transcript predicted to encode a secreted protein, are assigned the prediction-based location(s).
The annotated location overrules the predicted location, so that a gene encoding a predicted secreted protein that has been annotated as intracellular will have intracellular as the final location.
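The precedence rule described above can be sketched in a few lines. This is only an illustration: the function name `final_location` and the location labels are assumptions made for the example, not part of the Human Protein Atlas pipeline.

```python
from typing import Optional, Set


def final_location(predicted: Set[str], annotated: Optional[str] = None) -> Set[str]:
    """Return the final location(s) for a gene (hypothetical helper).

    predicted: prediction-based location(s) across all transcripts.
    annotated: manually annotated location, assumed to exist only for genes
               with at least one transcript predicted as secreted.
    """
    # An annotated location, when present, overrules the prediction.
    if annotated is not None:
        return {annotated}
    # Otherwise the gene keeps its prediction-based location(s).
    return set(predicted)
```

For example, a gene predicted as secreted but annotated as intracellular ends up with intracellular as its only final location.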
The RNA specificity category is based on mRNA expression levels in the consensus dataset, which is calculated from the RNA expression levels in samples from HPA and GTEx. The categories include: tissue enriched, group enriched, tissue enhanced, low tissue specificity and not detected.
Evidence score for genes based on UniProt protein existence (UniProt evidence); a Human Protein Atlas antibody- or RNA-based score (HPA evidence); and evidence based on PeptideAtlas (MS evidence). The available scores are evidence at protein level, evidence at transcript level, no evidence, or not available.
A summary of the overall protein expression pattern across the analyzed normal tissues. The summary is based on knowledge-based annotation.
"Estimation of protein expression could not be performed. View primary data." is shown for genes analyzed with a knowledge-based approach where available RNA-seq and gene/protein characterization data has been evaluated as not sufficient in combination with immunohistochemistry data to yield a reliable estimation of the protein expression profile.
Nuclear expression in several tissues, mostly in a fraction of the cells.
IMMUNOHISTOCHEMISTRY DATA RELIABILITY
Data reliability description
Standardized explanatory sentences with additional information required for full understanding of the protein expression profile, based on knowledge-based and secretome-based annotation.
Antibody staining mainly consistent with RNA expression data.
A reliability score is manually set for all genes and indicates the level of reliability of the analyzed protein expression pattern based on available RNA-seq data, protein/gene characterization data and immunohistochemical data from one or several antibodies with non-overlapping epitopes. The reliability score is based on the 44 normal tissues analyzed, and if there is available data from more than one antibody, the staining patterns of all antibodies are taken into consideration during evaluation.
The reliability score is divided into Enhanced, Supported, Approved, or Uncertain, and is displayed on both Tissue Atlas and Pathology Atlas.
Below is an overview of RNA and protein expression data generated in the Human Protein Atlas project. Analyzed tissues are divided into color-coded groups according to which functional features they have in common. For each group, a list of included tissues is accessed by clicking on group name, group symbol, RNA bar, or protein bar. Subsequent selection of a particular tissue in this list links to the image data page.
Furthest to the right, images of selected tissues give a visual summary of the protein expression profile.
The gray human body provides links to a histology dictionary when clicking on any part of the figure.
RNA expression (TPM)
RNA expression summary shows the consensus data based on normalized expression (nTPM) values from two different sources: internally generated Human Protein Atlas (HPA) RNA-seq data and RNA-seq data from the Genotype-Tissue Expression (GTEx) project. Color-coding is based on tissue groups, each consisting of tissues with functional features in common. To access sample data, click on tissue name or bar.
Each bar represents the highest expression score found in a particular group of tissues. Protein expression scores are based on a best estimate of the "true" protein expression from a knowledge-based annotation, described in more detail under Assays & annotation. For genes where more than one antibody has been used, a collective score is set that displays the estimated true protein expression.
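As a minimal sketch of how one bar per tissue group could be derived, the snippet below takes the highest score among the tissues of each group. The function name `group_bars`, the numeric scores, and the tissue-to-group mapping are hypothetical and chosen only for illustration.

```python
from typing import Dict


def group_bars(scores: Dict[str, float], groups: Dict[str, str]) -> Dict[str, float]:
    """Return the highest score per tissue group (hypothetical helper).

    scores: expression score per tissue.
    groups: tissue name -> tissue group name.
    """
    bars: Dict[str, float] = {}
    for tissue, score in scores.items():
        group = groups[tissue]
        # Keep the maximum score seen so far for this group.
        bars[group] = max(score, bars.get(group, score))
    return bars
```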
Protein expression data is shown for each of the 44 tissues. Color-coding is based on tissue groups, each consisting of tissues with functional features in common. Hovering over a tissue shows the protein score for the analyzed cell types in that tissue. To access image data, click on tissue name or bar. Annotation of protein expression is described in detail in Assays & annotation.
For genes with available protein data for which a knowledge-based annotation gave inconclusive results, no protein expression data is displayed in the protein expression data overview. However, all immunohistochemical images are still available and the annotation data can be found under Primary data.
RNA EXPRESSION OVERVIEW
RNA expression overview shows RNA data from two different sources: internally generated Human Protein Atlas (HPA) RNA-seq data and RNA-seq data from the Genotype-Tissue Expression (GTEx) project, as well as the consensus dataset, which is based on a combination of both sources. Color-coding is based on tissue groups, each consisting of tissues with functional features in common. To access sample data, click on tissue name or bar.
The HPA RNA-seq tissue data is reported as nTPM (normalized protein-coding transcripts per million), corresponding to mean values of the different individual samples from each tissue. Color-coding is based on tissue groups, each consisting of tissues with functional features in common. To access sample data, click on tissue name or bar.
The RNA-seq tissue data generated by the Genotype-Tissue Expression (GTEx) project is reported as nTPM (normalized protein-coding transcripts per million), corresponding to mean values of the different individual samples from each tissue. Color-coding is based on tissue groups, each consisting of tissues with functional features in common. To access sample data, click on tissue name or bar.
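The per-tissue value reported for the HPA and GTEx datasets is a mean over individual samples. A minimal sketch of that averaging step is shown below. It assumes the sample values are already normalized (the normalization itself is not covered here), and the function name `tissue_means` is hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def tissue_means(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average already-normalized sample values per tissue (hypothetical helper).

    samples: (tissue name, nTPM value of one individual sample) pairs.
    """
    by_tissue: Dict[str, List[float]] = defaultdict(list)
    for tissue, ntpm in samples:
        by_tissue[tissue].append(ntpm)
    # One mean value per tissue, as described for the tissue-level data.
    return {tissue: mean(values) for tissue, values in by_tissue.items()}
```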
The tissue data for RNA expression obtained through Cap Analysis of Gene Expression (CAGE) generated by the FANTOM5 project are reported as Scaled Tags Per Million. Color-coding is based on tissue groups, each consisting of tissues with functional features in common. To access sample data, click on tissue name or bar.
Gene information from Ensembl and Entrez, as well as links to available gene identifiers are displayed here. Information was retrieved from Ensembl if not indicated otherwise.
JUN (HGNC Symbol)
Jun proto-oncogene, AP-1 transcription factor subunit (HGNC Symbol)
Entrez gene summary
This gene is the putative transforming gene of avian sarcoma virus 17. It encodes a protein which is highly similar to the viral protein, and which interacts directly with specific target DNA sequences to regulate gene expression. This gene is intronless and is mapped to 1p32-p31, a chromosomal region involved in both translocations and deletions in human malignancies. [provided by RefSeq, Jul 2008]
The protein browser displays the antigen location on the target protein(s) and the features of the target protein. The tabs at the top of the protein view section can be used to switch between the different splice variants to which an antigen has been mapped.
At the top of the view, the position of the antigen (identified by the corresponding HPA identifier) is shown as a green bar. A yellow triangle on the bar indicates less than 100% sequence identity to the protein target.
Below the antigens, the maximum percent sequence identity of the protein to all other proteins from other human genes is displayed, using a sliding window of 10 aa residues (HsID 10) or 50 aa residues (HsID 50). The region with the lowest possible identity is always selected for antigen design, with a maximum identity of 60% allowed for designing a single-target antigen.
The curve in blue displays the predicted antigenicity, i.e. the tendency for different regions of the protein to generate an immune response, with peak regions being predicted to be more antigenic. The curve shows average values based on a sliding window approach using an in-house propensity scale.
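A sliding-window average over per-residue propensity values can be sketched as follows. The in-house propensity scale is not given here, so `TOY_SCALE` below is a made-up placeholder, and the function name `antigenicity_profile` and the window size are likewise assumptions for illustration.

```python
from typing import Dict, List

# Hypothetical propensity values for illustration only; not the in-house scale.
TOY_SCALE: Dict[str, float] = {"A": 0.0, "K": 1.0, "D": 1.0, "L": -1.0}


def antigenicity_profile(
    seq: str, window: int, scale: Dict[str, float] = TOY_SCALE
) -> List[float]:
    """Return the mean propensity of every full window along the sequence."""
    if window < 1 or window > len(seq):
        return []
    # Residues missing from the scale are treated as neutral (0.0) in this sketch.
    values = [scale.get(aa, 0.0) for aa in seq]
    total = sum(values[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        profile.append(total / window)
    return profile
```

Peaks in the returned profile would correspond to regions predicted to be more antigenic under the chosen scale.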
If a signal peptide is predicted by a majority of the signal peptide predictors SPOCTOPUS, SignalP 4.0, and Phobius (turquoise) and/or transmembrane regions (orange) are predicted by MDM, these are displayed.
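The majority rule over the three signal peptide predictors can be expressed as a simple vote. The sketch below assumes each predictor's result has already been reduced to a yes/no call; the function name `signal_peptide_predicted` is hypothetical.

```python
def signal_peptide_predicted(spoctopus: bool, signalp: bool, phobius: bool) -> bool:
    """Return True when at least two of the three predictors call a signal peptide."""
    # Booleans sum as 0/1, so this counts the positive calls.
    return sum([spoctopus, signalp, phobius]) >= 2
```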
Low complexity regions are shown in yellow and InterPro regions in green. Common (purple) and unique (grey) regions between different splice variants of the gene are also displayed, and at the bottom of the protein view is the protein scale.
The protein information section displays alternative protein-coding transcripts (splice variants) encoded by this gene according to the Ensembl database.
The ENSP identifier links to the Ensembl website protein summary, while the ENST identifier links to the Ensembl website transcript summary for the selected splice variant. The data in the UniProt column can be expanded to show links to all matching UniProt identifiers for this protein.
The protein classes assigned to this protein are shown if expanding the data in the protein class column. Parent protein classes are in bold font and subclasses are listed under the parent class.
The Gene Ontology terms assigned to this protein are listed if expanding the Gene ontology column. The length of the protein (amino acid residues according to Ensembl), molecular mass (kDalton), predicted signal peptide (according to a majority of the signal peptide predictors SPOCTOPUS, SignalP 4.0, and Phobius) and the number of predicted transmembrane region(s) (according to MDM) are also reported.
Predicted intracellular proteins; Transcription factors; Basic domains; Cancer-related genes; Candidate cancer biomarkers; COSMIC somatic mutations in cancer genes; COSMIC Amplifications; COSMIC Somatic Mutations; FDA approved drug targets; Small molecule drugs; Protein evidence (Kim et al 2014); Protein evidence (Ezkurdia et al 2014)
GO:0000228 [nuclear chromosome] GO:0000790 [nuclear chromatin] GO:0000978 [RNA polymerase II core promoter proximal region sequence-specific DNA binding] GO:0000980 [RNA polymerase II distal enhancer sequence-specific DNA binding] GO:0000981 [RNA polymerase II transcription factor activity, sequence-specific DNA binding] GO:0000982 [transcription factor activity, RNA polymerase II core promoter proximal region sequence-specific binding] GO:0001077 [transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding] GO:0001102 [RNA polymerase II activating transcription factor binding] GO:0001190 [transcriptional activator activity, RNA polymerase II transcription factor binding] GO:0001525 [angiogenesis] GO:0001774 [microglial cell activation] GO:0001836 [release of cytochrome c from mitochondria] GO:0001889 [liver development] GO:0001938 [positive regulation of endothelial cell proliferation] GO:0003151 [outflow tract morphogenesis] GO:0003677 [DNA binding] GO:0003682 [chromatin binding] GO:0003690 [double-stranded DNA binding] GO:0003700 [transcription factor activity, sequence-specific DNA binding] GO:0003705 [transcription factor activity, RNA polymerase II distal enhancer sequence-specific binding] GO:0003713 [transcription coactivator activity] GO:0003723 [RNA binding] GO:0005096 [GTPase activator activity] GO:0005515 [protein binding] GO:0005634 [nucleus] GO:0005654 [nucleoplasm] GO:0005667 [transcription factor complex] GO:0005719 [nuclear euchromatin] GO:0005829 [cytosol] GO:0006351 [transcription, DNA-templated] GO:0006355 [regulation of transcription, DNA-templated] GO:0006366 [transcription from RNA polymerase II promoter] GO:0007179 [transforming growth factor beta receptor signaling pathway] GO:0007184 [SMAD protein import into nucleus] GO:0007265 [Ras protein signal transduction] GO:0007568 [aging] GO:0007612 [learning] GO:0007623 [circadian rhythm] GO:0008134 [transcription factor binding]
GO:0008284 [positive regulation of cell proliferation] GO:0008285 [negative regulation of cell proliferation] GO:0009314 [response to radiation] GO:0009612 [response to mechanical stimulus] GO:0009987 [cellular process] GO:0010033 [response to organic substance] GO:0010634 [positive regulation of epithelial cell migration] GO:0010941 [regulation of cell death] GO:0014070 [response to organic cyclic compound] GO:0017053 [transcriptional repressor complex] GO:0019899 [enzyme binding] GO:0030224 [monocyte differentiation] GO:0031103 [axon regeneration] GO:0031953 [negative regulation of protein autophosphorylation] GO:0032496 [response to lipopolysaccharide] GO:0032870 [cellular response to hormone stimulus] GO:0033613 [activating transcription factor binding] GO:0034097 [response to cytokine] GO:0035026 [leading edge cell differentiation] GO:0035497 [cAMP response element binding] GO:0035994 [response to muscle stretch] GO:0038095 [Fc-epsilon receptor signaling pathway] GO:0042127 [regulation of cell proliferation] GO:0042493 [response to drug] GO:0042542 [response to hydrogen peroxide] GO:0042802 [identical protein binding] GO:0042803 [protein homodimerization activity] GO:0043065 [positive regulation of apoptotic process] GO:0043066 [negative regulation of apoptotic process] GO:0043392 [negative regulation of DNA binding] GO:0043524 [negative regulation of neuron apoptotic process] GO:0043525 [positive regulation of neuron apoptotic process] GO:0043547 [positive regulation of GTPase activity] GO:0043565 [sequence-specific DNA binding] GO:0043922 [negative regulation by host of viral transcription] GO:0043923 [positive regulation by host of viral transcription] GO:0044212 [transcription regulatory region DNA binding] GO:0045597 [positive regulation of cell differentiation] GO:0045657 [positive regulation of monocyte differentiation] GO:0045740 [positive regulation of DNA replication] GO:0045892 [negative regulation of transcription, DNA-templated]
GO:0045893 [positive regulation of transcription, DNA-templated] GO:0045944 [positive regulation of transcription from RNA polymerase II promoter] GO:0046982 [protein heterodimerization activity] GO:0048146 [positive regulation of fibroblast proliferation] GO:0048661 [positive regulation of smooth muscle cell proliferation] GO:0051090 [regulation of sequence-specific DNA binding transcription factor activity] GO:0051365 [cellular response to potassium ion starvation] GO:0051591 [response to cAMP] GO:0051726 [regulation of cell cycle] GO:0051899 [membrane depolarization] GO:0060395 [SMAD protein signal transduction] GO:0061029 [eyelid development in camera-type eye] GO:0070374 [positive regulation of ERK1 and ERK2 cascade] GO:0070412 [R-SMAD binding] GO:0071277 [cellular response to calcium ion] GO:0071837 [HMG box domain binding] GO:1902895 [positive regulation of pri-miRNA transcription from RNA polymerase II promoter] GO:1990441 [negative regulation of transcription from RNA polymerase II promoter in response to endoplasmic reticulum stress] GO:2000144 [positive regulation of DNA-templated transcription, initiation]