Downloadable data


  Programmatic access
If you want to access a subset of the data programmatically, more information can be found on the help page.
 
  Search results
The data files presented here include data available in the Human Protein Atlas version 18.1. A subset of this data, covering the genes in the current search result, can also be downloaded from the Search page in XML, RDF or TSV format.
 
  Single entry
Data for a single entry can be accessed in XML, RDF (trig) or TSV format by appending the corresponding format extension to the Ensembl ID, as in the following URLs:
http://www.proteinatlas.org/ENSG00000134057.xml
http://www.proteinatlas.org/ENSG00000134057.trig
http://www.proteinatlas.org/ENSG00000134057.tsv
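
The URL pattern above can be sketched in Python as follows. This is a minimal illustration, not an official client: the names `BASE_URL`, `FORMATS`, `entry_url` and `entry_urls` are hypothetical, and the code only builds the URLs without fetching them.

```python
BASE_URL = "http://www.proteinatlas.org"
FORMATS = ("xml", "trig", "tsv")  # the three extensions listed above


def entry_url(ensembl_id, fmt):
    """Return the download URL for one entry in the given format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE_URL}/{ensembl_id}.{fmt}"


def entry_urls(ensembl_id):
    """Return a mapping from format extension to download URL."""
    return {fmt: entry_url(ensembl_id, fmt) for fmt in FORMATS}


for url in entry_urls("ENSG00000134057").values():
    print(url)
```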

 
  Archived data
From version 13 of the Human Protein Atlas onward, each version of the site can be reached using the URL structure "http://vXX.proteinatlas.org", where XX is the version number. For example, version 13 of the Human Protein Atlas has the URL http://v13.proteinatlas.org.
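
As a small sketch of this URL structure, the hypothetical helper below builds the address of an archived version. It assumes whole version numbers only; how minor versions such as 18.1 map onto the pattern is not stated here, so they are not handled.

```python
FIRST_ARCHIVED_VERSION = 13  # earliest version reachable through this URL structure


def archived_site_url(version):
    """Return the base URL of an archived Human Protein Atlas version."""
    if version < FIRST_ARCHIVED_VERSION:
        raise ValueError(f"versions before {FIRST_ARCHIVED_VERSION} are not covered by this pattern")
    return f"http://v{version}.proteinatlas.org"


print(archived_site_url(13))
```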

 
1 Normal tissue data
Expression profiles for proteins in human tissues based on immunohistochemistry using tissue microarrays. The tab-separated file includes Ensembl gene identifier ("Gene"), tissue name ("Tissue"), annotated cell type ("Cell type"), expression value ("Level"), and the gene reliability of the expression value ("Reliability"). The data is based on The Human Protein Atlas version 18.1 and Ensembl version 88.38.

normal_tissue.tsv.zip
TSV-file, 4.4 MB
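
One way to read the columns described above is sketched below with Python's standard `csv` module. The function name `levels_by_gene` is hypothetical, and the sample row contains invented placeholder values purely for illustration; real values come from the unzipped file.

```python
import csv
from collections import defaultdict


def levels_by_gene(lines):
    """Group (tissue, cell type, level, reliability) tuples by Ensembl gene ID."""
    reader = csv.DictReader(lines, delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["Gene"]].append(
            (row["Tissue"], row["Cell type"], row["Level"], row["Reliability"])
        )
    return dict(grouped)


# Invented placeholder rows, for illustration only (not real annotations).
sample = [
    "Gene\tTissue\tCell type\tLevel\tReliability",
    "ENSG00000134057\texample tissue\texample cell type\texample level\texample reliability",
]
print(levels_by_gene(sample))
```

With the real data, an open text file handle for the unzipped normal_tissue.tsv can be passed in place of `sample`.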
 
2 Pathology data
Staining profiles for proteins in human tumor tissue based on immunohistochemistry using tissue microarrays, and log-rank p values for Kaplan-Meier analysis of the correlation between mRNA expression level and patient survival. The tab-separated file includes Ensembl gene identifier ("Gene"), gene name ("Gene name"), tumor name ("Cancer"), the number of patients annotated for different staining levels ("High", "Medium", "Low" and "Not detected") and log-rank p values for patient survival and mRNA correlation ("prognostic - favourable", "unprognostic - favourable", "prognostic - unfavourable", "unprognostic - unfavourable"). The data is based on The Human Protein Atlas version 18.1 and Ensembl version 88.38.

pathology.tsv.zip
TSV-file, 3.4 MB
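
The patient counts per staining level can be turned into proportions, as in the sketch below. It assumes that the four staining columns hold integer counts and that an empty cell means zero patients; neither detail is stated above. The name `staining_fractions` and the example counts are invented for illustration.

```python
STAINING_COLUMNS = ("High", "Medium", "Low", "Not detected")


def staining_fractions(row):
    """Return the share of annotated patients at each staining level.

    Returns None when no patients are annotated for the row.
    """
    # Assumption: counts are integers and an empty cell means zero patients.
    counts = {col: int(row[col] or 0) for col in STAINING_COLUMNS}
    total = sum(counts.values())
    if total == 0:
        return None
    return {col: n / total for col, n in counts.items()}


# Invented counts, for illustration only.
example_row = {"High": "2", "Medium": "4", "Low": "2", "Not detected": "0"}
print(staining_fractions(example_row))
```

Rows in this shape are what `csv.DictReader(handle, delimiter="\t")` yields when reading the unzipped file.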
 
3 Subcellular location data
Subcellular localization of proteins based on immunofluorescently stained cells. The tab-separated file includes the following columns: Ensembl gene identifier ("Gene"), name of gene ("Gene name"), gene reliability score ("Reliability"), enhanced locations ("Enhanced"), supported locations ("Supported"), approved locations ("Approved"), uncertain locations ("Uncertain"), locations with single-cell variation in intensity ("Single-cell variation intensity"), locations with spatial single-cell variation ("Single-cell variation spatial"), locations with observed cell cycle dependency (type can be one or more of biological definition, custom data or correlation) ("Cell cycle dependency"), and Gene Ontology Cellular Component term identifier ("GO id").
The data is based on The Human Protein Atlas version 18.1 and Ensembl version 88.38.

subcellular_location.tsv.zip
TSV-file, 161.8 KB
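
Because the four evidence columns can each hold several locations, it may help to split them into lists. The sketch below assumes that multiple locations within a cell are separated by a semicolon, which is not specified above; the separator is therefore a parameter. The function name and the placeholder location names are hypothetical.

```python
LOCATION_COLUMNS = ("Enhanced", "Supported", "Approved", "Uncertain")


def locations_by_evidence(row, sep=";"):
    """Split each evidence column of one row into a list of location names."""
    result = {}
    for col in LOCATION_COLUMNS:
        cell = row.get(col) or ""
        # Assumption: `sep` separates multiple locations within one cell.
        result[col] = [part.strip() for part in cell.split(sep) if part.strip()]
    return result


# Placeholder location names, for illustration only.
example_row = {"Enhanced": "Location A;Location B", "Supported": "", "Approved": "Location C", "Uncertain": ""}
print(locations_by_evidence(example_row))
```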
 
4 RNA gene data
RNA levels in 64 cell lines and 37 tissues based on RNA-seq. The tab-separated file includes Ensembl gene identifier ("Gene"), analysed sample ("Sample") and the expression value in transcripts per million ("Value" and "Unit"). The data is based on The Human Protein Atlas version 18.1 and Ensembl version 88.38.
RNA sequencing data for human tissue
RNA sequencing data for human cell lines

rna_tissue.tsv.zip
TSV-file, 3.7 MB
rna_celline.tsv.zip
TSV-file, 6.2 MB
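
These files store one row per gene and sample. A minimal sketch for turning that long format into a per-gene lookup is shown below; `expression_by_sample` is a hypothetical name, the "Value" column is assumed to parse as a number, and the sample names and values are invented for illustration.

```python
import csv
from collections import defaultdict


def expression_by_sample(lines):
    """Build {gene: {sample: value}} from long-format rows."""
    reader = csv.DictReader(lines, delimiter="\t")
    table = defaultdict(dict)
    for row in reader:
        # Assumption: "Value" is numeric; "Unit" is ignored in this sketch.
        table[row["Gene"]][row["Sample"]] = float(row["Value"])
    return dict(table)


# Invented sample names and values, for illustration only.
sample = [
    "Gene\tSample\tValue\tUnit",
    "ENSG00000134057\tsample A\t1.5\tTPM",
    "ENSG00000134057\tsample B\t0.0\tTPM",
]
print(expression_by_sample(sample))
```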
 
5 RNA isoform data
RNA levels in 64 cell lines and 37 tissues based on RNA-seq. The tab-separated file includes Ensembl gene identifier ("Gene"), Ensembl transcript identifier ("Transcript"), analysed sample ("Sample") and transcripts per million ("TPM"). The data is based on The Human Protein Atlas version 18.1 and Ensembl version 88.38.

transcript_rna_tissue.tsv.zip
TSV-file, 73.7 MB
transcript_rna_celline.tsv.zip
TSV-file, 51.9 MB
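
Since the isoform files are comparatively large, it can be convenient to stream rows straight from the zip archive instead of extracting it first. The sketch below assumes that each archive contains a single UTF-8 encoded TSV member, which is not confirmed above; `iter_zipped_tsv` and `rows_for_gene` are hypothetical helper names.

```python
import csv
import io
import zipfile


def iter_zipped_tsv(zip_path, member=None):
    """Yield rows as dicts from a TSV file inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the first member is the TSV file unless one is named.
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text, delimiter="\t")


def rows_for_gene(zip_path, gene):
    """Collect all rows whose "Gene" column equals the given Ensembl ID."""
    return [row for row in iter_zipped_tsv(zip_path) if row["Gene"] == gene]
```

For example, `rows_for_gene("transcript_rna_tissue.tsv.zip", "ENSG00000134057")` would return the matching rows once the archive has been downloaded to the working directory.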
 
6 Data from the Human Protein Atlas in tab-separated format
This file contains a subset of the data in the Human Protein Atlas version 18.1 corresponding to the data seen in the search result. This data can also be downloaded for a resulting gene set when using the search function (via the TSV link on the result page).

proteinatlas.tsv.zip
TSV-file (zip compressed), 1.5 MB
 
7 Data from the Human Protein Atlas in XML format
The XML file contains most of the data in the Human Protein Atlas version 18.1, including protein expression data (in normal and tumor tissues and in cell lines), antigen sequences, Western blot data for antibodies, protein array data for antibodies, RNA-seq data, external references such as UniProt identifiers, and more. The data is based on Ensembl version 88.38. The file structure is presented in the XSD schema. This data can also be downloaded for a resulting gene set when using the search function (via the XML link on the result page).
The XML file presented here is compressed with gzip due to its size. It can be uncompressed with an archive program such as 7-Zip.

proteinatlas.xml.gz
XML-file (gzip compressed), 261.8 MB
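
Given the size of this file, reading it as a stream may be preferable to uncompressing and loading it in full. The following is a minimal sketch using only Python's standard library. The element name to iterate over is a caller-supplied assumption: consult the XSD schema for the actual names. `iter_elements` is a hypothetical helper.

```python
import gzip
import xml.etree.ElementTree as ET


def iter_elements(gz_source, tag):
    """Yield each element named `tag` from a gzip-compressed XML source.

    `gz_source` may be a file path or a binary file object.
    """
    with gzip.open(gz_source, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Compare the local name so a namespace prefix does not matter.
            if elem.tag.rsplit("}", 1)[-1] == tag:
                yield elem
                # Drop the element's children and text once the caller is done.
                elem.clear()
```

A call such as `iter_elements("proteinatlas.xml.gz", "entry")` would then process one element at a time, where "entry" stands in for whichever element name the schema defines.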
 
8 Data from the Human Protein Atlas in RDF format
This file contains a subset of the data in the Human Protein Atlas version 18.1 corresponding to the tissue annotations at the gene level. This data can also be downloaded for a resulting gene set when using the search function (via the RDF link on the result page). This RDF release is a beta and will be extended and developed in coming releases. We thank Mark Thompson, Rajaram Kaliyaperumal and Eelke van der Horst (LUMC, The Netherlands), and Christine Chichester (SIB, Switzerland) for providing templates for generating the first beta release of HPA nanopublications. Their contribution was made possible by IMI project Open PHACTS and EU FP7 project RD-Connect. This beta was developed within an ELIXIR collaboration.

proteinatlas.trig.gz
RDF trig-file (gzip compressed), 86 MB
 
9 Cell graphic
Schematic cell containing all structures annotated within the Human Protein Atlas.

cell.svg
SVG-file (vectorized graphic), 530.4 KB