The esophagus-specific proteome

The main function of the esophagus is to transport swallowed food and liquids to the stomach. This approximately 25 cm long tube consists of outer layers of striated and smooth muscle, for mechanical propulsion of food, and an inner mucosa lined by non-cornified squamous epithelium. The transcriptome analysis shows that 69% (n=13499) of all human proteins (n=19613) are expressed in the esophagus and that 251 of these genes show elevated expression in the esophagus compared to other tissue types. Gene Ontology analysis shows that a majority of the corresponding proteins are related to epithelial cell function, and that these proteins are also expressed in other squamous mucosa, including oral mucosa, vagina and exocervix.

  • 43 esophagus enriched genes
  • Most of the tissue enriched genes encode proteins involved in epithelial function
  • 251 genes defined as elevated in the esophagus
  • Most group enriched genes share expression with skin

Figure 1. The distribution of all genes across the five categories based on transcript abundance in esophagus as well as in all other tissues.

In total, 251 genes show elevated expression in the esophagus compared to other tissues. The three categories of genes with elevated expression in esophagus compared to other organs are shown in Table 1. Table 2 lists the 12 genes with the highest level of enriched expression among the 43 enriched genes.

Table 1. Number of genes in the subdivided categories of elevated expression in esophagus.

Category | Number of genes | Description
Tissue enriched | 43 | At least five-fold higher mRNA levels in a particular tissue as compared to all other tissues
Group enriched | 100 | At least five-fold higher mRNA levels in a group of 2-7 tissues
Tissue enhanced | 108 | At least five-fold higher mRNA levels in a particular tissue as compared to average levels in all tissues
Total | 251 | Total number of elevated genes in esophagus
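The three definitions in Table 1 can be illustrated with a small sketch. It is not the Human Protein Atlas pipeline: the function name `elevated_category`, the input layout (one dictionary of TPM values per gene) and the choice to form candidate groups from the highest-expressing tissues are assumptions made for illustration, and the input must contain at least two tissues.

```python
from statistics import mean

FOLD = 5.0  # five-fold threshold taken from Table 1


def elevated_category(tpm, tissue, max_group=7):
    """Return the Table 1 category of `tissue` for one gene, or None.

    `tpm` maps tissue name -> transcript level (TPM) for a single gene
    (hypothetical input layout, at least two tissues).
    """
    level = tpm[tissue]
    if level <= 0:
        return None
    others = [v for t, v in tpm.items() if t != tissue]

    # Tissue enriched: at least five-fold higher than every other tissue.
    if level >= FOLD * max(others):
        return "tissue enriched"

    # Group enriched (assumed procedure): take the 2-7 highest-expressing
    # tissues and test whether their average level is at least five-fold
    # higher than the highest level outside the group.
    ranked = sorted(tpm, key=tpm.get, reverse=True)
    for size in range(2, min(max_group, len(ranked) - 1) + 1):
        group, rest = ranked[:size], ranked[size:]
        if tissue in group:
            group_mean = mean(tpm[t] for t in group)
            if group_mean >= FOLD * max(tpm[t] for t in rest):
                return "group enriched"

    # Tissue enhanced: at least five-fold higher than the all-tissue average.
    if level >= FOLD * mean(tpm.values()):
        return "tissue enhanced"
    return None


# Hypothetical TPM values, for illustration only.
print(elevated_category({"esophagus": 100.0, "skin": 10.0, "liver": 1.0}, "esophagus"))
```

The categories are tested in order of decreasing specificity, so a gene that satisfies several definitions is reported under the strictest one.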

The list of tissue enriched genes (n=43) includes previously characterized genes with cellular locations and functions well in line with the function of the esophagus, as well as a large number of genes with unknown function and expression pattern.

Table 2. The 12 genes with the highest level of enriched expression in esophagus. "Predicted localization" shows the classification of each gene into three main classes: Secreted, Membrane, and Intracellular, where the latter consists of genes without any predicted membrane or secreted features. "mRNA (tissue)" shows the transcript level as TPM values. The TS-score (Tissue Specificity score) is calculated as the fold change relative to the tissue with the second highest expression level.

Gene | Description | Predicted localization | mRNA (tissue) | TS-score
MUC22 | mucin 22 | Membrane | 20.2 | 116
KRT6C | keratin 6C | Intracellular | 525.3 | 50
MUC21 | mucin 21, cell surface associated | Membrane | 596.4 | 49
CAPN14 | calpain 14 | Intracellular | 88.2 | 26
KRT4 | keratin 4 | Intracellular | 14861.9 | 25
IGFL1 | IGF like family member 1 | Secreted | 97.7 | 22
KRT13 | keratin 13 | Intracellular | 35138.5 | 20
CRNN | cornulin | Intracellular | 4518.8 | 20
UGT1A7 | UDP glucuronosyltransferase family 1 member A7 | Intracellular, Membrane | 61.4 | 13
SPRR1A | small proline rich protein 1A | Intracellular | 9968.1 | 12
TGM3 | transglutaminase 3 | Intracellular | 1513.9 | 12
TGM1 | transglutaminase 1 | Intracellular | 831.8 | 12
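As a rough illustration of the TS-score described in the caption of Table 2 (fold change to the second highest tissue), the following sketch divides the level in the tissue of interest by the highest level among all other tissues. The function name `ts_score`, the input layout and the example values are hypothetical, and returning infinity when no other tissue expresses the gene is a choice made here, not something stated in the text.

```python
def ts_score(tpm, tissue):
    """Fold change of `tissue` over the highest level among the other tissues.

    `tpm` maps tissue name -> TPM for one gene (hypothetical input layout).
    """
    second_highest = max(v for t, v in tpm.items() if t != tissue)
    if second_highest == 0:
        # Assumed convention: no expression elsewhere gives an unbounded score.
        return float("inf")
    return tpm[tissue] / second_highest


# Hypothetical TPM values, not taken from Table 2.
example = {"esophagus": 500.0, "skin": 20.0, "tonsil": 5.0}
print(ts_score(example, "esophagus"))  # 25.0
```

For a tissue enriched gene the tissue of interest has the highest level, so the maximum over the remaining tissues is the second highest level overall.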

Some of the proteins predicted to be membrane-spanning are intracellular, e.g. in the Golgi or mitochondrial membranes, and some of the proteins predicted to be secreted can potentially be retained in a compartment belonging to the secretory pathway, such as the ER, or remain attached to the outer surface of the cell membrane by a GPI anchor.

The esophagus transcriptome

An analysis of the expression levels of each gene made it possible to calculate the relative mRNA pool for each of the categories. The analysis shows that 65% of the mRNA molecules in the esophagus correspond to housekeeping genes and 26% of the mRNA pool corresponds to genes categorized as either esophagus enriched, group enriched, or enhanced in esophagus. Thus, most of the transcriptional activity in the esophagus is related to proteins with presumed housekeeping functions, as they are found in all tissues and cells analyzed.
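The relative mRNA pool per category can be thought of as the share of total transcript abundance contributed by the genes in each category. The sketch below assumes that TPM values are summed per category and divided by the overall sum; the function name, the two input dictionaries and the toy values are hypothetical and only chosen to mirror the percentages in the text.

```python
from collections import defaultdict


def mrna_pool_fractions(tpm_by_gene, category_by_gene):
    """Return the fraction of the total TPM contributed by each category."""
    totals = defaultdict(float)
    for gene, tpm in tpm_by_gene.items():
        totals[category_by_gene[gene]] += tpm
    grand_total = sum(totals.values())
    return {category: value / grand_total for category, value in totals.items()}


# Toy input with hypothetical gene identifiers and TPM values.
tpm_by_gene = {"GENE_A": 650.0, "GENE_B": 260.0, "GENE_C": 90.0}
category_by_gene = {
    "GENE_A": "expressed in all tissues",
    "GENE_B": "elevated in esophagus",
    "GENE_C": "mixed",
}
print(mrna_pool_fractions(tpm_by_gene, category_by_gene))
```

Because the fractions are weighted by abundance, a small number of very highly expressed genes can account for a large share of the pool.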

Gene Ontology analysis of the esophagus enriched genes (n=43) shows an overrepresentation of genes related to cell envelope organization, external encapsulating structure organization, keratinization, keratinocyte differentiation, and epithelial and epidermal cell differentiation. Compared to skin, which shares many features with the esophagus, Gene Ontology analysis shows that the major difference is that the esophagus does not have top-hit gene categories associated with water homeostasis and melanin biosynthesis.

Protein expression of genes elevated in esophagus

In-depth analysis of the elevated genes in esophagus using antibody-based proteomics allowed us to create an overview of the localization of the corresponding proteins. A large number of these proteins have functions related to squamous differentiation and are thus often also shared with other tissue types that are composed of squamous epithelia.

Proteins specifically expressed in esophagus

The inner lining of the esophagus is made up of a glycoprotein-rich mucosal squamous epithelium that lacks the outer layer of cornified cells found in the skin. Like most squamous epithelia, the esophagus expresses a variety of keratin intermediate filament proteins whose function is to provide structural integrity between the cells. Among structural proteins, keratin 4 (KRT4), keratin 6 (KRT6A, KRT6B and KRT6C), keratin 13 (KRT13), and keratin 32 (KRT32) showed high enrichment, together with the calcium-binding proteins cornulin (CRNN) and S100A14. KRT13 is primarily expressed in mucosal epithelia, as is MUC21, which is observed in esophageal epithelial cells. An interesting enriched protein is the alcohol-degrading enzyme ADH7, which is observed in mucinous epithelial cells of the esophagus and stomach.

KRT6A - Keratin 6
CRNN - Cornulin

Proteins specifically expressed in esophageal muscle

Among the genes that show enrichment in the esophagus but do not show protein expression in the epithelial cells are two muscle-specific genes: MYH8 and NKX6-1. Whereas MYH8 is a well-known muscle-specific gene that is group enriched in esophagus and skeletal muscle, the transcription factor NKX6-1 appears to be specifically expressed in muscles in the esophagus and has not previously been described in this tissue.

Genes shared between esophagus and other tissues

There are 100 group enriched genes expressed in the esophagus. Group enriched genes are defined as genes showing an at least five-fold higher average level of mRNA expression in a group of 2-7 tissues, including esophagus, compared to all other tissues.

In order to illustrate the relation of esophagus tissue to other tissue types, a network plot was generated, displaying the number of genes shared between different tissue types.

Figure 2. An interactive network plot of the esophagus enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of esophagus enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 3 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.
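The gene counts behind such a network can be sketched as a simple tally of tissue combinations. The example below assumes a mapping from each group enriched gene to the set of tissues in its group; the function name, the gene identifiers and the tissue sets are hypothetical. It follows the caption in keeping only combinations of up to 3 tissues.

```python
from collections import Counter


def shared_gene_counts(group_by_gene, focus="esophagus", max_tissues=3):
    """Count group enriched genes per combination of partner tissues.

    `group_by_gene` maps gene -> set of tissues in its enriched group
    (hypothetical input layout). Groups that lack `focus` or contain more
    than `max_tissues` tissues are skipped.
    """
    counts = Counter()
    for gene, tissues in group_by_gene.items():
        tissues = frozenset(tissues)
        if focus in tissues and 2 <= len(tissues) <= max_tissues:
            counts[tissues - {focus}] += 1
    return counts


# Hypothetical genes and groups, for illustration only.
groups = {
    "GENE_A": {"esophagus", "skin"},
    "GENE_B": {"esophagus", "skin"},
    "GENE_C": {"esophagus", "skin", "breast"},
    "GENE_D": {"esophagus", "tonsil", "skin", "vagina"},  # more than 3 tissues
    "GENE_E": {"skin", "breast"},  # does not include the esophagus
}
for partners, n in shared_gene_counts(groups).items():
    print(sorted(partners), n)
```

Each key of the returned counter corresponds to one orange node in the plot, and its value to the number of genes displayed within that node.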

The esophagus shares a striking number of transcripts (n=37) with skin, a tissue whose squamous epithelial structure is highly similar to that of the esophagus. Many of these skin/esophagus group enriched genes belong to gene families known to be important for normal squamous epithelial function. Gene Ontology-based analysis of these 37 genes shared between esophagus and skin reveals that the top shared categories are related to epidermal and epithelial development, as well as keratinocyte and epidermal cell differentiation.

The tonsil also has squamous epithelium components (in addition to its lymphocyte-containing center), and several transcripts are shared between esophagus and tonsil (n=9). Examples of these esophagus and tonsil group enriched proteins include the calcium-binding protein S100A2 and the proteinase and peptidase inhibitors CSTA and SPINK7.

Several genes expressed in both skin and esophagus have previously been well characterized in both tissue types as encoding proteins important for the normal differentiation and function of squamous epithelia, e.g. keratins including keratin 5 (KRT5), 15 (KRT15), and 31 (KRT31), and genes related to cell adhesion and squamous differentiation (e.g. desmoplakin 1 (DSP), envoplakin (EVPL), desmocollin 3 (DSC3), SLURP1 and KLK8).

Like other keratins, the type I cytokeratin 15 (KRT15) is important for the structural integrity of epithelial cells. KRT15 is group enriched in esophagus, skin and breast.

KRT15 - esophagus
KRT15 - skin
KRT15 - breast

The secreted LY6/PLAUR domain containing 1 (SLURP1) protein is a member of the Ly6/uPAR family of proteins but lacks a GPI-anchoring signal sequence. SLURP1 is suggested to be involved in late differentiation and is predominantly expressed in the granular layer of skin. Moreover, SLURP1 has been identified in several biological fluids such as sweat, saliva, tears, and urine. This secreted protein is thought to exert antitumor activity.

SLURP1 - skin
SLURP1 - esophagus

Esophagus function

The esophagus is the part of the gastrointestinal canal that connects the mouth with the stomach. In contrast to the rest of the digestive system, the esophagus does not have any absorptive or digestive functions. Anatomically, it is continuous with the back of the oral cavity and pharynx and runs downward through the diaphragm for approximately 20-30 cm until it reaches the stomach.

During swallowing, food is pressed from the mouth and pharynx into the esophagus. The swallowing reflex then opens the upper esophageal sphincter muscle to allow entry of food into the esophagus, and the epiglottis folds down to prevent food from entering the trachea and respiratory organs. The smooth muscles lining the length of the esophagus then contract rhythmically to help push the food towards the lower esophageal sphincter muscle, which opens to allow entry of food into the stomach. Both the upper and lower sphincter muscles are constricted by default, except during swallowing or vomiting. The lower sphincter muscle also protects the esophagus from the acidic contents and digestive enzymes of the stomach.

Esophagus histology

The esophagus has the same general gross anatomical and histological organization as the rest of the gastrointestinal tract, with an outer muscular layer, a submucosa, a muscularis mucosae layer, and a lamina propria that surrounds the inner "tubing", which in the case of the esophagus consists of a stratified squamous mucinous epithelium. However, since the esophagus is located outside of the abdominal cavity, it has no mesothelial covering. Instead, the outermost layer is covered by connective tissue, the so-called adventitia.

The innermost part is the esophageal epithelium, which has a quite rapid turnover of cells due to the continuous wear and tear of food ingestion. Like most epithelial tissues, cell renewal takes place in the basal part of the epithelium and as new cells are generated, older cells lose contact with the basal membrane and are pushed towards the surface. At the beginning, cells close to the basal layer appear columnar with round nuclei, but as cells mature and detach, they are pushed towards the apical layer of the epithelium, and gradually differentiate into flattened and tightly coupled cells.

From the inside out, the squamous epithelium rests on the lamina propria, which consists of loose connective tissue and focal lymphocytes. After this layer comes the lamina muscularis mucosae, which is composed of smooth muscle cells, followed by the submucosal layer, which is composed of loose connective tissue containing mucus-secreting glands, small blood vessels and lymphocytes. After the submucosa comes the tunica muscularis, which is composed of an inner layer of circular muscles followed by externally located longitudinal muscle fibers. In the third of the esophagus that is closest to the mouth, the external layer is composed of skeletal muscle; in the middle third, it contains a mixture of smooth and skeletal muscle; and in the third closest to the stomach, it contains only smooth muscle.

The histology of human esophagus including detailed images and information can be viewed in the Protein Atlas Histology Dictionary.


Here, the protein-coding genes expressed in the esophagus are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize protein expression patterns of proteins that correspond to genes with elevated expression in the esophagus.

Transcript profiling and RNA-data analyses based on normal human tissues have been described previously (Fagerberg et al., 2014). Analyses of mRNA expression including over 99% of all human protein-coding genes were performed using deep RNA sequencing of 172 individual samples corresponding to 37 different human normal tissue types. RNA sequencing results of 3 fresh frozen tissues representing normal esophagus were compared to 169 other tissue samples corresponding to 36 tissue types, in order to determine genes with elevated expression in esophagus. A tissue-specific score, defined as the ratio between mRNA levels in esophagus and the mRNA levels in all other tissues, was used to divide the genes into different categories of expression. These categories include: genes with elevated expression in esophagus, genes expressed in all tissues, genes with a mixed expression pattern, genes not expressed in esophagus, and genes not expressed in any tissue. Genes with elevated expression in esophagus were further sub-categorized as i) genes with enriched expression in esophagus, ii) genes with group enriched expression including esophagus and iii) genes with enhanced expression in esophagus.
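The five expression categories listed above can be sketched as a simple decision sequence. This is a minimal illustration rather than the published procedure: the detection cutoff of 1 TPM is an assumption (the text does not state one), the function name and category labels are hypothetical, and whether a gene is elevated is passed in as a flag so that the five-fold rules are not repeated here.

```python
DETECTION_CUTOFF = 1.0  # assumed TPM cutoff; not given in the text


def expression_category(tpm, tissue, is_elevated):
    """Assign one gene to one of the five categories for `tissue`.

    `tpm` maps tissue name -> TPM for one gene (hypothetical layout);
    `is_elevated` states whether the gene met any elevated-expression rule.
    """
    detected = {t for t, v in tpm.items() if v >= DETECTION_CUTOFF}
    if not detected:
        return "not detected in any tissue"
    if tissue not in detected:
        return "not detected in " + tissue
    if is_elevated:
        return "elevated in " + tissue
    if len(detected) == len(tpm):
        return "expressed in all tissues"
    return "mixed"


# Hypothetical TPM values, for illustration only.
print(expression_category({"esophagus": 12.0, "skin": 9.0, "liver": 15.0}, "esophagus", False))
```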

Human tissue samples used for protein and mRNA expression analyses were collected and handled in accordance with Swedish laws and regulations and obtained from the Department of Pathology, Uppsala University Hospital, Uppsala, Sweden, as part of the sample collection governed by the Uppsala Biobank. All human tissue samples used in the present study were anonymized in accordance with the approval and advisory report from the Uppsala Ethical Review Board.

Relevant links and publications

Uhlén M et al, 2015. Tissue-based map of the human proteome. Science
PubMed: 25613900 DOI: 10.1126/science.1260419

Yu NY et al, 2015. Complementing tissue characterization by integrating transcriptome profiling from the Human Protein Atlas and from the FANTOM5 consortium. Nucleic Acids Res.
PubMed: 26117540 DOI: 10.1093/nar/gkv608

Fagerberg L et al, 2014. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics.
PubMed: 24309898 DOI: 10.1074/mcp.M113.035600

Histology dictionary - the esophagus