Confidence and Uncertainty - A multilabel AI-based model for evaluating protein expression in testis

Schematic study overview from Ghoshal et al. (2021).

In a study led by researchers at the HPA and Brunel University London, a novel method for automated annotation of immunohistochemistry images was developed to annotate cell type-specific protein expression in 8 different cell types of the human testis. The work comprised 7848 images (corresponding to 2794 proteins), and the image classifier also provided a novel uncertainty metric (called DeepHistoClass) for identifying manual annotation errors. The workflow can be adapted to other tissues or used in large-scale protein mapping efforts to source high-quality data.

For several decades, immunohistochemistry (IHC) methods have served as reliable and robust tools for studying protein expression in diseased and healthy tissues. IHC provides valuable information on the spatial distribution of a protein at the compartment, cellular and subcellular level, and also in the context of neighbouring cells and relevant histological structures. The standard method for evaluating IHC protein staining is still a rather subjective task, heavily reliant on the expertise of the manual observer. Manual annotation is, however, error-prone and time-consuming, which makes it challenging to align IHC datasets with other quantitative methods such as RNA-seq and scRNA-seq.

To speed up this process, the application of advanced deep learning and neural network models has received increased attention in biomedical research, for example within the field of digital pathology. In the present investigation, the researchers took advantage of the high-quality IHC testis dataset in the Human Protein Atlas (HPA) and applied a Hybrid Bayesian Neural Network (HBNet), which not only recognized staining patterns at the cell type-specific level for eight different cell types, but also provided a novel uncertainty metric that combines uncertainty with the predictive label probability. This means that the model can show which images it classifies reliably, and also highlight those with manual annotation errors.
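To make the idea of combining uncertainty with the predicted label probability more concrete, here is a minimal sketch. It assumes the model can be run several times with stochastic behaviour (for example Monte Carlo dropout), which yields a list of probabilities per label. The function name `confidence_score` and the formula itself are illustrative assumptions, not the published DeepHistoClass definition.

```python
from statistics import mean, pstdev


def confidence_score(samples):
    """Combine the mean predicted probability with its spread.

    `samples` holds the probabilities that one label received over
    repeated stochastic forward passes. Hypothetical formula, used
    here only to illustrate the principle.
    """
    p = mean(samples)
    # Values in [0, 1] have a population standard deviation of at
    # most 0.5, so dividing by 0.5 rescales the spread to [0, 1].
    uncertainty = pstdev(samples) / 0.5
    # Probability of the predicted class: p if positive, else 1 - p.
    predicted_positive = p >= 0.5
    class_prob = p if predicted_positive else 1.0 - p
    # High probability and low spread give a score close to 1.
    return predicted_positive, class_prob * (1.0 - uncertainty)


# Consistent passes give a high score; conflicting passes a low one.
print(confidence_score([0.90, 0.92, 0.88]))
print(confidence_score([0.10, 0.95, 0.40]))
```

Under these assumptions, a label that is predicted consistently across passes keeps a score near its probability, while a label whose probability fluctuates between passes is pushed towards 0.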

"When evaluating our model performance at cell type-specific level, we achieved at least 80 percent accuracy for all cell types. In general, the HBNet showed a higher accuracy compared to a standard deep neural network model," said Dr. Cecilia Lindskog, senior author of the paper.

Across all cell types, the false positive rate was lower than the false negative rate, indicating that the model performed better at accurately detecting positive labels but more often differed from the human observer when classifying cell types as negative. Dr. Lindskog explains: "We could see that the human observer more often neglects weakly stained structures and patterns in the testis tissues, probably because they are considered unspecific or artifactual." The uncertainty metric, named DeepHistoClass (DHC), yields a score between 0 and 1, where correctly classified images have scores closer to 1 and misclassified images tend to have low scores, closer to 0. In general, the model predictions showed a low level of uncertainty.
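The comparison above can be sketched in a few lines: false positive and false negative rates are computed per cell type against the manual annotation, and images with a low confidence score are flagged for manual review. The function names, the example labels and the threshold of 0.5 are hypothetical and chosen only for illustration; the study's own cut-offs are not given here.

```python
def error_rates(manual, predicted):
    """Return (false_positive_rate, false_negative_rate) for one cell type.

    `manual` and `predicted` are equal-length lists of 0/1 labels,
    one entry per image.
    """
    pairs = list(zip(manual, predicted))
    fp = sum(1 for m, p in pairs if m == 0 and p == 1)
    fn = sum(1 for m, p in pairs if m == 1 and p == 0)
    negatives = manual.count(0)
    positives = manual.count(1)
    # Guard against division by zero when a class is absent.
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def flag_for_review(scores, threshold=0.5):
    """Indices of images whose confidence score is below `threshold`."""
    return [i for i, s in enumerate(scores) if s < threshold]


# Made-up labels for a single cell type across five images.
manual = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0]
print(error_rates(manual, predicted))

# Made-up confidence scores; low values are candidates for review.
print(flag_for_review([0.95, 0.20, 0.70, 0.40]))
```

In a multilabel setting such as this one, `error_rates` would be called once per cell type, so that each of the eight labels gets its own pair of rates.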

In summary, the study suggests a feasible framework that not only increases the consistency and quality of protein expression annotation, but also catches misclassifications so that individual errors can be addressed by manual inspection.

The paper was published online on 20 August 2021, and the full text can be accessed in the journal Molecular & Cellular Proteomics.

Feria Hikmet Noraddin