Evolution of Biomarker Validation
Overview
Most recently, an international team of experts from the International Society for Immunohistochemistry and Molecular Morphology (ISIMM) and the International Network for Quality in Pathology (IQN Path) developed new concepts and a new framework for biomarker validation in immunohistochemistry (IHC). Validation is “confirmation, through the provision of objective evidence, that requirements for a specific intended use or application have been fulfilled” (ISO 9000). “Intended use” and “intended application” are synonymous with “purpose”, meaning that validation will result in a test that is fit-for-purpose. Provision of “objective evidence” is fulfilled by collecting evidence that relevant test performance characteristics have been identified and assessed. Validation provides a “high degree of assurance” that these needs will be met. This presentation will explore how these three components of validation (intended use, objective evidence, and high degree of assurance) can be incorporated into the validation of biomarkers for in situ cell-based assays such as IHC.
Learning Objectives:
- Overview of terminology and definitions relevant to biomarker validation for in situ, cell-based assays.
- Describe components and processes for IHC biomarker validation.
- Describe tools required for biomarker validation.
Webinar Transcription
In this presentation, we will briefly review the terminology and definitions that are critical for conceptualizing and conducting validation of biomarkers, as applied to cell-based in situ environments, such as immunohistochemistry.
Validation of immunohistochemistry biomarkers has several important components and processes. They will be described here today.
Finally, as we all know, any trade requires good tools. We will briefly review the different possible choices that are currently available when choosing the right tools for immunohistochemistry biomarker validation.
I will use this slide to illustrate why immunohistochemistry is not a quantitative assay, but rather, a descriptive assay.
The fundamental design of the immunohistochemistry reaction is shown in this diagram. What we see is that immunohistochemistry is an in situ immunoassay. The specific primary antibody binds to its target epitope, or antigen, if you will, on an unstained section.
The word “immunohistochemistry” has three components: “immuno”, referring to the antibody-antigen reaction; “histo”, because histology tissue sections are used; and “chemistry”, which consists of developing the chemical reaction at the end of the process.
In most immunohistochemistry tests, we would not be able to see any signals coming from the primary antibody without using amplification.
Amplification means that once the reaction between the primary antibody and antigen has occurred, we make it visible by light microscopy using various molecules that are bound to the secondary antibody, as depicted here.
Therefore, the chromogen that will develop at the site of the reaction is not directly linked to the primary antibody. This can be achieved in several ways.
Avidin-biotin-based systems were popular in the past. But the fact that human tissue contains endogenous biotin was problematic. Other systems were developed in order to increase the specificity of the reaction so that we do not get false-positive results in cells that contain endogenous biotin.
Today, there are several different types of amplification systems, also called detection systems, which are based on synthetic polymers and monomers. Such polymers do not exist in human tissues.
And because of that, it is easy to ensure that our immunohistochemistry reactions will be specific, and that the brown signal deposited in the tissue really reflects the specific binding of the primary antibody to its target antigen in the cells.
Once the brown signal has developed (other signals exist, but let’s refer to DAB because it’s the most commonly used brown signal in this type of reaction), it can be further emphasized and made darker. This step is called enhancement. And it’s used at the end of the immunohistochemistry reaction.
Many times, enhancement is achieved by using copper sulfate. In the past, we used an osmium-based compound. And now, we have the amplification step that occurs after the reaction between the primary antibody and the target antigen. Immunohistochemistry is, in some ways, actually very similar to a basic PCR reaction.
The purpose of this very lengthy description of an immunohistochemistry assay is to emphasize that the intensity of signal at the end of the immunohistochemistry assay, most of the time, does not reflect the amount of antigen present in the cells.
The intensity of the signal definitely relates, in one way or another, to the amount of expressed protein. But it is only one of the variables. And how much it contributes to the final staining result is, most of the time, unknown. And it is not intuitive.
The second component is the composition of the amplification system. Amplification may have one or more steps. More steps will lead to more amplification.
After that, the next component is the time we allow for the chromogen to develop. If this time is very short, then the intensity may be weaker than if we allow several minutes for this process to occur.
Finally, we can significantly change the intensity of signal if we add an enhancement with copper sulfate, which will make the DAB appear very dark brown or almost black.
We have now established that the immunohistochemistry assays generate results that lack proportionality to the amount of protein that we are detecting in this assay.
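As a rough illustration of the variables just listed, here is a toy sketch in Python. It is not a model of any real assay: the function `toy_signal_intensity`, its parameters, and every number in it (doubling per amplification step, scaling to a 10-minute development time, doubling with enhancement) are assumptions made up for illustration. It only shows how the same amount of antigen can produce very different final intensities depending on the protocol.

```python
def toy_signal_intensity(antigen, amplification_steps=1, chromogen_minutes=5.0, enhanced=False):
    """Return a relative stain intensity between 0 and 1 (toy model, not a real assay)."""
    # Assumption: each amplification step doubles the signal.
    signal = antigen * (2 ** amplification_steps)
    # Assumption: signal grows with chromogen development time, scaled to 10 minutes.
    signal *= chromogen_minutes / 10.0
    # Assumption: enhancement doubles the apparent intensity.
    if enhanced:
        signal *= 2.0
    # Staining saturates: it cannot get darker than fully dark.
    return min(signal, 1.0)


same_antigen = 0.2  # identical (hypothetical) protein content in every section
protocols = {
    "short protocol": toy_signal_intensity(same_antigen),
    "with enhancement": toy_signal_intensity(same_antigen, enhanced=True),
    "two-step, long development": toy_signal_intensity(
        same_antigen, amplification_steps=2, chromogen_minutes=10.0
    ),
}
for name, intensity in protocols.items():
    print(f"{name}: {intensity:.2f}")
```

In this sketch the antigen value never changes, yet the three hypothetical protocols report three different intensities, so intensity alone cannot be read back as the amount of protein.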
Therefore, we can say with certainty that immunohistochemistry is a descriptive assay, because the definition of a descriptive assay is that “They generate practical data that lack proportionality to the amount of analyte in a sample.”
Immunohistochemistry is a cell-based, in situ immunological assay. But let us remember that it is not the only immunological assay that is cell-based. Flow cytometry is cell-based, but it is not in situ. ELISA is an immunological assay, but it’s neither cell-based nor in situ.
Some in situ assays are not immunological. For example, DNA and RNA in situ hybridizations are cell-based and in situ, but are not immunological in nature.
This image illustrates that even if you could somehow make immunohistochemistry more quantitative, meaning that the test result would be more proportional to the amount of expressed protein, there is a descriptive component that we cannot remove from an immunohistochemistry assay.
Immunohistochemistry is always descriptive, because pathologists interpret cells as positive only if the signal localization is appropriate.
This Hodgkin lymphoma shows that the Hodgkin cells are negative for BCL6 and that many cells in the background are positive for BCL6. We do not use any type of points for this readout. We simply look at Hodgkin’s cells, which we can recognize by size and shape, and read out absence of brown signal in all nuclei of the malignant cells.
However, readouts would not be any more correct if we were to use any quote, unquote “cutoff” for the number of positive cells. Even if all cells in the background are positive, which may account for 90% of cells, we still read this result as negative, since the signal is not where it should be.
This in situ nature of immunohistochemistry assay should never be neglected.
Let’s go back to enhancement and examine what it does to immunohistochemistry results. Question for the audience. Do you use enhancements of the immunohistochemistry reaction at the end of your protocol?
Do you know that intensity of staining in general, on average, could be approximately double of what it would be without the enhancement?
The immunohistochemistry protocol on the left was performed without enhancement. The one on the right did use enhancement. There’s the same amount of protein in both sections. But the signal is much darker on the right than on the left.
We have to pause for a moment and contemplate what this means if we were to use image analysis to measure the intensity of signal. Would our image analysis results have anything to do with the actual protein content in this tissue?
Clearly, enhancement is not the only factor that can cause misleading results with image analysis if we would compare the results from two different laboratories using different protocols.
Unfortunately, there are other parameters that also interfere with image analysis, but this is beyond the scope of this presentation.
Let us conclude now that an immunohistochemistry test is not a quantitative test, and that the sensitivity of immunohistochemistry does not correlate well with intensity of signal, since intensity is dependent on many factors other than the amount of protein.
Another question for the audience: which chromogranin test shows higher analytical sensitivity? This image is taken from the NordiQC website. They have many useful and illustrative images.
This is pancreas. And we see that islets of Langerhans stained stronger on the right than on the left. Do not think that this is a trick question. It is not.
If you are not sure of your answer, please feel free to let your mind go. In other words, which test would you prefer to have in your lab?
The answer to this question lies not in the islets of Langerhans, but elsewhere. Instead of focusing on the islets, we must look to what is happening in tissues that we know to have a low level of expression of chromogranin.
The cross section of the appendix depicted here in row two shows axons, which are demonstrated as positive on the left but negative on the right.
Similarly, this fully differentiated neuroendocrine tumor with a low expression of chromogranin, depicted in the bottom row, shows positivity on the left and is negative on the right.
Therefore, the immunohistochemistry protocol on the left is more sensitive, despite being lighter in signal intensity, and despite the fact that both protocols use the same primary antibody.
How is this possible? It is possible because these two laboratories use differently designed immunohistochemistry protocols.
This image helps to reinforce a statement about proportionality between the analyte and the results we achieved with an immunoassay.
This is a comparison between immunofluorescence and immunohistochemistry. It is theoretically much easier to develop an immunofluorescence immunoassay that is linear, than to develop an immunohistochemistry assay that is even partially linear.
We have mentioned that quantitative assays are proportional. Proportional and linear relationships both create a straight line on a graph. A proportional assay starts at 0, while a linear assay may start at +1, +2, +5, or anything else on the y-axis.
Linearity means that if there is more antigen in the tissue, then there will be more intense signal. This is not so for immunohistochemistry. We see that immunohistochemistry is not linear. And it’s certainly not proportional.
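The distinction between proportional and linear can be written down as a small check. This is only a sketch: the function name `describe_relationship` and the sample (antigen, signal) pairs are hypothetical, and real staining data would be noisy and would need a proper fit rather than an exact straight-line test.

```python
def describe_relationship(points, tol=1e-9):
    """Classify (antigen, signal) pairs as 'proportional', 'linear', or 'neither'.

    Assumes at least two points and that the first two have different antigen values.
    """
    (x0, y0), (x1, y1) = points[0], points[1]
    slope = (y1 - y0) / (x1 - x0)
    intercept = y0 - slope * x0
    # Every point must sit on the same straight line to count as linear.
    on_line = all(abs(y - (slope * x + intercept)) <= tol for x, y in points)
    if not on_line:
        return "neither"
    # A straight line through the origin is proportional; otherwise it is only linear.
    return "proportional" if abs(intercept) <= tol else "linear"


print(describe_relationship([(1, 2), (2, 4), (3, 6)]))    # straight line through 0
print(describe_relationship([(1, 3), (2, 5), (3, 7)]))    # straight line starting at +1
print(describe_relationship([(1, 1), (2, 3), (3, 3.5)]))  # curve that flattens out
```

The third, made-up data set flattens out at higher antigen values, which is the kind of behavior that makes a staining curve neither proportional nor linear.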
The concern is that we do not know what this curve looks like for any of our clinical immunohistochemistry assays.
In this example, the same lung cancer was stained by the 5A4 clone for ALK.
These three laboratories that participated in the Canadian immunohistochemistry quality control proficiency testing program achieved a very different intensity of signal, irrespective of the fact that this is the same tumor, with the same tissue processing, and the same clone.
We wonder which of these tests has the highest diagnostic sensitivity. In other words, is intensity of signal related to our ability to properly stratify patients for targeted therapy? Can we say that the most intensely stained sample was tested with the most diagnostically sensitive assay?
We cannot. All three tests showed the same diagnostic sensitivity in this proficiency testing run. All three protocols correctly identified the patients that were positive for ALK, and therefore, eligible for treatment with crizotinib. This was also confirmed by FISH.
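To make the idea concrete, diagnostic sensitivity can be computed as true positives divided by all reference-positive cases, and staining intensity does not enter the calculation at all. The sketch below uses invented data: the reference calls, the laboratory names, and the intensity values are hypothetical and are not the results of the proficiency testing program described here.

```python
def diagnostic_sensitivity(test_calls, reference_calls):
    """Fraction of reference-positive cases that the test also called positive."""
    true_pos = sum(1 for t, r in zip(test_calls, reference_calls) if t and r)
    false_neg = sum(1 for t, r in zip(test_calls, reference_calls) if not t and r)
    if true_pos + false_neg == 0:
        raise ValueError("no reference-positive cases to evaluate")
    return true_pos / (true_pos + false_neg)


# Hypothetical reference result per case (for example, FISH positive or negative).
reference = [True, True, False, True, False]

# Hypothetical labs: same positive/negative calls, very different mean intensity.
labs = {
    "Lab A": {"mean_intensity": 0.9, "calls": [True, True, False, True, False]},
    "Lab B": {"mean_intensity": 0.5, "calls": [True, True, False, True, False]},
    "Lab C": {"mean_intensity": 0.3, "calls": [True, True, False, True, False]},
}
for name, lab in labs.items():
    sensitivity = diagnostic_sensitivity(lab["calls"], reference)
    print(f"{name}: intensity {lab['mean_intensity']}, sensitivity {sensitivity:.2f}")
```

In this made-up example the three laboratories differ threefold in intensity, but because they make the same calls, their diagnostic sensitivity is identical.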
Pathologists typically prefer to see stronger signals that are easier to interpret. But the reality is that we should not be too harsh on protocols that still perform as they should for clinical practice, even if we may not like how they look.
Papers like this one do not do any justice to immunohistochemistry. This is not the only paper that states that immunohistochemistry may not be such a good test for precision medicine.
Most such publications use arguments that immunohistochemistry does not measure precisely the amount of expressed protein and that this is a problem. Let us examine this more and see just how big of a problem it really is.
We are looking at an immunohistochemistry slide of a patient with acute myeloid leukemia. The immunohistochemistry test is NPM1. As a hematopathologist, I’m only interested in whether I can see signal in the cytoplasm.
Why? Because normal non-mutated nucleophosmin is 100% in the nucleus, while mutated NPM1 is not so efficient between - - in the nucleus. And it’s also present in the cytoplasm.
We can see that in this case, the cytoplasm is staining. And therefore, we can conclude that NPM1 is mutated. The test is validated to work like this. If we performed DNA sequencing for the presence of this mutation, we would get the same results.
Do I want to know exactly how much of the mutated protein is there? No, I do not. Clinically, it makes no difference whether the amount of protein is many times more or less. In this validated assay, it only matters that the presence of cytoplasmic staining correlates 100% with mutated NPM1. No measurement. No matter.
In the case of lung cancer with an ALK translocation and the AML patient with NPM1 mutation, the exact intensity of signal does not determine whether we are getting the correct result.
In lung cancer, we would stratify the patients properly for crizotinib, irrespective of the intensity of staining.
And in AML, the patients with an NPM1 mutation do better than those without. We would still stratify the patients for appropriate therapeutic considerations based on the presence or absence of signal in the cytoplasm, without any measurement of the intensity of signal.
This paper from 2006 illustrates that a question of quantification in immunohistochemistry is a big one and cannot simply be dismissed.
Please consider that every qualitative test ultimately has a quantitative component because of the cutoff point, especially at the point where the reaction switches from being negative to being positive.
This point is essentially quantitative in nature. However, this does not mean that the test needs to be proportional or linear or that we must measure how much protein is there.
It only means that it is essential that a proper threshold be achieved between a positive and negative sample.
From the published literature, we know that clinical trials did not show that more intense staining for PD-L1 in lung cancer correlated better with survival of patients with lung cancer.
What the trial showed is that if more than 50% of cells of the tumor are positive, there is a good correlation with positive response to treatment with pembrolizumab. Different cutoff points, such as 1% cutoff point or other may be applicable to other drugs.
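A cutoff-based readout of this kind can be sketched in a few lines. The function name `is_positive` and the cell counts are hypothetical, and whether a value exactly at the cutoff counts as positive is an assumption here (treated as positive); that detail would have to follow the scoring rules of the assay used in the relevant trial. Note that staining intensity is not an input.

```python
def is_positive(positive_tumor_cells, total_tumor_cells, cutoff_percent):
    """Call a case positive when the share of positive tumor cells reaches the cutoff."""
    if total_tumor_cells <= 0:
        raise ValueError("total_tumor_cells must be greater than zero")
    percent_positive = 100.0 * positive_tumor_cells / total_tumor_cells
    # Assumption: a value exactly at the cutoff is called positive.
    return percent_positive >= cutoff_percent


# Hypothetical counts: the same tumor read against two different cutoff points.
print(is_positive(10, 100, cutoff_percent=50))  # 10% positive cells against a 50% cutoff
print(is_positive(10, 100, cutoff_percent=1))   # 10% positive cells against a 1% cutoff
print(is_positive(60, 100, cutoff_percent=50))  # 60% positive cells against a 50% cutoff
```

The same hypothetical tumor changes category when the cutoff changes, which is why the cutoff has to come from the clinical trial for the specific drug.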
We come back now to Hodgkin lymphoma and the BCL6 immunohistochemistry test. I only want to know that Hodgkin’s cells are negative, or at least that most, or many, are negative for BCL6.
Measuring the amount of BCL6 here is similar to measuring the amount of NPM1 in AML and does not make sense. Having lots of NPM1 in the nucleus is irrelevant if it’s not present in the cytoplasm. Having lots in the cytoplasm, rather than less, is also irrelevant.
Therefore, each immunohistochemistry test is based on the biology of the underlying disease, as well as specific applications, which ultimately depend, especially in the area of targeted therapies, on how these markers were used in clinical trials.
I would not recommend that we try to make immunohistochemistry look better than it looked in the clinical trials, even if that was not very good. The best test for targeted therapy is the one that was used in the clinical trial, irrespective of whether we were happy or not happy with these types of tests.
It is important to remember this distinction about what we consider to be quote, unquote “better”. To develop a test that is better-looking than the one used in the original clinical trials, we can use all sorts of interesting lab tweaks.
However, to develop a test that is better at predicting which patients will respond to a particular therapy than the one used in the original clinical trial, this would require that we conduct a new clinical trial altogether.
What we consider quote, unquote “better” and easier to work with in pathology, and what is better from a clinically meaningful perspective, could be completely different.
This image from the NordiQC site illustrates that it is possible to develop an immunohistochemistry assay that represents what is happening at the genetic level. The breast cancers are both labeled with ISH.
The one on the left shows no amplification of the HER2 gene, but the one on the right does show amplification. We see how well the intensity of the complete membranous staining correlates in these two examples.
More amplification of the gene, more protein expression is demonstrated by immunohistochemistry. Therefore, some immunohistochemistry tests are more linear than others.
And to keep it so, when that really matters, such as in this example, we need to follow international guidelines, such as the ASCO-CAP guidelines, regarding pre-analytical, analytical, and post-analytical components.
However, linearity is not always useful. Sometimes, it is very difficult or impossible to develop a more linear test without affecting the robustness of the test.
By this, I mean that when immunohistochemistry tests are finely tuned to be linear, they may lose analytical sensitivity for some samples, especially those that had suboptimal pre-analytical conditions.
The results with