Avoiding collider bias in Mendelian randomization when performing stratified analyses
- Objective: The paper introduces a method for conducting stratified Mendelian randomization (MR) analyses without inducing collider bias, which arises when the stratifying variable is a collider, such as a variable influenced by both the exposure and the outcome.
- Methodology (Residual Collider): The solution is to construct a residual collider (\(\tilde{C}\)) by regressing the original collider (\(C\)) on the genetic instrument (\(G\)). Stratifying the MR analysis by quantiles of this residual (\(\tilde{C}\)) preserves the independence of the instrument and the stratifying variable, thus ensuring the validity of the causal estimate.
- Conclusion: In the simulation settings considered, stratification by the residual collider gave unbiased causal estimates, so it allows causal effect heterogeneity to be investigated across population subgroups defined by the collider.
PubMed: 35639294 DOI: 10.1007/s10654-022-00879-0 Overview generated by: Gemini 2.5 Flash, 28/11/2025
Key Findings: Correcting Collider Bias in MR Stratification
This paper addresses a methodological problem in Mendelian Randomization (MR): the bias introduced when analyses are stratified by a collider, that is, a variable influenced by two or more other variables, such as the risk factor (exposure) and the outcome. The authors propose a method for stratified MR that avoids this bias.
- The Problem of Collider Bias in MR: A naive MR analysis performed within strata of the population defined by a collider (e.g., performing MR for the effect of Smoking on Heart Disease, stratified by a variable influenced by both Smoking and Heart Disease) induces collider bias, leading to invalid causal estimates.
- The Solution: Residual Collider Stratification: The proposed solution is to construct a new variable called the residual collider. This is calculated as the residual from a regression of the original collider on the genetic instrument(s) for the exposure.
- Mechanism: Stratifying the population by quantiles of this residual collider ensures that the genetic instrument remains independent of the conditioning variable, thereby preserving the core assumption of MR and avoiding the introduction of collider bias.
- Interpretation of Stratified Estimates: Estimates stratified on the residual collider will typically have an interpretation equivalent to that of estimates stratified on the original collider. Researchers can therefore still investigate heterogeneity of the causal effect across population subgroups defined by the collider, without the estimates being subject to collider bias.
- Simulation Validation: Extensive simulation studies demonstrated that:
  - Naive MR stratification by a collider led to substantial and consistent bias.
  - Stratification by the proposed residual collider removed the collider bias and produced unbiased causal estimates across the scenarios considered, which varied the strength of the instrument and the characteristics of the collider variable.
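The contrast between the two stratification schemes can be illustrated with a small simulation. The sketch below is not the authors' simulation code: the data-generating model, the effect sizes, the single allele-count instrument, and the helper names (`wald_ratio`, `stratified_estimates`) are all assumptions chosen for illustration. It generates data with a constant causal effect of 0.3, then compares ratio estimates within quartiles of the collider against estimates within quartiles of the residual collider.

```python
import numpy as np

rng = np.random.default_rng(2022)
n = 200_000
alpha, beta = 0.5, 0.3  # assumed G->X effect and true causal X->Y effect

# Assumed data-generating model (illustrative, not taken from the paper)
G = rng.binomial(2, 0.3, size=n).astype(float)  # allele count at one variant
U = rng.normal(size=n)                            # unmeasured confounder
X = alpha * G + U + rng.normal(size=n)            # exposure
Y = beta * X + U + rng.normal(size=n)             # outcome
C = 0.5 * X + 0.5 * Y + rng.normal(size=n)        # collider: depends on X and Y


def wald_ratio(g, x, y):
    """Ratio estimate: G-Y covariance divided by G-X covariance."""
    return np.cov(g, y)[0, 1] / np.cov(g, x)[0, 1]


def stratified_estimates(g, x, y, strat_var, n_strata=4):
    """Ratio estimate within each quantile stratum of strat_var."""
    cuts = np.quantile(strat_var, np.linspace(0, 1, n_strata + 1)[1:-1])
    stratum = np.digitize(strat_var, cuts)  # labels 0 .. n_strata - 1
    return np.array([
        wald_ratio(g[stratum == k], x[stratum == k], y[stratum == k])
        for k in range(n_strata)
    ])


# Residual collider: residual from a least-squares regression of C on G
slope, intercept = np.polyfit(G, C, 1)
C_res = C - (intercept + slope * G)

naive = stratified_estimates(G, X, Y, C)         # strata of the collider
residual = stratified_estimates(G, X, Y, C_res)  # strata of the residual collider

print("true effect:          ", beta)
print("strata of C:          ", np.round(naive, 3))
print("strata of residual C: ", np.round(residual, 3))
```

Under these assumptions, the estimates within quartiles of `C` should sit well away from 0.3, while those within quartiles of the residual should be close to 0.3 in every stratum, up to sampling error. The reason is that stratifying on `C` induces an association between `G` and the confounder `U` within each stratum, whereas the residual carries no information about `G`.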
Study Design and Methods
Study Design
This was a methodological study that combined simulations with an applied example to show how collider bias arises in stratified MR analyses and to evaluate a new approach for avoiding it.
Methodology: Residual Collider
- Define Collider (\(C\)): The variable used for stratification is a collider, influenced by the exposure (\(X\)) and the outcome (\(Y\)).
- Define Genetic Instrument (\(G\)): The genetic variant(s) used as instrumental variables (IVs) for the exposure in the MR analysis.
- Calculate Residual Collider (\(\tilde{C}\)): The key step is a least-squares regression of the collider \(C\) on the genetic instrument \(G\): \[C = \gamma_0 + \gamma G + \tilde{C}\] where \(\gamma_0\) is the intercept and \(\tilde{C}\) is the residual collider. As a least-squares residual, \(\tilde{C}\) is uncorrelated with the instrument \(G\) by construction.
- Stratified MR: The MR analysis (e.g., the inverse-variance weighted (IVW) method) is then performed separately within strata (e.g., quartiles or quintiles) defined by \(\tilde{C}\).
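The steps above can be sketched in code for individual-level data. This is an illustrative outline under stated assumptions, not the authors' implementation: it assumes a one-sample setting with several mutually independent variants, uses a fixed-effect IVW estimate built from per-variant regressions, and all function names (`residual_collider`, `slope_and_se`, `ivw`, `stratified_ivw`) and the synthetic data are hypothetical.

```python
import numpy as np


def residual_collider(C, G):
    """Residual from least-squares regression of C on all variants (with intercept)."""
    design = np.column_stack([np.ones(len(C)), G])
    coef, *_ = np.linalg.lstsq(design, C, rcond=None)
    return C - design @ coef


def slope_and_se(g, v):
    """Simple linear regression of v on one variant g: slope and standard error."""
    gc = g - g.mean()
    vc = v - v.mean()
    sxx = gc @ gc
    slope = (gc @ vc) / sxx
    resid = vc - slope * gc
    se = np.sqrt((resid @ resid) / (len(v) - 2) / sxx)
    return slope, se


def ivw(G, X, Y):
    """Fixed-effect IVW estimate from per-variant G-X and G-Y associations."""
    n_variants = G.shape[1]
    bx = np.array([slope_and_se(G[:, j], X)[0] for j in range(n_variants)])
    by, se_y = np.array([slope_and_se(G[:, j], Y) for j in range(n_variants)]).T
    w = 1.0 / se_y**2
    return np.sum(w * bx * by) / np.sum(w * bx**2)


def stratified_ivw(G, X, Y, C, n_strata=4):
    """IVW estimate within each quantile stratum of the residual collider."""
    c_res = residual_collider(C, G)
    cuts = np.quantile(c_res, np.linspace(0, 1, n_strata + 1)[1:-1])
    stratum = np.digitize(c_res, cuts)  # labels 0 .. n_strata - 1
    return [ivw(G[stratum == k], X[stratum == k], Y[stratum == k])
            for k in range(n_strata)]


# Synthetic example (all values are assumptions; true causal effect is 0.5)
rng = np.random.default_rng(0)
n, n_variants = 100_000, 5
G = rng.binomial(2, 0.3, size=(n, n_variants)).astype(float)
U = rng.normal(size=n)                         # unmeasured confounder
X = G @ np.full(n_variants, 0.3) + U + rng.normal(size=n)
Y = 0.5 * X + U + rng.normal(size=n)
C = X + rng.normal(size=n)                     # variable downstream of the exposure

estimates = stratified_ivw(G, X, Y, C)
print([round(float(e), 3) for e in estimates])
```

Regressing `C` on all variants jointly keeps the residual uncorrelated with each of them, which is what the stratification relies on. With summary-level data or correlated variants, the per-stratum estimation step would need a different estimator than the simple one assumed here.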
Data Application
- Real-World Application: The method was applied to investigate the causal effect of smoking on bladder cancer in strata of the population defined by bodyweight. The stratum-specific MR estimates showed a trend across bodyweight levels that suggested stronger effects of smoking on bladder cancer among individuals with lower bodyweight.
Conclusions and Recommendations
The study concludes that researchers can validly investigate the heterogeneity of causal effects across strata of a population in MR studies, provided they use the novel residual collider method.
- Guidance for MR Practice: The authors strongly recommend that researchers avoid naive MR stratification by a collider and instead implement the residual collider approach when studying effect heterogeneity related to a post-exposure factor.
- Generalizability: The methodology is flexible and can be applied across various MR methods and scenarios where a valid instrument is available, making it a broadly useful tool for causal inference in epidemiology.
- Future Work: Further research is suggested to extend the residual collider approach to non-linear MR models and scenarios involving multiple stratifying variables.