Multi-INTACT: integrative analysis of the genome, transcriptome, and proteome identifies causal mechanisms of complex traits
- Method: Multi-INTACT is a novel statistical framework that jointly analyzes GWAS, eQTL, and pQTL summary statistics to model the causal chain from genetic variant \(\rightarrow\) gene expression \(\rightarrow\) protein level \(\rightarrow\) complex trait.
- Causal Partitioning: The method partitions GWAS heritability and identifies which molecular layer (transcriptome or proteome) mediates the genetic effect, indicating that many effects are mediated primarily by protein levels.
- Impact: Applied to complex traits such as lipid levels, Multi-INTACT confirmed known genes and revealed novel gene-trait associations, with mechanistic evidence for the specific regulatory path driving each GWAS signal.
PubMed: 39901160 DOI: 10.1186/s13059-025-03480-2 Overview generated by: Gemini 2.5 Flash, 27/11/2025
Background and Objective
Genome-Wide Association Studies (GWAS) have localized thousands of single nucleotide polymorphisms (SNPs) associated with complex traits. However, most of these variants reside in non-coding regions, making it challenging to identify the effector genes and the complete molecular cascade by which they influence disease. It is widely hypothesized that genetic variants primarily act by perturbing molecular intermediate traits, such as gene expression (transcriptomics) and protein levels (proteomics).
This paper introduces Multi-INTACT (Integrative Analysis of Causal Transcriptome and Proteome), a novel statistical framework designed to:
1. Jointly analyze GWAS, eQTL (expression quantitative trait loci), and pQTL (protein quantitative trait loci) summary statistics.
2. Systematically identify the causal chain from genetic variant to gene expression to protein level, and finally to the complex trait outcome.
3. Quantify the proportion of GWAS heritability mediated by both transcriptional and proteomic regulation.
Methods: The Multi-INTACT Framework
Data Integration and Causal Modeling
Multi-INTACT extends the functionality of the existing INTACT method by incorporating three molecular layers: genome, transcriptome, and proteome.
- SNP-to-Gene Expression (eQTL) and SNP-to-Protein (pQTL) Mapping: The method simultaneously estimates the joint effects of all SNPs in a locus on both gene expression and protein abundance.
- Causal Chain Analysis: Multi-INTACT employs a multi-mediator causal model based on the Inverse-Variance Weighted (IVW) method from Mendelian Randomization (MR). It models the complex trait (outcome) as being causally influenced by protein levels (direct mediator) and gene expression (indirect mediator). This allows the tool to distinguish between genetic effects that flow through expression, through protein, or through a combination.
- Heritability Partitioning: The framework also quantifies the proportion of the GWAS heritability explained by the genetic effects on gene expression (\(h^2_E\)) and on protein levels (\(h^2_P\)).
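To make the causal chain analysis above more concrete, here is a minimal sketch of a two-exposure (multivariable) IVW estimate, with gene expression and protein level as exposures and the complex trait as the outcome. It is an illustration under stated assumptions, not the authors' implementation: the function name `multivariable_ivw`, the synthetic effect sizes, and the assumption of independent SNPs with known GWAS standard errors are all hypothetical.

```python
import numpy as np

def multivariable_ivw(beta_expr, beta_prot, beta_gwas, se_gwas):
    """Weighted least squares (no intercept) of SNP-trait effects on
    SNP-expression and SNP-protein effects, weighted by 1 / se_gwas**2.
    Assumes independent SNPs; a sketch, not the Multi-INTACT model."""
    X = np.column_stack([beta_expr, beta_prot])
    w = 1.0 / np.asarray(se_gwas) ** 2
    XtW = X.T * w                      # apply one weight per SNP
    cov = np.linalg.inv(XtW @ X)       # covariance of the two estimates
    est = cov @ (XtW @ np.asarray(beta_gwas))
    se = np.sqrt(np.diag(cov))
    return est, se

# Synthetic summary statistics (hypothetical values for illustration only).
rng = np.random.default_rng(0)
n_snps = 200
beta_expr = rng.normal(0.0, 0.1, n_snps)                     # eQTL effects
beta_prot = 0.5 * beta_expr + rng.normal(0.0, 0.1, n_snps)   # pQTL effects
se_gwas = np.full(n_snps, 0.01)

# Ground truth in this toy setup: the trait depends on protein only.
true_theta = np.array([0.0, 0.3])
beta_gwas = (true_theta[0] * beta_expr
             + true_theta[1] * beta_prot
             + rng.normal(0.0, se_gwas))

est, se = multivariable_ivw(beta_expr, beta_prot, beta_gwas, se_gwas)
print("expression effect:", est[0], "+/-", se[0])
print("protein effect:   ", est[1], "+/-", se[1])
```

In this toy setup the estimated expression effect should sit near zero and the protein effect near 0.3, which is the kind of separation between layers that the multi-mediator model is described as providing.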
Application
The method was applied to publicly available summary statistics for seven complex traits (e.g., triglycerides, HDL cholesterol) and integrated with large-scale multi-omics datasets from various tissues (e.g., adipose, muscle, liver).
Key Results and Findings
Causal Genes and Mediation
- Significant Causal Proteins: Multi-INTACT identified numerous proteins whose plasma levels showed evidence of a causal effect on the complex traits studied.
- Transcriptional and Proteomic Mediation: The framework partitioned the heritability across layers, indicating that many GWAS signals are mediated by changes in molecular phenotypes. For lipids, for example, the method identified regulatory effects flowing from genetic variants, through gene expression, and then through apolipoprotein levels to affect the final trait.
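The variant to expression to protein to trait chain described above can be illustrated with a small simulation. This is a toy sketch on synthetic individual-level data, not the summary-statistics procedure used in the paper: the path coefficients (0.5, 0.8, 0.4) and the helper `ols` are assumptions chosen for illustration. It shows why joint modeling matters: expression looks associated with the trait on its own, but the association is attributed to protein once both layers are modeled together.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Simulated chain: genotype -> expression -> protein -> trait.
genotype = rng.binomial(2, 0.3, n).astype(float)
expr = 0.5 * genotype + rng.normal(size=n)
prot = 0.8 * expr + rng.normal(size=n)
trait = 0.4 * prot + rng.normal(size=n)

def ols(y, *cols):
    """Ordinary least squares with an intercept; returns the slopes only."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

# Expression alone: picks up the full indirect effect (about 0.8 * 0.4).
marginal_expr = ols(trait, expr)[0]

# Expression and protein jointly: the effect is assigned to protein.
joint_expr, joint_prot = ols(trait, expr, prot)

print("expression only:      ", marginal_expr)
print("joint, expression:    ", joint_expr)
print("joint, protein:       ", joint_prot)
```

Because expression affects the trait only through protein in this simulation, the joint expression coefficient shrinks toward zero while the protein coefficient recovers the simulated value.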
Novel Regulatory Mechanisms
The joint modeling allowed the authors to uncover specific regulatory mechanisms that would be missed by single-omics or simpler MR approaches:
- Protein-Mediated Effects: The analysis confirmed the involvement of genes known to be related to the traits (e.g., LPL and APOE for lipids) and refined the causal path, often showing that the effects are primarily mediated by the protein level, not just the mRNA level.
- Novel Loci: Multi-INTACT uncovered several novel gene-trait associations and provided mechanistic evidence that these associations are driven by molecular regulation.
Conclusions and Significance
Multi-INTACT provides a statistical framework for causal integration of tri-omics data (genome, transcriptome, proteome). It moves beyond association to identify the molecular mechanism underlying GWAS hits.
The ability to jointly model and partition heritability across the transcriptional and proteomic layers is crucial for pinpointing the exact effector genes and the most proximal molecular target for therapeutic intervention, accelerating the translation of GWAS findings into clinical insights.