Protein-metabolite association studies identify novel proteomic determinants of metabolite levels in human plasma

causal inference
mendelian randomization
metabolomics
multi-omics
proteomics
systems biology
  • Approach: A large-scale multi-omics resource was created by meta-analyzing proteomic and metabolomic data from three cohorts, followed by the use of Mendelian Randomization (MR) with pQTLs to infer causal relationships.
  • Causal Findings: The study identified 224 putative causal associations between 95 proteins and 96 metabolites, including novel links such as a putative causal role for ADAMTSL3 in regulating BCAA metabolites.
  • Validation: Over 50% of the top causal findings were experimentally validated through metabolomic profiling of mouse knockout strains, providing in vivo support for the in silico MR results.
Published: 23 January 2026

PubMed: 37582364
DOI: 10.1016/j.cmet.2023.07.012
Overview generated by: Gemini 2.5 Flash, 27 November 2025

Background and Objective

Circulating levels of proteins and metabolites in human plasma reflect the physiological state of an individual and are strongly associated with the risk of various complex diseases, particularly cardio-metabolic disorders. While many associations have been identified, distinguishing between causal relationships (where a protein concentration change directly causes a metabolite level change) and confounded associations (where both are affected by a third factor) remains a major challenge.
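The difference between a causal and a confounded association can be illustrated with a small simulation. The sketch below is not taken from the study: the effect sizes, the three-level genotype, and the names `simulate` and `slope` are assumptions chosen for illustration. The protein has no causal effect on the metabolite, yet the two are correlated because both depend on an unmeasured factor `u`. A genetic variant that affects only the protein serves as an instrument, and the ratio of its two regression slopes recovers an estimate close to zero.

```python
import random


def simulate(n=20000, causal_effect=0.0, seed=1):
    """Toy data: genotype g, protein p, metabolite m, shared confounder u."""
    rng = random.Random(seed)
    g, p, m = [], [], []
    for _ in range(n):
        g_i = rng.choice([0, 1, 2])           # toy allele count at a pQTL (assumed)
        u = rng.gauss(0, 1)                   # unmeasured third factor
        p_i = 0.5 * g_i + u + rng.gauss(0, 1)             # protein level
        m_i = causal_effect * p_i + u + rng.gauss(0, 1)   # metabolite level
        g.append(g_i)
        p.append(p_i)
        m.append(m_i)
    return g, p, m


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


g, p, m = simulate()

# Observational estimate: biased away from zero by the confounder u.
naive_slope = slope(p, m)

# Instrumental-variable (ratio) estimate: genotype is independent of u,
# so the ratio of slopes isolates the protein-to-metabolite effect.
iv_estimate = slope(g, m) / slope(g, p)

print(f"naive slope: {naive_slope:.3f}")
print(f"IV estimate: {iv_estimate:.3f}")
```

In this toy setup the naive slope is clearly positive even though the simulated causal effect is zero, while the instrument-based estimate stays near zero. The instrument is only valid here because the simulated genotype affects the metabolite through the protein alone.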

The primary objective of this study was to integrate proteomic, metabolomic, and genomic data using Mendelian Randomization (MR) to systematically identify putative causal relationships between circulating proteins and metabolites in human plasma.

Study Methods and Data Integration

The study employed a large-scale, multi-stage, multi-omics approach:

  1. Data Cohorts: Quantitative proteomic (1,302 proteins) and metabolomic (365 metabolites) data were meta-analyzed across three large population studies: the Jackson Heart Study (JHS); the Multi-Ethnic Study of Atherosclerosis (MESA); and the Health, Risk Factors, Exercise Training and Genetics (HERITAGE) Family Study. Together these totaled 3,626 individuals.
  2. Pairwise Association Analysis: The study first identified approximately 172,000 significant pairwise correlations between proteins and metabolites across the three cohorts.
  3. Causal Inference via Mendelian Randomization (MR): To address confounding, two-sample MR was applied using genetic instruments derived from protein quantitative trait loci (pQTLs), specifically variants located near the genes encoding 535 proteins (i.e., cis-pQTLs). These variants served as instrumental variables to assess the causal effect of protein levels (exposure) on metabolite levels (outcome).
  4. Meta-Analysis and Sensitivity Analyses: Causal estimates were meta-analyzed across the three studies. Sensitivity analyses (MR-Egger, weighted median) were performed to check whether the estimates were robust to horizontal pleiotropy, where a variant affects the metabolite through a route other than the protein.
  5. In Vivo Validation: To provide biological proof-of-concept, the top-ranking protein-to-metabolite causal associations were validated in vivo using metabolomic profiling of mouse knockout strains for the corresponding genes.
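The MR and meta-analysis steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's pipeline: it uses a single-variant Wald ratio per cohort and a fixed-effect inverse-variance-weighted pooling, the function names (`wald_ratio`, `ivw_meta`, `two_sided_p`) are hypothetical, and every number in `cohorts` is made up. MR-Egger and weighted-median estimators need several variants per protein and are not shown.

```python
import math


def wald_ratio(beta_gx, beta_gy, se_gy):
    """Single-variant MR estimate of the exposure-to-outcome effect.

    beta_gx: variant effect on the protein (exposure)
    beta_gy: variant effect on the metabolite (outcome)
    se_gy:   standard error of beta_gy
    The SE uses a first-order approximation that ignores uncertainty
    in beta_gx.
    """
    estimate = beta_gy / beta_gx
    se = se_gy / abs(beta_gx)
    return estimate, se


def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted pooling of (estimate, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical per-cohort summary statistics for one protein-metabolite pair:
# (beta_gx, beta_gy, se_gy). These values are invented for illustration.
cohorts = {
    "cohort_1": (0.50, 0.10, 0.04),
    "cohort_2": (0.40, 0.10, 0.05),
    "cohort_3": (0.60, 0.09, 0.03),
}

per_cohort = [wald_ratio(*stats) for stats in cohorts.values()]
pooled, pooled_se = ivw_meta(per_cohort)
p_value = two_sided_p(pooled / pooled_se)

print(f"pooled estimate: {pooled:.4f} (SE {pooled_se:.4f}), p = {p_value:.2e}")
```

Pooling by inverse variance gives more weight to cohorts with more precise estimates. A fixed-effect model assumes the same underlying effect in every cohort; a random-effects model would be the usual alternative if the cohorts were expected to differ.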

Key Results and Findings

Causal Protein-Metabolite Associations

  • The MR analysis identified 224 putative causal associations between 95 proteins and 96 metabolites.
  • Novel Findings: Many of these putative causal links were novel, offering new insights into metabolic regulation. The analysis recovered the established causal role of Apolipoprotein C-III (ApoC3) in increasing triglycerides, and it also identified new links, such as a putative causal role for the protein ADAMTSL3 in regulating several plasma metabolites, particularly branched-chain amino acid (BCAA) metabolites.

Network and Pathway Insights

  • The study recovered known metabolic hubs and also identified novel protein-metabolite networks. Many of the putative causal links fell in lipid metabolism and amino acid catabolism, pathways that are known drivers of cardio-metabolic risk.

Conclusions and Significance

This research established a multi-omics-to-causality pipeline by integrating large-scale human proteomic, metabolomic, and genomic data. Using Mendelian Randomization, the study identified 224 putative protein-to-metabolite causal associations, and the top-ranking associations were then tested in vivo in mouse knockout strains.

These findings advance the understanding of the molecular determinants of metabolic traits and provide a resource for identifying novel therapeutic targets for cardiovascular and metabolic diseases.