Blood protein assessment of leading incident diseases and mortality in the UK Biobank
- Objective: To identify and validate protein biomarkers for the 10-year incidence of 23 age-related diseases and all-cause mortality using proteomics data from 47,600 UK Biobank participants.
- Key Result: Multi-protein risk scores (ProteinScores), developed using penalized Cox regression, significantly improved the Area Under the Curve (AUC) for predicting the 10-year onset of six major outcomes, including all-cause mortality, coronary artery disease (CAD), and Type 2 diabetes (T2D), even after adjusting for age, sex, and a comprehensive set of 24 clinical and lifestyle factors.
- Implication: The ProteinScores capture independent biological information related to underlying aging and systemic disease risk not found in standard clinical measures, validating the use of multi-protein panels for enhanced personalized risk stratification.
PubMed: 38987645; DOI: 10.1038/s43587-024-00655-7; Overview generated by: Gemini 2.5 Flash, 28/11/2025
Key Findings: Protein Scores Predict Incident Disease and Mortality
This large-scale study leveraged proteomics data from the UK Biobank (n=47,600) to identify and validate protein biomarkers for the risk of 23 common age-related diseases and all-cause mortality. The core finding is that multi-protein scores (“ProteinScores”) significantly enhance the prediction of incident diseases and mortality, even when accounting for comprehensive clinical and lifestyle information.
Discovery of Associations
- The study reported 3,209 associations between the levels of 963 unique plasma proteins and 21 of the 24 incident outcomes studied (23 diseases plus all-cause mortality).
- Cardiovascular disease (CVD), specifically coronary artery disease (CAD) and atrial fibrillation (AF), had the largest number of associated proteins, highlighting the strong systemic link between the proteome and heart health.
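To put these counts in context, the short calculation below relates them to the panel size (1,468 proteins) and the 24 outcomes given under Methods. It is only arithmetic on figures stated in this overview, and it assumes every protein was tested against every outcome, which the overview implies but does not state outright.

```python
# Figures taken from this overview; the full protein-by-outcome grid is an assumption.
n_proteins = 1468          # proteins measured on the Olink panel
n_outcomes = 23 + 1        # 23 diseases plus all-cause mortality
n_associations = 3209      # reported protein-outcome associations
n_associated_proteins = 963

n_pairs = n_proteins * n_outcomes                      # protein-outcome pairs tested
frac_pairs = n_associations / n_pairs                  # share of pairs with an association
frac_proteins = n_associated_proteins / n_proteins     # share of proteins with any association

print(f"pairs tested: {n_pairs}")                                    # 35232
print(f"pairs with an association: {frac_pairs:.1%}")                # 9.1%
print(f"proteins with at least one association: {frac_proteins:.1%}")  # 65.6%
```

Under that assumption, roughly one in eleven protein-outcome pairs and about two thirds of the measured proteins showed an association.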
Predictive Power of ProteinScores
- ProteinScores were developed using penalized Cox regression (elastic net) to combine multiple protein measurements into a single risk score for each outcome.
- When applied to independent test sets, these scores improved the Area Under the Curve (AUC) for the 10-year onset of incident outcomes beyond a reference model that already included age, sex, and a comprehensive set of 24 lifestyle and clinical factors.
- ProteinScores were validated for six major outcomes:
  - All-cause mortality
  - Coronary artery disease (CAD)
  - Atrial fibrillation (AF)
  - Type 2 diabetes (T2D)
  - Colorectal cancer (CRC)
  - Glaucoma
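For orientation, an elastic-net penalized Cox model of the kind described above is commonly written as follows. This is a textbook formulation, not one quoted from the paper, and the notation is assumed for illustration: \(x_i\) is the vector of protein levels for participant \(i\), \(\delta_i\) is 1 if the outcome was observed and 0 if the participant was censored, \(R(t_i)\) is the set of participants still at risk at time \(t_i\), and \(\lambda \ge 0\) and \(\alpha \in [0, 1]\) are tuning parameters whose values are not given in this overview.

\[
\ell(\beta) = \sum_{i:\,\delta_i = 1} \left[ x_i^\top \beta - \log \sum_{j \in R(t_i)} \exp\!\left( x_j^\top \beta \right) \right]
\]

\[
\hat{\beta} = \arg\max_{\beta} \left\{ \ell(\beta) - \lambda \left( \alpha \lVert \beta \rVert_1 + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right) \right\},
\qquad
\text{ProteinScore}_i = x_i^\top \hat{\beta}
\]

The \(\ell_1\) term shrinks many coefficients exactly to zero, which is how a method of this type selects a subset of proteins, while the \(\ell_2\) term stabilizes the weights of correlated proteins. Software packages differ in how they scale the partial log-likelihood \(\ell(\beta)\) and handle tied event times, so the expression should be read as a sketch of the general form.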
Methods and Design
Cohort and Data
- Participants: Up to 47,600 individuals from the UK Biobank.
- Proteome: Measured levels for 1,468 plasma proteins using the Olink Proximity Extension Assay (PEA) platform.
- Outcomes: Incident diagnoses for 23 age-related diseases and all-cause mortality, monitored over a 10-year follow-up period.
Statistical Modeling
- Individual Protein Associations: Cox proportional hazards (PH) models were used to test the association between each individual protein and each outcome.
- ProteinScore Development: Penalized Cox regression (elastic net) was run over 50 randomized training/testing iterations to select and weight the most predictive proteins, yielding a single ProteinScore for each outcome.
- Benchmarking: ProteinScore performance was compared with a reference model containing age, sex, and 24 clinical/lifestyle factors, using the difference in AUC (\(\Delta\)AUC); the 10-year AUC was the primary metric.
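The benchmarking step can be illustrated with a minimal, dependency-free sketch of \(\Delta\)AUC evaluated over 50 random test splits. Everything here is an assumption made for illustration: the data are simulated, the outcome is treated as a simple yes/no event within 10 years (censoring is ignored), the half-and-half stratified split is arbitrary, and the risk scores are precomputed rather than fitted, so this is not the authors' pipeline. The function names `auc` and `delta_auc` are hypothetical.

```python
import random
from statistics import median

def auc(scores, events):
    """Rank-based AUC: chance that a random case scores above a random non-case (ties count 0.5)."""
    cases = [s for s, e in zip(scores, events) if e == 1]
    controls = [s for s, e in zip(scores, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def delta_auc(baseline, combined, events):
    """AUC gained by the combined score over the baseline score on the same people."""
    return auc(combined, events) - auc(baseline, events)

# Simulated cohort (all values are synthetic).
rng = random.Random(0)
events, baseline, combined = [], [], []
for _ in range(400):
    clinical = rng.gauss(0, 1)   # stands in for risk from age, sex, and clinical covariates
    protein = rng.gauss(0, 1)    # stands in for a ProteinScore
    risk = clinical + protein + rng.gauss(0, 1)
    events.append(1 if risk > 1.5 else 0)
    baseline.append(clinical)
    combined.append(clinical + protein)

case_idx = [i for i, e in enumerate(events) if e == 1]
control_idx = [i for i, e in enumerate(events) if e == 0]

# 50 random test sets, each holding half of the cases and half of the non-cases.
deltas = []
for _ in range(50):
    test_idx = (rng.sample(case_idx, len(case_idx) // 2)
                + rng.sample(control_idx, len(control_idx) // 2))
    deltas.append(delta_auc([baseline[i] for i in test_idx],
                            [combined[i] for i in test_idx],
                            [events[i] for i in test_idx]))

print(f"median delta AUC over {len(deltas)} splits: {median(deltas):.3f}")
```

Summarizing the 50 per-split differences by their median (or another robust summary) keeps a single lucky or unlucky split from driving the conclusion. A real analysis would use a time-dependent AUC that accounts for censoring.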
Results: Biological Insights and Clinical Relevance
Mortality and Aging
The ProteinScore for all-cause mortality was consistently one of the strongest performers. The proteins contributing most to the mortality score were related to fundamental biological pathways such as inflammation (e.g., C-reactive protein, CRP) and cellular stress or damage. The study demonstrated that protein levels capture biological information about underlying aging processes that is independent of standard clinical risk factors.
Disease Specificity
- CAD and AF: ProteinScores for these cardiovascular outcomes showed significant predictive improvement, highlighting the utility of proteomics in risk stratification for heart disease.
- CRC and Glaucoma: The successful prediction of these conditions demonstrates that plasma proteomics can capture systemic signals of diseases that are often viewed as localized.
Conclusions and Recommendations
The study confirms that plasma protein levels are powerful predictors of future health outcomes and mortality. The ProteinScore approach provides a robust, validated, and clinically actionable method for integrating this proteomic information into risk stratification models. The authors advocate for the routine use of multi-protein panels to improve personalized risk assessment for major age-related diseases and overall longevity.