Post-transcriptional regulation across human tissues
- Core Finding: Scaled mRNA levels predict overall mean protein abundance across different genes (mean-level variability) but are poor predictors of the same protein’s level across different tissues (across-tissues variability).
- Statistical Insight: The overall high mRNA-protein correlation (\(R_T\)) is misleading, as it represents an instance of Simpson’s paradox where the strong inter-gene variability masks the weak intra-gene/across-tissue correlation (\(R_P\)).
- Conclusion: The reproducible, concerted variability in protein-to-mRNA ratios across tissues confirms that post-transcriptional regulation is a substantial and tissue-specific factor, likely contributing approximately 50% of the across-tissues protein variance.
PubMed: 28481885 DOI: 10.1371/journal.pcbi.1005535 Overview generated by: Gemini 2.5 Flash, 10/12/2025
Research Goal and Study Design
The primary goal of this research was to estimate the relative contributions of transcriptional and post-transcriptional regulation in shaping tissue-type-specific proteomes across human tissues. The study addressed the long-standing debate over whether protein levels are primarily set by mRNA levels or by other regulatory mechanisms.
Distinguishing Variability Sources
A core methodological distinction was made between two orthogonal sources of protein variability in the data, which often become conflated in correlation analyses:
1. Mean-Level Variability: The differences in the mean abundance of different proteins (e.g., highly abundant ribosomal proteins vs. less abundant signaling proteins). Scaled mRNA levels were found to account for most of this variability.
2. Across-Tissues Variability: The physiological differences in the abundance of the same protein across different tissue types. This variability, though smaller in magnitude, is critical for defining tissue identity.
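The two sources of variability can be separated with an ordinary between-group/within-group variance decomposition, where each gene is a group and each tissue is an observation. The sketch below is only an illustration of that idea, not the authors' code: the function name `decompose_variance`, the gene labels, and the log-scale protein values are all made up for the example.

```python
from statistics import mean

def decompose_variance(levels):
    """Split total variance into between-gene and within-gene parts.

    levels: dict mapping gene -> list of log protein levels, one per tissue.
    Returns (between, within, total), each a population variance.
    """
    all_values = [v for row in levels.values() for v in row]
    grand_mean = mean(all_values)
    n = len(all_values)
    # Mean-level variability: spread of per-gene means around the grand mean.
    between = sum(len(row) * (mean(row) - grand_mean) ** 2
                  for row in levels.values()) / n
    # Across-tissues variability: spread of each gene around its own mean.
    within = sum((v - mean(row)) ** 2
                 for row in levels.values() for v in row) / n
    total = sum((v - grand_mean) ** 2 for v in all_values) / n
    return between, within, total

# Hypothetical log abundances for two genes in three tissues.
toy = {
    "ribosomal_gene": [9.0, 9.2, 8.8],
    "signaling_gene": [3.0, 3.3, 2.7],
}
between, within, total = decompose_variance(toy)
print(f"between-gene share of variance: {between / total:.3f}")
```

In this toy input almost all of the variance is between genes, so a predictor that only captures each gene's mean level would still look excellent on a pooled correlation.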
Methods and Data
The authors performed a statistical analysis of large mRNA (RNA-seq) and protein (shotgun mass spectrometry) datasets measured across 12 different human tissues.
- Data Reliability Assessment: The authors rigorously estimated the reliability of relative mRNA and protein quantification. They found that low reliability, especially across studies, limited the accurate quantification of regulatory mechanisms for individual proteins, indicating that much of the noise was study-dependent.
- Consensus Dataset: To improve data quality, a consensus protein dataset was generated by appropriately combining data from independent mass spectrometry studies, resulting in estimates with increased reliability.
- Protein-to-mRNA (PTR) Ratios: The relative protein-to-mRNA ratio (rPTR) was used as a measure of post-transcriptional regulation to quantify the variability of functional gene sets across tissues.
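A relative protein-to-mRNA ratio can be sketched as follows. The assumption here, which may differ from the paper's exact definition, is that rPTR for a gene in a tissue is its log2 protein-to-mRNA ratio minus that gene's median log2 ratio across tissues. The function name `relative_ptr`, the tissue names, and the abundance values are hypothetical.

```python
from math import log2
from statistics import median

def relative_ptr(protein, mrna):
    """Return log2 rPTR per tissue for one gene.

    protein, mrna: dicts mapping tissue -> positive abundance.
    Assumed definition: log2(protein / mRNA), centered on the gene's
    median across tissues.
    """
    log_ptr = {t: log2(protein[t] / mrna[t]) for t in protein}
    center = median(log_ptr.values())
    return {t: value - center for t, value in log_ptr.items()}

# Hypothetical gene: constant mRNA, protein varying across tissues.
protein = {"liver": 80.0, "kidney": 20.0, "lung": 10.0}
mrna = {"liver": 10.0, "kidney": 10.0, "lung": 10.0}
rptr = relative_ptr(protein, mrna)
print(rptr)
```

Centering on the per-gene median removes the gene's overall translation and degradation level, so a nonzero value reflects only how a tissue departs from that gene's typical ratio.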
Key Findings and Simpson’s Paradox
The analysis highlighted a crucial statistical nuance, demonstrating how misleading total correlation values can be in this context.
The Simpson’s Paradox Illustration
- The overall correlation (\(R_T\)) between scaled mRNA and absolute protein levels across all genes and tissues was high (\(R_T^2 \approx 0.70\)).
- However, this correlation was primarily driven by the large mean-level variability between different proteins.
- When examining the across-tissues variability for any single gene (within-gene correlation, \(R_P\)), the correlation was often low or near zero.
- This discrepancy is an example of Simpson’s paradox, where a strong overall trend (high \(R_T\)) masks a much weaker, sometimes absent, trend within subgroups (low \(R_P\) for individual genes across tissues).
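The effect described in the bullets above can be reproduced with a small simulation. This is a sketch on synthetic data, not a reanalysis of the study: the number of genes, the noise scales, and the seed are arbitrary assumptions, and the tissue-specific deviations of mRNA and protein are drawn independently so that the true within-gene correlation is zero by construction.

```python
import random
from math import sqrt
from statistics import fmean, median

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def simulate(n_genes=200, n_tissues=12, seed=0):
    """Generate synthetic log mRNA and protein levels (genes x tissues)."""
    rng = random.Random(seed)
    mrna, protein = [], []
    for _ in range(n_genes):
        # Large mean-level variability, shared by mRNA and protein.
        gene_mean = rng.gauss(0.0, 2.0)
        protein_mean = gene_mean + rng.gauss(0.0, 0.5)
        # Small across-tissues deviations, independent for mRNA and protein.
        mrna.append([gene_mean + rng.gauss(0.0, 0.3)
                     for _ in range(n_tissues)])
        protein.append([protein_mean + rng.gauss(0.0, 0.3)
                        for _ in range(n_tissues)])
    return mrna, protein

mrna, protein = simulate()

# Pooled correlation over all gene-tissue pairs (analogous to R_T).
pooled_x = [v for row in mrna for v in row]
pooled_y = [v for row in protein for v in row]
r_total_sq = pearson(pooled_x, pooled_y) ** 2

# Per-gene correlation across tissues (analogous to R_P).
r_within = [pearson(m, p) for m, p in zip(mrna, protein)]
median_r_within = median(r_within)

print(f"pooled R^2: {r_total_sq:.2f}")
print(f"median within-gene r: {median_r_within:.2f}")
```

The pooled value should come out high while the median within-gene correlation stays near zero, which shows that a high pooled correlation by itself says little about how well mRNA tracks protein for one gene across tissues.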
Evidence for Post-transcriptional Regulation
- Weak Predictive Power: For any single gene, its protein levels across tissues were found to be poorly predicted by its corresponding mRNA levels, suggesting tissue-specific post-transcriptional regulation.
- Functionally Concerted Variability: The analysis of relative PTR (rPTR) showed substantial across-tissues variability that was concerted within functional gene sets and reproducible across independent datasets, which further supports the existence of extensive post-transcriptional control.
- Reliability-Corrected Estimates: After correcting for measurement noise and low data reliability, the results indicated that approximately 50% of the across-tissues protein variance could be attributed to transcriptional regulation, and approximately 50% was due to post-transcriptional regulation.
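One standard way to correct a correlation for measurement noise is Spearman's correction for attenuation, which divides the observed correlation by the square root of the product of the two reliabilities. Whether this matches the authors' exact procedure is an assumption here, and the input numbers below are hypothetical values chosen for easy arithmetic, not figures from the study.

```python
from math import sqrt

def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Spearman's correction: r_true ~ r_obs / sqrt(rel_x * rel_y).

    Reliabilities are the fraction of variance in each measurement that
    is signal rather than noise, and must lie in (0, 1].
    """
    for rel in (reliability_x, reliability_y):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliabilities must be in (0, 1]")
    return r_observed / sqrt(reliability_x * reliability_y)

# Hypothetical inputs: observed within-gene correlation and reliabilities.
r_corrected = correct_for_attenuation(0.3, reliability_x=0.9, reliability_y=0.4)
explained = r_corrected ** 2  # variance fraction attributed to mRNA
print(f"corrected r: {r_corrected:.2f}, explained variance: {explained:.2f}")
```

The correction always moves the estimate away from zero, so noisy protein measurements make mRNA look like a worse predictor than it is. This is why the reliability estimates matter before attributing unexplained variance to post-transcriptional regulation.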
Conclusions and Recommendations
The study concludes that post-transcriptional regulation is a significant contributor to shaping tissue-type-specific proteomes in humans.
- The results caution researchers against estimating protein fold-changes from mRNA fold-changes between different cell types.
- It is critical to avoid conflating different sources of variability; the high correlation between absolute protein and mRNA levels should not be used to infer the degree of post-transcriptional regulation of a specific protein across different tissues.
- The work underscores the fact that cell-type differentiation and commitment involve substantial post-transcriptional remodeling.