What is a Differentially Expressed Gene?

RNA-seq
differential gene expression
bayesian statistics
reproducibility
systems biology
  • Central Problem: Traditional binary classification of Differentially Expressed Genes (DEGs) using p-value and \(\log_{2}\) fold change thresholds suffers from reproducibility issues due to biological and technical variation, especially with low replicate numbers.
  • Findings: The use of rigid \(\log_{2}\) fold change cut-offs leads to false negatives (missing small but significant changes), and inherent gene variability, particularly in small sample sets, leads to false positives (misinterpreting noise as a change).
  • Recommendation: The authors advocate for rank-based methods grounded in Bayesian statistics (using Bayes factors) over traditional tools like DESeq2 and edgeR to reduce reliance on arbitrary thresholds and better communicate the uncertainty of differential expression.
Published

23 January 2026

PubMed: Not Indexed (bioRxiv)

DOI: 10.1101/2025.01.31.635902

Overview generated by: Gemini 2.5 Flash, 28/11/2025

Core Problem: Reproducibility in Differential Gene Expression

The concept of ‘Differentially Expressed Genes’ (DEGs) is fundamental to RNA-Seq studies, yet their identification is plagued by reproducibility issues. This is primarily attributed to the inherent biological and technical variation in the data, which is poorly captured when only a small number of replicates is available. When rigid thresholds for p-values and \(\log_{2}\) fold changes are applied on top of this variability, the resulting gene lists can differ between experiments and give an incomplete description of the data.

Methods: Comparing Binary vs. Rank-Based Classification

The study uses a published yeast RNA-Seq dataset with more than 40 replicates per condition (42 wild-type and 44 SNF2-mutant) to compare traditional DEG identification methods with a new rank-based Bayesian framework called bayexpress.

  • Traditional Methods: DESeq2 and edgeR, which classify DEGs using a binary approach based on a p-value cutoff and an absolute \(\log_{2}\) fold change threshold (e.g., \(|\log_{2} \text{fold change}| > 1\)).
  • Bayesian Method (bayexpress): Uses Bayes factors (\(BF_{21}\) for evidence of change and \(BF_{k1}\) for consistency across replicates) to rank genes based on statistical evidence, thereby communicating uncertainty and reducing reliance on arbitrary thresholds.
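
The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `GeneResult` record, the function names, and all numeric values are hypothetical, and the `log_bf` field simply stands in for whatever evidence score a Bayesian method would report.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    name: str
    padj: float    # adjusted p-value, as a tool such as DESeq2 or edgeR would report
    log2fc: float  # estimated log2 fold change
    log_bf: float  # hypothetical log Bayes factor (evidence for a change)


def binary_degs(results, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Binary call: a gene counts as a DEG only if it passes both thresholds."""
    return [r.name for r in results
            if r.padj < padj_cutoff and abs(r.log2fc) > lfc_cutoff]


def ranked_by_evidence(results):
    """Rank-based view: keep every gene, ordered by evidence (strongest first)."""
    ordered = sorted(results, key=lambda r: r.log_bf, reverse=True)
    return [r.name for r in ordered]


# Made-up values, for illustration only.
results = [
    GeneResult("geneA", padj=1e-8, log2fc=2.3, log_bf=40.0),
    GeneResult("geneB", padj=1e-6, log2fc=0.6, log_bf=25.0),
    GeneResult("geneC", padj=0.20, log2fc=1.4, log_bf=-1.0),
]

print(binary_degs(results))
print(ranked_by_evidence(results))
```

In this toy input the binary filter returns only `geneA`, while the ranking keeps `geneB` in second place, so a reader can still see that it carries substantial evidence even though its fold change is below the cutoff.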

Key Results

1. Fold Change Cut-offs Lead to False Negatives

The practice of applying an absolute \(\log_{2}\) fold change cut-off is shown to increase the number of potential false negatives. Genes with a small fold change can still show strong statistical evidence for a change (a high \(BF_{21}\)) and may have significant biological consequences, yet binary thresholding excludes them from the analysis. The authors emphasize that the biological impact of an expression change is not necessarily correlated with its magnitude.
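
How a small fold change can still carry strong evidence is easiest to see with a worked calculation. The sketch below computes a generic Bayes factor comparing "one shared expression probability" (model 1) against "a separate probability per condition" (model 2) under a binomial likelihood with a Beta prior. This overview does not give the bayexpress formula, so the model choice, the uniform prior (`a = b = 1`), the function names, and the read counts are all assumptions made for illustration.

```python
from math import lgamma, log2


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_bf21(n1, N1, n2, N2, a=1.0, b=1.0):
    """Natural-log Bayes factor of model 2 (two probabilities) over model 1 (one).

    n1, n2: reads mapped to the gene in each condition
    N1, N2: total reads in each condition
    The binomial coefficients are identical in both models and cancel.
    """
    log_m2 = (log_beta(a + n1, b + N1 - n1)
              + log_beta(a + n2, b + N2 - n2)
              - 2 * log_beta(a, b))
    log_m1 = (log_beta(a + n1 + n2, b + N1 + N2 - n1 - n2)
              - log_beta(a, b))
    return log_m2 - log_m1


def log2_fold_change(n1, N1, n2, N2):
    """log2 ratio of the observed expression proportions (condition 2 over 1)."""
    return log2((n2 / N2) / (n1 / N1))


# Hypothetical counts: (n1, N1, n2, N2).
high_count = (10_000, 10_000_000, 14_000, 10_000_000)
low_count = (5, 10_000_000, 15, 10_000_000)

for label, counts in [("high-count gene", high_count), ("low-count gene", low_count)]:
    lfc = log2_fold_change(*counts)
    evidence = log_bf21(*counts)
    passes = abs(lfc) > 1.0
    print(f"{label}: log2FC={lfc:.2f}, log BF21={evidence:.1f}, passes cutoff: {passes}")
```

Under these assumptions the high-count gene fails the \(|\log_{2} \text{fold change}| > 1\) cutoff despite very strong evidence for a change, whereas the low-count gene passes the cutoff with far weaker evidence.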

2. Variability Masquerades as Differential Expression (False Positives)

A control experiment comparing wild-type vs. wild-type samples demonstrated that inherent variability in gene expression can be wrongly identified as differential expression, generating false positives.

  • This effect is particularly pronounced in datasets with a limited number of replicates (e.g., 3 replicates).
  • The analysis identified a set of “consistently inconsistent” genes (those with high variability across replicates, indicated by a positive \(BF_{k1}\)), which are most prone to this misclassification in studies that lack sufficient replication to robustly assess natural variability.
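
A toy simulation shows why replicate number matters in such a null comparison. It is not the paper's analysis: it assumes each gene's \(\log_{2}\) expression is normally distributed around the same mean in both groups, looks only at the fold change criterion (no p-value), and uses a hypothetical function name and made-up parameters.

```python
import random


def null_false_calls(n_genes, n_reps, sd_log2, lfc_cutoff=1.0, seed=0):
    """Count genes whose |log2 fold change| exceeds the cutoff although both
    groups are drawn from the same distribution (no true difference)."""
    rng = random.Random(seed)
    calls = 0
    for _ in range(n_genes):
        # Replicate-level log2 expression, identical distribution in both groups.
        group_a = [rng.gauss(0.0, sd_log2) for _ in range(n_reps)]
        group_b = [rng.gauss(0.0, sd_log2) for _ in range(n_reps)]
        log2fc = sum(group_b) / n_reps - sum(group_a) / n_reps
        if abs(log2fc) > lfc_cutoff:
            calls += 1
    return calls


# Highly variable genes (sd of 1 log2 unit) vs. stable genes (sd of 0.1).
for sd in (1.0, 0.1):
    for reps in (3, 40):
        n = null_false_calls(n_genes=2000, n_reps=reps, sd_log2=sd)
        print(f"sd={sd}, replicates={reps}: {n} of 2000 genes exceed the cutoff")
```

In this model, highly variable genes frequently cross the fold change cutoff by chance with 3 replicates and almost never with 40, while stable genes do not cross it in either setting.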

Conclusions and Recommendations

The findings challenge the current widespread reliance on binary classification criteria for DEG analysis. The study highlights that the choice of thresholds and the number of replicates are major factors contributing to irreproducible results and overlooked genes. The authors advocate for a shift toward rank-based methods grounded in Bayesian statistics (such as the Bayes factor framework bayexpress) to communicate uncertainty and mitigate the limitations of fixed binary thresholds, especially in scenarios with high data variability or small sample sizes.