A major application of RNA-Seq is differential gene expression analysis. Many tools exist for identifying differentially expressed genes in datasets with biological and/or technical replicates. However, due to the relatively high cost, many RNA-Seq experiments have no, or very few, replicates.
IsoDE is a software package for differential gene expression analysis of RNA-Seq data both with and without replicates. IsoDE is based on bootstrapping, which provides a principled way to test for differential expression using fold changes computed from FPKM estimates that are themselves obtained by resampling the original read alignments. This strategy can be used in conjunction with any method for estimating individual gene expression levels from aligned RNA-Seq reads; the IsoDE implementation relies on the IsoEM algorithm, a scalable expectation-maximization algorithm that takes gene isoforms into account in the inference process to ensure accurate length normalization. Experiments on MAQC RNA-Seq datasets without replicates show that IsoDE has consistently high accuracy as defined by the qPCR ground truth, frequently outperforming existing methods such as Fisher’s exact test, edgeR, GFOLD, and Cuffdiff, particularly for low-coverage data and at lower fold change thresholds. In experiments on MCF-7 RNA-Seq datasets with up to 7 replicates, IsoDE also achieved high accuracy that varies smoothly with the number of replicates and is relatively uniform across the entire range of gene expression levels.
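The bootstrapping idea described above can be illustrated with a minimal Python sketch. It is not the IsoDE implementation: it assumes each read maps to exactly one gene and substitutes a simple count-based FPKM for IsoEM, and the helper names (`fpkm`, `bootstrap_fpkms`, `bootstrap_support`), the all-pairs comparison of bootstrap samples, the pseudocount, and the toy read sets are all hypothetical choices made for illustration only.

```python
import random
from collections import Counter


def fpkm(read_genes, gene_lengths):
    """Count-based FPKM (a stand-in for IsoEM): reads per kilobase of
    gene length per million mapped reads."""
    counts = Counter(read_genes)
    total = len(read_genes)
    return {g: counts.get(g, 0) * 1e9 / (length * total)
            for g, length in gene_lengths.items()}


def bootstrap_fpkms(read_genes, gene_lengths, n_boot, rng):
    """Resample the reads with replacement n_boot times and
    re-estimate FPKM values from each bootstrap sample."""
    n = len(read_genes)
    return [fpkm(rng.choices(read_genes, k=n), gene_lengths)
            for _ in range(n_boot)]


def bootstrap_support(boots_a, boots_b, gene, min_fold, pseudo=1e-3):
    """Fraction of bootstrap pairs whose fold change reaches min_fold,
    taken in whichever direction (up or down) has more support.
    The all-pairs scheme and the pseudocount are assumptions."""
    up = down = 0
    for a in boots_a:
        for b in boots_b:
            x, y = a[gene] + pseudo, b[gene] + pseudo
            if x / y >= min_fold:
                up += 1
            elif y / x >= min_fold:
                down += 1
    return max(up, down) / (len(boots_a) * len(boots_b))


# Toy data (hypothetical): one gene label per aligned read.
gene_lengths = {"G1": 1000, "G2": 1000, "G3": 2000}
reads_a = ["G1"] * 300 + ["G2"] * 300 + ["G3"] * 400
reads_b = ["G1"] * 50 + ["G2"] * 350 + ["G3"] * 600

rng = random.Random(0)
boots_a = bootstrap_fpkms(reads_a, gene_lengths, 20, rng)
boots_b = bootstrap_fpkms(reads_b, gene_lengths, 20, rng)

# Bootstrap support for a fold change of at least 2 in each gene.
support = {g: bootstrap_support(boots_a, boots_b, g, min_fold=2.0)
           for g in gene_lengths}
for gene, value in support.items():
    print(gene, value)
```

In this sketch a gene would be called differentially expressed when its bootstrap support exceeds a user-chosen cutoff; any expression estimator could be substituted for the `fpkm` function, which mirrors the point that the strategy is independent of the estimation method.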
IsoDE source code
The software is written in Java, so it can be run on any platform with a Java Virtual Machine. See the README.TXT file for installation instructions. The source code is distributed with the installation package.
Sample Ion Torrent MAQC datasets and scripts for performing differential gene expression analysis on these datasets are available at MAQC-samples.zip.
Acknowledgment and Disclaimer
This material is based upon work supported in part by the Agriculture and Food Research Initiative Competitive Grant No. 2011-67016-30331 from the USDA National Institute of Food and Agriculture, awards IIS-0916401 and IIS-0916948 from NSF, a Collaborative Research Grant from Life Technologies, and the Molecular Basis of Disease Area of Focus at Georgia State University. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.