The DGE-EM package can be used to infer gene expression levels from 3′-tag Digital Gene Expression (DGE) data. DGE-EM uses a novel expectation-maximization algorithm that takes into account alternative splicing isoforms and tags that map to multiple locations in the genome, and corrects for incomplete digestion and sequencing errors. Experimental results on real DGE data generated from reference RNA samples show that our algorithm outperforms commonly used estimation methods based on unique tag counting, as well as estimates obtained from RNA-Seq data for the same samples. Results of a comprehensive simulation study assessing the effect of various experimental parameters suggest that further improvements in estimation accuracy could be achieved by optimizing protocol parameters such as the anchoring enzymes and digestion probability.
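To illustrate the general idea of using expectation-maximization to resolve tags that are compatible with more than one isoform, here is a minimal sketch in Java. It is not the DGE-EM implementation: the class name TagEmSketch, the method estimate, and the input layout are hypothetical, and the weights are assumed to be fixed, precomputed probabilities that an isoform emits a given tag. In a fuller model such weights might encode digestion probability and sequencing error, but that is not modeled here.

```java
import java.util.Arrays;

// Illustrative sketch only; names and input layout are assumptions,
// not part of the DGE-EM package.
class TagEmSketch {

    /**
     * Estimates relative isoform frequencies from tag counts.
     *
     * counts[t]     number of times tag t was observed
     * weights[t][i] assumed probability that isoform i emits tag t
     *               (0 if tag t is not compatible with isoform i)
     */
    static double[] estimate(int[] counts, double[][] weights,
                             int numIsoforms, int iterations) {
        double[] freq = new double[numIsoforms];
        Arrays.fill(freq, 1.0 / numIsoforms); // uniform starting point

        for (int it = 0; it < iterations; it++) {
            // E-step: split each tag's count among compatible isoforms
            // in proportion to freq[i] * weights[t][i].
            double[] expected = new double[numIsoforms];
            for (int t = 0; t < counts.length; t++) {
                double norm = 0.0;
                for (int i = 0; i < numIsoforms; i++) {
                    norm += freq[i] * weights[t][i];
                }
                if (norm == 0.0) {
                    continue; // tag cannot be explained by any isoform
                }
                for (int i = 0; i < numIsoforms; i++) {
                    expected[i] += counts[t] * freq[i] * weights[t][i] / norm;
                }
            }

            // M-step: renormalize expected counts into frequencies.
            double total = 0.0;
            for (double e : expected) {
                total += e;
            }
            if (total == 0.0) {
                break; // no assignable tags; keep current estimate
            }
            for (int i = 0; i < numIsoforms; i++) {
                freq[i] = expected[i] / total;
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        // Tag 0 is unique to isoform 0, tag 1 is unique to isoform 1,
        // and tag 2 is shared by both (a multi-mapped tag).
        int[] counts = {30, 10, 40};
        double[][] weights = {
            {0.5, 0.0},
            {0.0, 0.5},
            {0.5, 0.5}
        };
        double[] freq = estimate(counts, weights, 2, 200);
        System.out.println(Arrays.toString(freq));
    }
}
```

In this toy input, the shared tag's 40 counts are redistributed on each iteration according to the current frequency estimates, so the estimates approach 0.75 and 0.25, the ratio implied by the unique tags. Simply discarding the ambiguous tag would give the same ratio here, but would throw away half of the data. The sketch assumes that each isoform's weights sum to 1 over all tags.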
DGE-EM source code
The software is written partly in Java and partly in Scala, so it can be run on any platform with a Java Virtual Machine. The installer is a JAR file created with IzPack. See the README.TXT file for installation instructions. The source code is distributed with the installation package.
Acknowledgment and Disclaimer
This material is based upon work supported in part by the National Science Foundation under Grants No. IIS-0546457 and IIS-0916948. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.