Bioinformatics Tools for Viral Quasispecies Reconstruction from Next-Generation Sequencing Data and Vaccine Optimization
Funding agency: National Institute of Food and Agriculture
Grant #: 2011-67016-30331
Amount: $420,000
PI: Ion I. Mandoiu
Co-PIs: Mazhar Khan, Rachel O’Neill, Alex Zelikovsky
Period: 4/2011-3/2014
Abstract:
Viral infections place a significant burden on animal health, reducing yields and increasing production costs due to expensive control programs. Vaccination is a vital part of such control programs; however, its effectiveness is reduced by the rapid evolution of virus variants, called viral quasispecies, in animal hosts. Despite experimental evidence that viral quasispecies play a major role in disease progression and the emergence of drug- or vaccine-resistant variants, the practical implications of quasispecies evolution remain poorly understood because of the difficulty of characterizing sequence variants and their frequencies in infected animals.
The overarching goal of this project is to develop computational methods enabling comprehensive characterization of genomic diversity of viral quasispecies infecting animal populations based on next-generation sequencing data. Recent advances in sequencing technologies have made it possible to generate millions of short sequence fragments from complex viral samples. However, identification of viral quasispecies from such data requires the development of novel computer algorithms that can accurately piece together the short fragments generated by next-generation sequencing machines. The specific aims of the project are to (1) develop and validate bioinformatics tools for accurate reconstruction of viral quasispecies sequences and their frequencies from next-generation sequencing data; (2) measure quasispecies persistence and evolution in commercial layer flocks following administration of modified live Infectious Bronchitis Virus (IBV) vaccine using; and (3) develop predictive models and algorithms for optimizing strategies of administration of modified live IBV vaccine to commercial layer flocks.
Expected outcomes of the project include the development of a comprehensive algorithmic toolkit for quasispecies sequence reconstruction and frequency estimation from next-generation sequencing data, and user-friendly web-based bioinformatics tools made available free of charge to the research community. We will also conduct four longitudinal sequencing studies of pooled tracheal swab samples collected from layer flocks that are administered attenuated live IBV vaccine. Sequencing will be performed using the 454 and SOLiD platforms to take advantage of their complementary strengths in terms of read length and coverage depth. Sequencing data will be made publicly available to enable further analysis and methods development by other groups. Anticipated benefits from the successful completion of the project include improved diagnosis and monitoring of viral outbreaks in animal populations, reduced costs of vaccination, improved animal health, and improved yield of animal production.
Software Packages
- ViSpA: viral spectrum assembler from shotgun NGS reads
- VirA: viral quasispecies assembly from overlapping amplicon reads
- kGEM: maximum likelihood k-clustering of viral sequences
- Combinatorial pooling design for sequencing of heterogeneous viral populations
- IsoEM: Inferring Alternative Splicing Isoform Frequencies from High-Throughput RNA-Seq Data
- IsoDE: Bootstrapping-based differential gene expression analysis for RNA-Seq data with and without replicates
- MaLTA: transcriptome assembly and quantification from Ion Torrent RNA-Seq reads
- SILP2: ILP-based Maximum Likelihood Genome Scaffolding
- Epi-Seq: Bioinformatics pipeline for predicting tumor specific epitopes from RNA-Seq data