Bioinformatics Tools Enabling Large-Scale DNA Barcoding
Funding agency: National Science Foundation, Division of Biological Infrastructure
Award #: DBI-0543365
PI: Ion I. Mandoiu, Co-PIs: Bhaskar DasGupta and Mazhar Khan
Advances in high-throughput genomic technologies promise to revolutionize 21st-century taxonomy by making possible the automated identification, based on rapid DNA analysis, of the estimated 20 million species living on Earth. Fulfilling this promise raises the challenge of developing a standardized analysis platform that provides comprehensive species identification at very low cost per experiment. The goal of this research project is to develop novel bioinformatics tools and design methodologies that enable large-scale species identification based on innovative uses of universal DNA arrays. The work will lead to (1) efficient combinatorial algorithms for several difficult optimization problems arising in the design of large-scale identification assays, (2) optimization of two proven genomic technologies (multiplex PCR and universal DNA tag arrays), (3) the development of a novel high-throughput genomic assay, (4) methodologies for integrating the above optimizations into a coherent flow for designing scalable species identification assays, and (5) robust open-source bioinformatics tools based on the developed optimization algorithms, released free of charge for research, academic, and non-profit purposes. By enabling large-scale genomic-based species identification, the research will have far-reaching impacts in areas ranging from medicine and agriculture to biodiversity research and biodefense. Additional broader impacts include improving scientific infrastructure through the timely dissemination of open-source software tools and benchmark datasets as a platform for future research, course and curriculum development, mentoring of undergraduate and graduate students, and promoting diversity in research activities.