WCSE 2017
ISBN: 978-981-11-3671-9 DOI: 10.18178/wcse.2017.06.238

Next-generation Sequencing Generated Discrepancy in Abundance Characterization of Complex Microbial Community Compositions: an Error of Bioinformatics Pipeline

Huimin Zhang, Hongkui He, Runjie Cao, Huizhi Tang, Zhizhou Zhang, Anjun Li, Jie Jiang

Abstract— Next-generation sequencing of metagenomes produces a large amount of valuable biological and biomedical data, but the data still contain errors. For example, chimeras originate mainly from biological reactions, whereas taxonomic classification errors easily arise from bioinformatics pipelines. In this study, the microbial composition of the starter (Daqu) of Chinese GujingTribute liquor, especially the dominant species or operational taxonomic units (OTUs), was determined by two approaches: sequencing of a near-full-length ribosomal gene library (16S rDNA plus the internal transcribed spacer (ITS)), and next-generation sequencing of the 16S rDNA V4-V5 region. The two approaches gave discrepant results for both prokaryotic and eukaryotic microbes. In particular, the results for prokaryotic microbes differed in that (1) the most dominant species or OTU belonged to different phyla, and (2) the 20 most dominant species or OTUs overlapped only partially. Further investigation indicated that the bioinformatics analysis pipeline itself was sometimes an important source of the discrepancy.

Index Terms— next generation sequencing, bioinformatics pipeline, metagenome, discrepancy

Huimin Zhang, Zhizhou Zhang, Jie Jiang
School of Marine Science and Technology, Marine anti-fouling Engineering Technology Center of Shandong Province, Harbin Institute of Technology, CHINA
Hongkui He, Runjie Cao, Huizhi Tang, Anjun Li
The Anhui GuJingTribute Liquor Ltd, CHINA


Cite: Huimin Zhang, Hongkui He, Runjie Cao, Huizhi Tang, Zhizhou Zhang, Anjun Li, Jie Jiang, "Next-generation Sequencing Generated Discrepancy in Abundance Characterization of Complex Microbial Community Compositions: an Error of Bioinformatics Pipeline," Proceedings of 2017 the 7th International Workshop on Computer Science and Engineering, pp. 1373-1378, Beijing, 25-27 June, 2017.