Bioinformatics Forum: Deriving Enzymatic and Taxonomic Signatures of Metagenomes from Short Read Data

Prof. David Horn (School of Physics & Astronomy, Tel Aviv University)
Wednesday, 2.6.2010, 13:30
Taub 701

We propose a method for deriving enzymatic signatures from short-read (SR) metagenomic data of unknown species. Each short read is converted to six pseudo-peptide candidates, which we search for occurrences of Specific Peptides (SPs). SPs are peptides that are indicative of enzymatic function as defined by the Enzyme Commission (EC) nomenclature. By counting SP hits, we associate short reads with specific EC categories. The putative peptide counts can then be converted to estimates of the number of enzymes associated with each EC category in the studied metagenome, thus defining its enzymatic spectrum without the need to assemble the short reads. The method is developed and tested on 22 bacteria for which good EC annotations exist in NCBI. Enzymatic signatures are derived for 3 metagenomes, and their functional profiles are explored. We extend the SP methodology to taxon-specific SPs (TSPs), which allow us to also estimate taxonomic features of metagenomic data from short reads. Using recent Swiss-Prot data, we obtain TSPs for different phyla of bacteria and for different classes of proteobacteria. These allow us to analyze the leading-taxa content of 4 different metagenomic datasets.
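
As a rough illustration of the read-to-EC step described above, the following Python sketch translates each read in six frames and counts reads that contain a known SP. It is not the speakers' implementation. It assumes that the "six pseudo-peptide candidates" are the three forward and three reverse-complement reading frames under the standard genetic code, that an SP hit is an exact substring match, and that each read is counted at most once per EC category. The function names, the toy SP table, and the labels "EC-A" and "EC-B" are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (non-ACGT left as-is)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_peptides(read):
    """Assumed reading of 'six pseudo-peptides': 3 forward + 3 reverse frames."""
    read = read.upper()
    strands = (read, reverse_complement(read))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def count_ec_hits(reads, sp_to_ec):
    """Count reads per EC label; a read counts once per EC (an assumption)."""
    counts = Counter()
    for read in reads:
        ecs_in_read = set()
        for peptide in six_frame_peptides(read):
            for sp, ec in sp_to_ec.items():
                if sp in peptide:  # exact substring match, assumed
                    ecs_in_read.add(ec)
        counts.update(ecs_in_read)
    return counts


# Hypothetical toy data: two SPs mapped to placeholder EC labels.
toy_sps = {"MAKW": "EC-A", "LPLGH": "EC-B"}
toy_reads = ["ATGGCCAAGTGGTAA", "GGGGGGGGGGGGGGG"]
toy_counts = count_ec_hits(toy_reads, toy_sps)
```

Converting such raw counts into estimated enzyme numbers, as the abstract describes, would need a calibration step that this sketch does not attempt.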

Joint work with Uri Weingart, Erez Persi and Uri Gophna.

