Bacteria can be identified, and quantitative information derived, by sequencing variable regions of the 16S rRNA gene. Large parts of the 16S rRNA gene are conserved across bacteria; however, these conserved sequences are interrupted by 9 variable regions, which can differ greatly between species. The sequence of the variable regions can therefore identify the organism from which the genetic material was extracted.
The advantage of this method is that, with little sequencing effort, the bacterial community colonizing a habitat can be determined without the need for cultivation. Its limitations are that it detects only bacteria, gives no information on other microorganisms, and does not permit functional analysis.
Next-generation sequencing methods are used to determine the DNA sequences. In contrast to Sanger sequencing, these high-throughput methods can sequence large amounts of DNA simultaneously; many samples can therefore be examined in parallel, reducing costs and turnaround time.
All samples received are inspected to ensure that quality criteria are met. DNA is isolated from the samples, and the selected variable region(s) are amplified and attached to short DNA sequences necessary for sample identification and sequencing. Next-generation sequencing produces millions of reads: low-quality data are eliminated, while high-quality reads are processed and compared to sequence databases. Taxonomic binning and relative quantification are then performed.
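Relative quantification can be illustrated with a minimal sketch. It assumes that each high-quality read has already been assigned to a taxon; the function name `relative_abundance` and the example read assignments are hypothetical and serve only to show the arithmetic, not the actual pipeline.

```python
from collections import Counter


def relative_abundance(assignments):
    """Return each taxon's share of all assigned reads (values sum to 1)."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        # No reads were assigned, so there is nothing to quantify.
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical taxon assignments for four reads from one sample.
reads = ["Bacteroidetes", "Firmicutes", "Firmicutes", "Actinobacteria"]
abundance = relative_abundance(reads)

for taxon, share in sorted(abundance.items()):
    print(f"{taxon}: {share:.0%}")
```

Because the values are proportions of the sequenced reads, they describe the composition of a sample rather than absolute cell numbers.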
Comparison of the bacterial population colonizing the intestine of lean and obese twins. The proportion of Bacteroidetes was lower and the relative abundance of Actinobacteria higher in obese individuals. (Turnbaugh et al. (2009) Nature)
Development of the intestinal microbiota: Newborns are mainly colonized by Bifidobacteria, and the variation between children is greater than between adults. An adult-like metagenome is established within the first 3 years of life. (Yatsunenko et al. (2012) Nature)
Identification of bacteria associated with colorectal cancer. (Geng et al. (2014) Gut Pathogens)
Identification of pathogens in cystic fibrosis patients that could not be detected using classical methods. (Salipante et al. (2013) PLOS ONE)
Upon receipt of a sample, the first step is quality control. It is important to verify that the samples were sufficiently cooled and not damaged during transport. Project-specific sample requirements are also checked. DNA is isolated using an established extraction protocol. After validation of DNA quality, it is prepared for sequencing. Amplification and the attachment of barcode and linker sequences are performed according to published methods (Illumina, Caporaso et al.), and the DNA sequence is determined using the Illumina MiSeq. After demultiplexing the data, paired-end reads are merged. Quality control is performed using PrinSeq and the reads are compared to the NCBI Taxonomy. The lowest common ancestor (LCA) algorithm is used for taxonomic binning, and in the last step, a report of the results is created.
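The idea behind lowest common ancestor (LCA) binning can be sketched as follows: when a read matches several reference taxa, it is assigned to the deepest node of the taxonomy that all of its hits share. The sketch below is a simplified illustration, not the pipeline's implementation. The `PARENT` map is a toy taxonomy with intermediate ranks omitted, and the function names `lineage` and `lowest_common_ancestor` are hypothetical.

```python
# Toy child -> parent map (assumption: a tiny, simplified excerpt of a taxonomy).
PARENT = {
    "Bacteroides fragilis": "Bacteroides",
    "Bacteroides vulgatus": "Bacteroides",
    "Bacteroides": "Bacteroidetes",
    "Bifidobacterium longum": "Bifidobacterium",
    "Bifidobacterium": "Actinobacteria",
    "Bacteroidetes": "Bacteria",
    "Actinobacteria": "Bacteria",
}


def lineage(taxon, parent):
    """Return the path from the root down to the given taxon."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path[::-1]


def lowest_common_ancestor(taxa, parent):
    """Return the deepest taxon shared by the lineages of all given taxa."""
    if not taxa:
        raise ValueError("at least one taxon is required")
    common = lineage(taxa[0], parent)
    for taxon in taxa[1:]:
        other = lineage(taxon, parent)
        shared = 0
        # Count how many leading nodes the two root-first lineages share.
        for a, b in zip(common, other):
            if a != b:
                break
            shared += 1
        common = common[:shared]
    # An empty list means the taxa do not share a root in this map.
    return common[-1] if common else None


same_genus = lowest_common_ancestor(
    ["Bacteroides fragilis", "Bacteroides vulgatus"], PARENT
)
different_phyla = lowest_common_ancestor(
    ["Bacteroides fragilis", "Bifidobacterium longum"], PARENT
)
print(same_genus)
print(different_phyla)
```

A read whose hits all fall within one genus is binned at the genus level, whereas a read with hits in different phyla can only be assigned to their shared higher rank. Ambiguous reads are thus reported conservatively rather than being forced onto a single species.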