Algorithms in Bioinformatics

MEGAN - Metagenome Analysis Software



MEGAN splash screen

by Daniel Huson and Stephan Schuster,
with contributions from Alexander F. Auch,
Daniel C. Richter, Suparna Mitra and Qi Ji.


Metagenomics
Laptop Analysis
Read Assignment
Publications
Download
Screen casts
Datasets
Presentations


New: Screen cast "Introduction to MEGAN2beta".

Metagenomics

Metagenomics is the study of the genomic content of a sample of organisms obtained from a common habitat using targeted or random sequencing. Goals include understanding the extent and role of microbial diversity.
Images: soil (http://soils.usda.gov), Sargasso Sea sample sites, mammoth (Poinar et al. 2006)

The taxonomical content of such a sample is usually estimated by comparison against DNA and protein sequence databases of known sequences. Most published studies employ the analysis of paired-end reads, complete sequences of environmental fosmid and BAC clones, or environmental assemblies. Emerging very-high-throughput sequencing technologies are paving the way to low-cost random shotgun approaches.

MEGAN analysis of mammoth sample

Laptop Analysis

MEGAN (“MEtaGenome ANalyzer”) is a new computer program that allows laptop analysis of large metagenomic datasets. In a preprocessing step, the set of DNA reads (or contigs) is compared against databases of known sequences using BLAST or another comparison tool. MEGAN can then be used to compute and interactively explore the taxonomical content of the dataset, employing the NCBI taxonomy to summarize and order the results.
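The preprocessing step above produces, for each read, a list of database hits. As a rough illustration of how such output might be organized before taxonomic analysis, here is a minimal sketch. It assumes the comparison tool writes the standard 12-column BLAST tabular layout (query id in column 1, subject id in column 2, bit score in column 12); the page does not say which format MEGAN reads, and the function name `hits_by_read` and the example records are hypothetical.

```python
from collections import defaultdict

# Column positions in the standard 12-column BLAST tabular layout (an assumption;
# the page does not state which output format is used).
QUERY_COL, SUBJECT_COL, BITSCORE_COL = 0, 1, 11


def hits_by_read(tabular_text):
    """Group tabular hits by read id as lists of (subject id, bit score)."""
    hits = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits[fields[QUERY_COL]].append(
            (fields[SUBJECT_COL], float(fields[BITSCORE_COL]))
        )
    return dict(hits)


# Made-up example records, not taken from any real dataset.
EXAMPLE = "\n".join([
    "read1\tprotA\t98.0\t50\t1\t0\t1\t150\t1\t50\t1e-20\t95.5",
    "read1\tprotB\t90.0\t50\t5\t0\t1\t150\t1\t50\t1e-15\t80.1",
    "read2\tprotC\t85.0\t40\t6\t0\t1\t120\t3\t42\t1e-10\t60.0",
])

grouped = hits_by_read(EXAMPLE)
print(grouped["read1"])
```

Grouping by read keeps all hits for one read together, which is the natural input for a per-read taxonomic assignment.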

Metagenomics pipeline and MEGAN

Assignment of Reads to Taxa

An LCA-based algorithm assigns reads to taxa in such a way that the taxonomical level of the assigned taxon reflects the level of conservation of the sampled sequence. The software allows dissection of large datasets without the need for assembly or the targeting of specific phylogenetic markers. It provides graphical and statistical output for the comparison of different datasets. We have successfully applied this approach to a number of datasets obtained by Sanger sequencing and sequencing-by-synthesis technology, including the Sargasso Sea dataset, a recently published metagenomic dataset sampled from a mammoth bone, and several complete microbial genomes.
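The idea behind LCA-based assignment can be sketched in a few lines: a read is placed on the lowest taxonomy node that is an ancestor of every taxon it hits, so a read matching only one species stays at species level, while a read from a conserved gene that matches many taxa moves up the tree. This is a minimal sketch under stated assumptions, not MEGAN's implementation: the `PARENT` table is a small, simplified, hypothetical stand-in for the NCBI taxonomy, and any filtering of hits by score is left out.

```python
# Hypothetical, simplified child -> parent table standing in for the NCBI taxonomy.
PARENT = {
    "Escherichia coli": "Enterobacteriaceae",
    "Salmonella enterica": "Enterobacteriaceae",
    "Enterobacteriaceae": "Gammaproteobacteria",
    "Gammaproteobacteria": "Proteobacteria",
    "Bdellovibrio bacteriovorus": "Deltaproteobacteria",
    "Deltaproteobacteria": "Proteobacteria",
    "Proteobacteria": "Bacteria",
    "Bacteria": "root",
}


def path_to_root(taxon, parent):
    """Return the list of nodes from the taxon up to the root, taxon first."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Return the lowest node that is an ancestor of (or equal to) every taxon."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("need at least one taxon")
    # Start with the first taxon's path and keep only nodes shared by all others.
    common = path_to_root(taxa[0], parent)
    for taxon in taxa[1:]:
        ancestors = set(path_to_root(taxon, parent))
        common = [node for node in common if node in ancestors]
    if not common:
        raise ValueError("taxa do not share a common ancestor")
    return common[0]  # paths run leaf to root, so the first shared node is lowest


# A read hitting one species stays at that species.
print(lowest_common_ancestor(["Escherichia coli"], PARENT))
# A read hitting two related species is assigned to their shared family.
print(lowest_common_ancestor(["Escherichia coli", "Salmonella enterica"], PARENT))
# A read hitting distant taxa is assigned higher up.
print(lowest_common_ancestor(["Escherichia coli", "Bdellovibrio bacteriovorus"], PARENT))
```

Walking each path to the root is quadratic in the worst case but easy to follow; a production tool working on the full taxonomy would likely precompute depths or use a faster LCA structure.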

MEGAN analysis of samples

Comparison and Analysis of Multiple Datasets

We are working on a new version of the program, called MEGAN2, that is aimed at facilitating comparative analyses of datasets.
As a simple example, this picture shows the analysis of two different sequencing experiments performed on E. coli and on Bdellovibrio,
using 454 sequencing. It illustrates that MEGAN analysis can distinguish between the two species, based on a BLASTX comparison against NR:

simple comparison between two different datasets


COG analysis

MEGAN2 provides tools for analysing the functional content of a metagenome:

COGs found by MEGAN in cave bear data



Publications

An example of the application of MEGAN can be found in Poinar et al. 2006, where we used an early version of our software (called GenomeTaxonomyBrowser) to analyze the taxonomical content of a collection of DNA reads sampled from a mammoth.

See the advance online publication of our paper entitled "MEGAN Analysis of Metagenomic Data" in Genome Research.

Download

Use of the program requires a license. Academic licenses are freely available to all academic users. Usage in non-academic settings requires a commercial license. Obtain a license key online.

We are currently developing MEGAN2, a new comparative version of MEGAN. Download the latest BETA version here.

(Download the original version 1.0 here.)

To find out more about the program, please take a look at the current user manual.

Screen casts

Introduction to MEGAN2beta:

Advanced features of MEGAN2beta:
coming soon...
Comparative analysis using MEGAN2beta:
coming soon...

Datasets


Here are links to datasets used in our paper "MEGAN Analysis of Metagenomic Data", which appeared in Genome Research:

Sargasso Sea Subsample 1: 10,000 Sanger reads, BLASTX-NR, MEGAN file
Sargasso Sea Subsample 2-4: 10,000 Sanger reads, BLASTX-NR, MEGAN file
Mammoth dataset: BLASTX-NR, MEGAN file
E. coli K12: 2000 SBS reads, BLASTX-NR, MEGAN file, MEGAN file without K12 hits
Bdellovibrio bacteriovorus HD100: 2000 SBS reads, BLASTX-NR, MEGAN file, MEGAN file without HD100 hits

Simulated datasets used in the paper:

E. coli K12, simulated reads ~35 bp: 5000 SBS reads, MEGAN file
E. coli K12, simulated reads ~100 bp: 5000 SBS reads, MEGAN file
E. coli K12, simulated reads ~200 bp: 5000 SBS reads, MEGAN file
E. coli K12, simulated reads ~800 bp: 5000 SBS reads, MEGAN file
B. bacteriovorus HD100, simulated reads ~35 bp: 5000 SBS reads, MEGAN file
B. bacteriovorus HD100, simulated reads ~100 bp: 5000 SBS reads, MEGAN file
B. bacteriovorus HD100, simulated reads ~200 bp: 5000 SBS reads, MEGAN file
B. bacteriovorus HD100, simulated reads ~800 bp: 5000 SBS reads, MEGAN file


Recent presentations


Methods for metagenomic analysis (U Penn, Penn State and Venter Institute, April 2008)



University of Tübingen