
MEGAN - Metagenome Analysis Software


MEGAN splash screen

 

by Daniel Huson and Stephan Schuster,
with contributions from Alexander F. Auch,
Daniel C. Richter, Suparna Mitra and Qi Ji.


 

Metagenomics

Metagenomics is the study of the genomic content of a sample of organisms obtained from a common habitat using targeted or random sequencing. Goals include understanding the extent and role of microbial diversity.

Example sample sources: soil (http://soils.usda.gov), Sargasso Sea sample sites, mammoth (Poinar et al. 2006)

 

The taxonomical content of such a sample is usually estimated by comparison against DNA and protein sequence databases of known sequences. Most published studies employ the analysis of paired-end reads, complete sequences of environmental fosmid and BAC clones, or environmental assemblies. Emerging very-high-throughput sequencing technologies are paving the way to low-cost random shotgun approaches.

 

MEGAN analysis of mammoth sample

Laptop Analysis

MEGAN (“MEtaGenome ANalyzer”) is a new computer program that allows laptop analysis of large metagenomic datasets. In a preprocessing step, the set of DNA reads (or contigs) is compared against databases of known sequences using BLAST or another comparison tool. MEGAN can then be used to compute and interactively explore the taxonomical content of the dataset, employing the NCBI taxonomy to summarize and order the results.
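As a rough illustration of the preprocessing step, the sketch below groups comparison hits by read before any taxonomic analysis. It assumes the hits are available in BLAST's 12-column tabular layout (query ID first, subject ID second, bit score last). The function name and the `min_bitscore` value are hypothetical and chosen only for illustration; this is not MEGAN's own parser.

```python
import csv
import io
from collections import defaultdict


def group_hits_by_read(tabular_text, min_bitscore=35.0):
    """Group tabular BLAST hits by read ID, keeping hits at or above min_bitscore.

    Assumes 12 tab-separated columns per line, with the query (read) ID in
    column 1, the subject ID in column 2 and the bit score in column 12.
    The threshold is an illustrative value, not a MEGAN default.
    """
    hits = defaultdict(list)
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        read_id, subject_id, bitscore = row[0], row[1], float(row[11])
        if bitscore >= min_bitscore:
            hits[read_id].append((subject_id, bitscore))
    return dict(hits)


# Made-up hits for two reads; read2 only has a weak hit and is dropped.
example = (
    "read1\tsubjA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180.0\n"
    "read1\tsubjB\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t120.0\n"
    "read2\tsubjC\t60.0\t50\t20\t0\t1\t50\t1\t50\t1.0\t20.0\n"
)
hits_by_read = group_hits_by_read(example)
```

Each read's list of retained hits is the kind of input that a taxonomic assignment step would then work from.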

 

Metagenomics pipeline and MEGAN

Assignment of Reads to Taxa

An LCA-based algorithm assigns reads to taxa in such a way that the taxonomical level of the assigned taxon reflects the level of conservation of the sampled sequence. The software allows dissection of large datasets without the need for assembly or the targeting of specific phylogenetic markers. It provides graphical and statistical output for the comparison of different datasets. We have successfully applied this approach to a number of datasets obtained by Sanger sequencing and sequencing-by-synthesis technology, including the Sargasso Sea dataset, a recently published metagenomic dataset sampled from a mammoth bone, and several complete microbial genomes.
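The idea of a lowest-common-ancestor (LCA) assignment can be sketched as follows. This is a minimal illustration under stated assumptions, not MEGAN's actual implementation: the taxonomy is a tiny hand-made tree that skips most ranks (not the real NCBI taxonomy), and the function names are hypothetical. A read whose hits all fall in one species stays at that species, while a read with hits spread across distant taxa is placed higher in the tree.

```python
def lineage(taxon, parent):
    """Return the path from the root of the tree down to `taxon`."""
    path = [taxon]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa, parent):
    """Return the deepest node shared by the lineages of all given taxa."""
    lineages = [lineage(t, parent) for t in taxa]
    lca = None
    # Walk down from the root level by level until the lineages diverge.
    for nodes in zip(*lineages):
        if len(set(nodes)) != 1:
            break
        lca = nodes[0]
    return lca


# Toy taxonomy (child -> parent); heavily simplified for illustration only.
parent = {
    "root": None,
    "Bacteria": "root",
    "Enterobacteriaceae": "Bacteria",
    "Escherichia coli": "Enterobacteriaceae",
    "Salmonella enterica": "Enterobacteriaceae",
    "Bacillus subtilis": "Bacteria",
}

# A read hitting only E. coli is assigned to the species itself.
specific = lowest_common_ancestor(["Escherichia coli"], parent)
# A conserved read hitting two related species moves up to their family.
family = lowest_common_ancestor(["Escherichia coli", "Salmonella enterica"], parent)
# A highly conserved read hitting distant taxa moves up further still.
broad = lowest_common_ancestor(["Escherichia coli", "Bacillus subtilis"], parent)
```

In this toy setting, the more widely conserved a read's sequence is, the higher its assigned node sits in the tree, which mirrors the behaviour described above.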

 

MEGAN analysis of samples

Comparison and Analysis of Multiple Datasets

The latest version of the program, MEGAN3, is aimed at facilitating comparative analyses of datasets. As a simple example, this picture shows the analysis of two different sequencing experiments performed on E. coli and on Bdellovibrio using 454 sequencing. It illustrates that MEGAN analysis can distinguish between the two species, based on a BLASTX comparison against NR:

simple comparison between two different datasets
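One simple way to put two datasets side by side is to convert per-taxon read counts into fractions of each dataset, so that samples of different sizes become comparable. The sketch below assumes each dataset is given as a list with one assigned taxon per read. The function names and the normalization are illustrative assumptions, not a description of how MEGAN3 computes its comparison view.

```python
from collections import Counter


def relative_abundance(assignments):
    """Map each taxon to the fraction of reads assigned to it."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def compare(datasets):
    """Build a taxon -> {dataset name -> fraction} table across datasets."""
    profiles = {name: relative_abundance(a) for name, a in datasets.items()}
    taxa = sorted(set().union(*profiles.values()))
    return {
        taxon: {name: profiles[name].get(taxon, 0.0) for name in profiles}
        for taxon in taxa
    }


# Made-up assignments for two small datasets of different sizes.
datasets = {
    "sample_A": ["Escherichia coli"] * 3 + ["Bacteria"],
    "sample_B": ["Bdellovibrio", "Bacteria"],
}
table = compare(datasets)
```

A taxon that dominates one column of the resulting table and is absent from the other indicates that the two samples differ in that taxon.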


COG analysis

MEGAN3 provides tools for analysing the functional content of a metagenome:

COGs found by MEGAN in cave bear data

Publications

An example of the application of MEGAN can be found in Poinar et al. 2006, where we used an early version of our software to analyze the taxonomical content of a collection of DNA reads sampled from a mammoth.

 

 See the online publication of our paper entitled "MEGAN Analysis of Metagenomic Data" in Genome Research.

Download

Use of the program requires a license. Academic licenses are freely available to all academic users. Usage in non-academic settings requires a commercial license. Obtain a license key online.

 

NEW! MEGAN 3 supports comparison of multiple datasets and uses a new file format, RMA, that makes it possible to process BLAST files that are up to 1 TB in size. Download the latest version here.

 

(Download the original version 1.0 here.)

 

To find out more about the program, please take a look at the current user manual.

Old datasets

 

Here are links to datasets used in our paper "MEGAN Analysis of Metagenomic Data", which appeared in Genome Research. Please note that these datasets are for use with MEGAN 1.0. We will soon replace them with new files compatible with MEGAN 3.

Sargasso Sea Subsample 1: 10,000 Sanger reads; BLASTX-NR; MEGAN file
Sargasso Sea Subsample 2-4: 10,000 Sanger reads; BLASTX-NR; MEGAN file
Mammoth dataset: BLASTX-NR; MEGAN file
E. coli K12: 2000 SBS reads; BLASTX-NR; MEGAN file; MEGAN file without K12 hits
Bdellovibrio bacteriovorus HD100: 2000 SBS reads; BLASTX-NR; MEGAN file; MEGAN file without HD100 hits

 
Simulated datasets used in the paper:

E. coli K12, simulated reads ~35 bp: 5000 SBS reads; MEGAN file
E. coli K12, simulated reads ~100 bp: 5000 SBS reads; MEGAN file
E. coli K12, simulated reads ~200 bp: 5000 SBS reads; MEGAN file
E. coli K12, simulated reads ~800 bp: 5000 SBS reads; MEGAN file
B. bacteriovorus HD100, simulated reads ~35 bp: 5000 SBS reads; MEGAN file
B. bacteriovorus HD100, simulated reads ~100 bp: 5000 SBS reads; MEGAN file
B. bacteriovorus HD100, simulated reads ~200 bp: 5000 SBS reads; MEGAN file
B. bacteriovorus HD100, simulated reads ~800 bp: 5000 SBS reads; MEGAN file

 

Recent presentations


Methods for metagenomic analysis (U Penn, Penn State and Venter Institute, April 2008)

 
