
MEGAN - Metagenome Analysis Software

Software for analyzing metagenomes.

Download

New: MEGAN 3 supports comparison of multiple datasets and uses a new file format, RMA, that makes it possible to process and interactively explore BLAST files up to 1 TB in size.

 

MEGAN splash screen

 

by Daniel Huson and Stephan Schuster,
with contributions from Alexander F. Auch,
Daniel C. Richter, Suparna Mitra and Ji Qi.


 

Metagenomics

Metagenomics is the study of the genomic content of a sample of organisms obtained from a common habitat using targeted or random sequencing. Goals include understanding the extent and role of microbial diversity.

Soil, Sargasso Sea sample sites
Mammoth
http://soils.usda.gov
Poinar et al 2006

 

The taxonomical content of such a sample is usually estimated by comparison against DNA and protein sequence databases of known sequences. Most published studies employ the analysis of paired-end reads, complete sequences of environmental fosmid and BAC clones, or environmental assemblies. Emerging very-high-throughput sequencing technologies are paving the way to low-cost random shotgun approaches.

 

MEGAN analysis of mammoth sample

Laptop Analysis

MEGAN (“MEtaGenome ANalyzer”) is a new computer program that allows laptop analysis of large metagenomic datasets. In a preprocessing step, the set of DNA reads (or contigs) is compared against databases of known sequences using BLAST or another comparison tool. MEGAN can then be used to compute and interactively explore the taxonomical content of the dataset, employing the NCBI taxonomy to summarize and order the results.

 

Metagenomics pipeline and MEGAN

Assignment of Reads to Taxa

An LCA-based (lowest common ancestor) algorithm assigns reads to taxa in such a way that the taxonomical level of the assigned taxon reflects the level of conservation of the sampled sequence. The software allows dissection of large datasets without the need for assembly or the targeting of specific phylogenetic markers. It provides graphical and statistical output for the comparison of different datasets. We have successfully applied this approach to a number of datasets obtained by Sanger sequencing and sequencing-by-synthesis technology, including the Sargasso Sea dataset, a recently published metagenomic dataset sampled from a mammoth bone, and several complete microbial genomes.
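The idea behind LCA-based assignment can be illustrated with a minimal sketch. This is not MEGAN's implementation: the toy taxonomy, the function names and the read identifiers below are all assumptions made for illustration, and the actual program works on the full NCBI taxonomy and may filter matches before assignment. A read whose matches all fall in one species is placed at that species, while a read matching several taxa is placed at their lowest common ancestor.

```python
# Toy, heavily simplified taxonomy as a child -> parent map.
# Hypothetical excerpt for illustration, not the real NCBI taxonomy.
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia coli": "Proteobacteria",
    "Bdellovibrio bacteriovorus": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus subtilis": "Firmicutes",
}


def lineage(taxon, parent):
    """Return the path from the root down to `taxon`."""
    path = [taxon]
    while parent.get(taxon) is not None:
        taxon = parent[taxon]
        path.append(taxon)
    return path[::-1]


def lowest_common_ancestor(taxa, parent):
    """Return the deepest taxon shared by the lineages of all `taxa`."""
    lineages = [lineage(t, parent) for t in taxa]
    lca = None
    # Walk down from the root while every lineage agrees.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


def assign_reads(hits_by_read, parent):
    """Map each read to the LCA of the taxa of its database matches."""
    return {
        read: lowest_common_ancestor(taxa, parent)
        for read, taxa in hits_by_read.items()
    }


# Hypothetical reads with the taxa of their matches.
hits = {
    "read1": ["Escherichia coli"],
    "read2": ["Escherichia coli", "Bdellovibrio bacteriovorus"],
    "read3": ["Escherichia coli", "Bacillus subtilis"],
}
assignments = assign_reads(hits, PARENT)
```

In this sketch, a species-specific sequence (read1) stays at the species level, whereas more widely conserved sequences (read2, read3) move up to higher taxonomic ranks.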

 

MEGAN analysis of samples

Comparison and Analysis of Multiple Datasets

The latest version of the program, MEGAN3, is aimed at facilitating comparative analyses of datasets. As a simple example, this picture shows the analysis of two different sequencing experiments performed on E. coli and on Bdellovibrio, using 454 sequencing. It clearly illustrates that MEGAN analysis can distinguish between the two species, based on a BLASTX comparison against NR:

simple comparison between two different datasets


COG analysis

MEGAN3 provides tools for analyzing the functional content of a metagenome using COGs:

COGs found by MEGAN in cave bear data


GO analysis

 

A comparative analysis of the functional content of metagenome datasets, based on the Gene Ontology, is now available in version 3.7 of MEGAN!

 

Comparative GO Analysis

 

 

GO Analysis

 

Publications

 

MEGAN 1.0 was published in: D.H. Huson, A.F. Auch, Ji Qi and S.C. Schuster, MEGAN Analysis of Metagenomic Data, Genome Research. 17:377-386, 2007.

 

An example of the application of MEGAN can be found in: H. N. Poinar, C. Schwarz, Ji Qi, B. Shapiro, R. D. E. MacPhee, B. Buigues, A. Tikhonov, D. H. Huson, L. P. Tomsho, A. Auch, M. Rampp, W. Miller, S. C. Schuster, Metagenomics to Paleogenomics: Large-Scale Sequencing of Mammoth DNA, Science 311:392-394, 2006, where we used an early version of our software to analyze the taxonomical content of a collection of DNA reads sampled from a mammoth.

 

An example of using MEGAN to analyze RNA sequences from soil can be found here: T. Urich, A. Lanzén, Ji Qi, D.H. Huson, C. Schleper and Stephan C. Schuster, Simultaneous Assessment of Soil Microbial Community Structure and Function through Analysis of the Meta-Transcriptome, PLoS ONE 3(6): e2527, 2008, doi:10.1371/journal.pone.0002527.


To find out more about the program, please take a look at the current user manual.

Download

 

Use of the program requires a license. Academic licenses are freely available to all academic users. Usage in non-academic settings requires a commercial license. Obtain a license key online.

 

Download the latest version here.

(Download the original version 1.0 here.)

Have a look at our tutorial on how to set BLAST parameters for long/short read sequences.

 

 

How to use MEGAN

 

To analyze a set of reads using MEGAN, proceed as follows.

1) Put all your reads in one FASTA file and use BLASTX to compare your reads against the NCBI-nr database. (You can also use BLASTN to compare against NCBI-nt, or other variants.) Here are some hints on blasting metagenomic datasets.

2) Concatenate all the resulting blast files into one large file.

3) Once you have the BLAST results, MEGAN has to import the reads and BLAST file to generate its own archive, called an RMA file. The RMA file will only be about 10-20% the size of the original input files, but will contain all your reads and the best 25 BLAST matches for each read. For very large datasets, this step may require a lot of memory. In this case, install MEGAN on a large memory machine (8 GB should suffice even for one terabyte of input data) and then modify the MEGAN startup script as described below to allow MEGAN more memory.

Even for relatively small datasets, running on a high memory machine is recommended as this speeds up the program significantly.

4) The main computational bottleneck of the analysis is the BLAST run. This will usually be performed on a server. We recommend that the initial parsing of the resulting BLAST files also be performed on a server, whereas the interactive analysis can then take place on a desktop or laptop.

5) Once you have computed an RMA file for your data, the file can be downloaded, e.g. onto a laptop, and then explored and analyzed at ease (2 GB of memory recommended).

You will find that MEGAN allows you to open many different datasets at once and produce comparisons of them.
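
Steps 2 and 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and file names are hypothetical, and the size estimate simply applies the 10-20% figure quoted above to the size of the input rather than anything MEGAN itself reports.

```python
import shutil
from pathlib import Path


def concatenate_blast_files(parts, combined):
    """Append each BLAST output file to `combined`, in the given order.

    Returns the size of the combined file in bytes.
    """
    with open(combined, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the data so that large files need not fit in memory.
                shutil.copyfileobj(src, out)
    return Path(combined).stat().st_size


def estimated_rma_size(input_bytes, low_pct=10, high_pct=20):
    """Rough (low, high) RMA size in bytes, using the 10-20% rule of thumb."""
    return input_bytes * low_pct // 100, input_bytes * high_pct // 100
```

For example, `concatenate_blast_files(["part1.blastx", "part2.blastx"], "all.blastx")` (hypothetical file names) would produce one file to import into MEGAN together with the reads, and passing the combined size of reads and BLAST output to `estimated_rma_size` gives an idea of how much disk space the resulting RMA file may need.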

 

Example datasets

 

 
Publication: T. Urich, A. Lanzén, Ji Qi, D.H. Huson, C. Schleper and Stephan C. Schuster, Simultaneous Assessment of Soil Microbial Community Structure and Function through Analysis of the Meta-Transcriptome, PLoS ONE 3(6): e2527, 2008, doi:10.1371/journal.pone.0002527.
MEGAN RMA file: RudSoil_vs_lssu_160807.rma (5.7 GB)

Publication: Edwards RA, Rodriguez-Brito B, Wegley L, Haynes M, Breitbart M, Peterson DM, Saar MO, Alexander S, Alexander EC Jr, Rohwer F. Using pyrosequencing to shed light on deep mine microbial ecology. BMC Genomics. 2006 Mar 20;7:57.
MEGAN RMA file: Red.rma (1.7 GB)

Publication: Tringe SG, von Mering C, Kobayashi A, Salamov AA, Chen K, Chang HW, Podar M, Short JM, Mathur EJ, Detter JC, Bork P, Hugenholtz P, Rubin EM. Comparative metagenomics of microbial communities. Science. 2005 Apr 22;308(5721):554-7.
MEGAN RMA file: MinnesotaSoil.rma (2.8 GB)

Publication: Lo I, Denef VJ, Verberkmoes NC, Shah MB, Goltsman D, DiBartolo G, Tyson GW, Allen EE, Ram RJ, Detter JC, Richardson P, Thelen MP, Hettich RL, Banfield JF. Strain-resolved community proteomics reveals recombining genomes of acidophilic bacteria. Nature. 2007 Mar 29;446(7135):537-41. Epub 2007 Mar 7.
MEGAN RMA file: AcidMine.rma (4.8 GB)

Project: Gut Microbiome of Mice with Diet-Induced Obesity project at Washington University
MEGAN RMA files:
Mouse_gut_28789_west1.rma (417 MB)
Mouse_gut_28793_west3.rma (545 MB)
Mouse_gut_28795_fatr1.rma (457 MB)
Mouse_gut_28799_carbr1.rma (461 MB)


(Download old example datasets associated with MEGAN 1 from here).

 
