Teaching, Winter Semester 2008/09

Bioinformatics Tools

Topic: Bioinformatics Software Tools
Credits: Diploma students: 4 SWS, examinable: 2 SWS; MSc students: 8 LP (Modules: Bioinformatik or Praktische Bioinformatik)
Teachers: Daniel Huson and Suparna Mitra
Time: 30 March-10 April, 9:00-17:00
Signup meeting: Thursday, Oct 23, 17h, C311
Pre-lab meeting: Friday, March 27, 16-18h, C311
Final due date for all materials: April 24
Location: Computer lab C311
Material and Downloads: Introductory papers by email. Marine datasets from local nfs disk. MEGAN, MetaSim.

Description

The aim of this course is to learn how to analyze metagenomic datasets. The three main questions that biologists hope to answer using computational techniques are: (1) What is the taxonomical profile of my sample? (2) What is the functional profile of my sample? (3) How do two samples differ?

The structure of the course is as follows:
- Introduction to metagenomics analysis using MEGAN.
- Metagenomic analysis of 8 Marine datasets using different techniques.
- Comparison of datasets and of the results obtained using different techniques.

Two weeks before the beginning of the course, each participant will be given a number of papers to study in preparation for the course. Each participant will be expected to give a 20-minute presentation on the content of specific papers.

During the actual course, which will run for two weeks, participants will work on their projects daily in the computer lab in C311. In addition to the lab work, participants will be expected to prepare short presentations on different topics related to the project.

After the course, participants will be expected to finish their projects and to write a five- to ten-page report on the course. This report will be based on a lab logbook that each participant is expected to maintain.

Requirements for admission

This course is for MSc and Diploma students only (sorry, no BSc students). The lectures "Algorithms in Bioinformatics I and II" or "Bioinformatics I and II" are recommended. Knowledge of Java and a scripting language (Python, Perl, bash, ...) is required.

Grading

As this is the first time that students will receive grades for this course, the grading system will be developed within the course.

Course language

The teaching language is English.

Credits for this course

Diploma students: To obtain a "Schein" for this course, you are required to successfully complete all parts of the course.
Master students: You are required to successfully complete all parts of the course. You will be graded on your initial presentation, your participation in the course, other presentations and your final report.

Participants

Till Helge Hedwig, Dominikus Krüger, Paul Rupek, Mario Stärk, Annette Treichel, Christian Zielke, Julian Zipperer

Schedule

Date and activities
30.03.09
Morning: Presentations by students on introductory papers.
Afternoon: Presentations by students on introductory papers. Download of 8 Marine datasets. Extraction of first 10,000 reads from each dataset. Launch of BLASTX against NR on all datasets. Installation of MEGAN and MetaSim. Read MetaSim paper.
Hand in: Presentation
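
The read-extraction step can be sketched in a few lines of Python. This is only a minimal sketch, assuming the datasets are plain FASTA files; the function names `read_fasta` and `write_first_reads` are made up for illustration and are not part of any course material.

```python
from itertools import islice
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def write_first_reads(src: TextIO, dst: TextIO,
                      limit: int = 10_000, width: int = 60) -> int:
    """Copy at most `limit` records from src to dst; return how many were written."""
    count = 0
    for header, seq in islice(read_fasta(src), limit):
        dst.write(f">{header}\n")
        # Re-wrap the sequence to a fixed line width.
        for i in range(0, len(seq), width):
            dst.write(seq[i:i + width] + "\n")
        count += 1
    return count
```

A typical call would open the input and output files with `open(...)` and pass both handles to `write_first_reads`. Because the reader is a generator, only one record is held in memory at a time, which matters for large datasets on a shared disk.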
31.03.09
Morning: Transform cDNA data from GenBank format to FastA. Launch BLASTX on remaining datasets. Run MEGAN on Marine DNA datasets. Solve tasks: for each dataset, perform analyses of taxa and of function that parallel the ones in the Marine paper. Explore use of different GO-slims.
Afternoon: Compare all datasets and then compare the comparison with the one reported in the paper.
Hand in: Code for converting GenBank to FastA, code for grabbing first 10,000 reads
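
One possible shape for the GenBank-to-FastA converter is sketched below. It is a simplified reader that assumes standard GenBank flat-file records (keywords in column 1, the sequence between `ORIGIN` and `//`) and keeps only the accession, the definition line and the sequence. The names `parse_genbank` and `genbank_to_fasta` are illustrative, and upper-casing the sequence is a choice made here, not a requirement.

```python
import re
from typing import Iterator, TextIO, Tuple


def parse_genbank(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, definition, sequence) for each GenBank record."""
    locus, accession, definition, seq = "", "", [], []
    section = None  # top-level keyword currently being read
    for raw in handle:
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("//"):  # end of record
            yield accession or locus, " ".join(definition), "".join(seq).upper()
            locus, accession, definition, seq, section = "", "", [], [], None
        elif section == "ORIGIN":
            # Sequence lines carry a position number and blank-separated blocks.
            seq.append(re.sub(r"[^A-Za-z]", "", line))
        elif line[0].isspace():
            if section == "DEFINITION":  # wrapped definition text
                definition.append(line.strip())
        else:
            section, _, rest = line.partition(" ")
            rest = rest.strip()
            if section == "LOCUS" and rest:
                locus = rest.split()[0]
            elif section == "ACCESSION" and rest:
                accession = rest.split()[0]
            elif section == "DEFINITION":
                definition = [rest]


def genbank_to_fasta(src: TextIO, dst: TextIO, width: int = 60) -> int:
    """Write every record that has a sequence as FASTA; return the record count."""
    count = 0
    for ident, desc, seq in parse_genbank(src):
        if not seq:
            continue  # e.g. records without an ORIGIN section
        dst.write(f">{ident} {desc}".rstrip() + "\n")
        for i in range(0, len(seq), width):
            dst.write(seq[i:i + width] + "\n")
        count += 1
    return count
```

If Biopython is installed, `Bio.SeqIO.convert("in.gb", "genbank", "out.fasta", "fasta")` performs the same conversion with a far more complete parser. The hand-written version above is mainly useful for seeing how the flat-file format is laid out.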
01.04.09
Morning: Write a 2 page report on the comparison of the Marine DNA datasets as analysed using MEGAN with the results reported in the Marine paper. Explain why it is difficult to compare MEGAN's functional analysis with the one reported in the paper.
Afternoon: Study JGI paper on simulated metagenome datasets. Design low-, medium- and high-complexity (LC, MC and HC) simulations in MetaSim. Simulate reads and launch BLASTX runs against NR. Produce CSV files for MEGAN to enable comparison of results against the "truth".
Hand in: Report on comparison
02.04.09
Morning: Compile a list of all online resources for performing metagenomic analyses. Launch analysis on small test sets extracted from the LC, MC and HC datasets. Where feasible, perform analyses of full LC, MC and HC datasets.
Afternoon: 'Introduction to Geneious' by Melanie Hayr. Use Geneious to evaluate the performance of the LCA heuristic for taxonomical placement by comparing against phylogenetic placement. Work on performance evaluation of MEGAN on LC, MC and HC datasets.
Hand in: Presentation on Marine dataset results.
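
The core idea of the LCA heuristic, placing a read on the lowest common ancestor of all taxa it hits, can be illustrated with a small sketch. The parent map below is a toy taxonomy that skips ranks, and the function names are hypothetical. MEGAN's actual implementation additionally filters hits by score thresholds, which is not modelled here.

```python
from typing import Dict, Iterable, List

# Toy child -> parent map; the root has no entry. Ranks are skipped on purpose.
PARENT: Dict[str, str] = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Gammaproteobacteria": "Proteobacteria",
    "Escherichia coli": "Gammaproteobacteria",
    "Vibrio cholerae": "Gammaproteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus subtilis": "Firmicutes",
}


def lineage(taxon: str, parent: Dict[str, str]) -> List[str]:
    """Return the path from the root down to `taxon`."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    path.reverse()
    return path


def lowest_common_ancestor(taxa: Iterable[str], parent: Dict[str, str]) -> str:
    """Return the deepest node shared by the lineages of all given taxa."""
    paths = [lineage(t, parent) for t in taxa]
    if not paths:
        raise ValueError("need at least one taxon")
    lca = None
    # Walk the lineages level by level while they still agree.
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        lca = level[0]
    if lca is None:
        raise ValueError("taxa do not share a root")
    return lca
```

For example, a read hitting both Escherichia coli and Vibrio cholerae is placed on Gammaproteobacteria, while a read hitting Escherichia coli and Bacillus subtilis moves up to Bacteria. This loss of specificity is what a phylogenetic placement would be compared against.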
03.04.09
Morning: Analyze results on simulated datasets. How should the results be evaluated?
Afternoon: Work on investigating a phylogenetic alternative to the LCA algorithm. Set up timed comparison runs of BLAST and MEGAN using nfs vs. local disks. Launch additional simulations on LC, MC and HC datasets so that we have results for Sanger, 454 and Solexa sequencing.
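
One way to approach the evaluation question is to sort each simulated read into one of four categories, depending on how its assigned taxon relates to the taxon it was simulated from. The sketch below assumes that the true and predicted taxa are already available as dictionaries keyed by read name and that the taxonomy is given as a child-to-parent map. The category names and function names are choices made for this example, not an established scheme.

```python
from collections import Counter
from typing import Dict, Optional, Set


def ancestors(taxon: str, parent: Dict[str, str]) -> Set[str]:
    """Return all proper ancestors of `taxon` in the child -> parent map."""
    seen = set()
    while taxon in parent:
        taxon = parent[taxon]
        seen.add(taxon)
    return seen


def evaluate(truth: Dict[str, str],
             predicted: Dict[str, Optional[str]],
             parent: Dict[str, str]) -> Counter:
    """Count reads per category: exact, ancestor, wrong, unassigned."""
    counts = Counter()
    for read, true_taxon in truth.items():
        guess = predicted.get(read)
        if guess is None:
            counts["unassigned"] += 1   # no hit or no assignment
        elif guess == true_taxon:
            counts["exact"] += 1        # assigned to the source taxon itself
        elif guess in ancestors(true_taxon, parent):
            counts["ancestor"] += 1     # correct but less specific
        else:
            counts["wrong"] += 1        # off the true lineage
    return counts
```

Separating "ancestor" from "wrong" matters for an LCA-based method, since a conservative placement higher in the taxonomy is a loss of resolution rather than an error. Running the same counts for each sequencing technology gives directly comparable numbers.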
06.04.09
Morning: Compare all cDNA datasets and then compare the comparison with the one reported in the paper. Write a 1-page report on the comparison of the Marine cDNA datasets, as analyzed using MEGAN, with the results reported in the Marine paper. Include the DNA datasets as well, covering both taxonomic and functional aspects.
Afternoon: Complete the report and hand it in.
07.04.09
Morning: Complete performance evaluation of MEGAN on LC, MC and HC datasets. Compare MEGAN and MG-RAST results on the 4 DNA and 4 cDNA datasets.
Afternoon: Prepare the presentation for tomorrow.
Hand in: A short report on MEGAN and MG-RAST results.
08.04.09
Morning: Presentations of assigned topics
Afternoon: Use Geneious to study phylogenetic improvement of LCA algorithm
Hand in: Presentations, discussion of phylogenetic method
09.04.09
Morning: Write ORF finder for prokaryotic genes
Afternoon: Run ORF finder on different metagenome datasets. Compare predicted ORFs on assigned reads vs "No hits". Compare performance for different sequencing technologies.
Hand in: Code and comparisons.
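
A very simple ORF finder might look like the following sketch. It scans all six reading frames for a start codon followed in-frame by a stop codon and reports stretches above a minimum length. The start codon set (ATG, GTG, TTG), the default length cutoff and all names are assumptions made for this example. On short metagenomic reads many genes are truncated at the read ends, which this version does not handle.

```python
from typing import Iterator, List, Tuple

STARTS = {"ATG", "GTG", "TTG"}   # common prokaryotic start codons (assumption)
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_on_strand(seq: str, min_codons: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) of complete ORFs; end is exclusive and includes the stop."""
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon in STARTS:
                start = pos  # first start after the previous stop
            elif start is not None and codon in STOPS:
                if (pos - start) // 3 >= min_codons:
                    yield start, pos + 3
                start = None


def find_orfs(seq: str, min_codons: int = 30) -> List[Tuple[str, int, int]]:
    """Return (strand, start, end) in forward-strand coordinates for all six frames."""
    seq = seq.upper()
    n = len(seq)
    found = [("+", s, e) for s, e in orfs_on_strand(seq, min_codons)]
    # Map reverse-strand hits back onto forward-strand coordinates.
    found += [("-", n - e, n - s)
              for s, e in orfs_on_strand(reverse_complement(seq), min_codons)]
    return found
```

Taking the first start codon after each stop reports the longest candidate per stop codon, which is a common simplification. Comparing ORF counts between reads with BLASTX hits and "No hits" reads only requires running `find_orfs` over both read sets with the same cutoff.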

 

After completion of the lab dates, each participant is expected to write a 10-15 page report. This report should be structured by the days of the course. For each topic studied in the course, please give a brief introduction to the topic, then describe what computations and analyses were performed. Perhaps most importantly, provide a discussion of each topic. Also, please write a section on problems with current approaches to metagenome analysis and provide some ideas on how to improve analysis techniques.

The due date for this report is April 24th.
