Review after the course
MK20200117
- Misunderstanding of technical terms
- Metagenome != microbiome analysis by targeted amplicon sequencing (amplicon analysis of rRNA gene; meta16S/18S; community analysis)
- Metagenome = ‘meta’ + ‘genome’; Genome = ‘gen’ + ‘ome’
- Amplicon of rRNA gene = part of single gene, not genome!
- Amplicon of rRNA gene of environment = ‘meta’ + rRNA gene, not 'meta' + 'genome'
- We need to improve students' basic knowledge of molecular biology and gene analysis (before genome analysis).
- Course on the basics of genes and genomes
- Course on the basics of web tools
- It is appropriate to hold the same course, “Introduction to Microbial Genomic and Metagenomic Analyses”, annually to update knowledge of genomics and metagenomics
- This type of comprehensive course is more desirable than a more specific course on microbiome analysis, which would be directly related to the participants' own research.
- Through this genomics/metagenomics course, participants may have gained a broader perspective, and they can still apply what they learned to their amplicon studies.
- An in-depth course might be held in the future (but this will probably not happen soon…)
- Course on “Microbial Genomic Analyses” (note: without “Introduction to”)
- Course on “Metagenomic Analyses” (note: without “Introduction to”)
- Lecture organization was appropriate
- Introduction of real studies (SF and MK)
- We should distinguish microbiome amplicon studies from genomic/metagenomic studies more clearly
- Introduction of algorithm
- Practices of basics (MK and GL)
- Introduction of statistics (GL)
- Introduction of R
- Lecture of knowledge of pangenome and related issues (MA)
- Introduction of tools with graphics (MA)
- Lectures on statistical analysis based on real research would make the course more effective
- Such as the ones given at the beginning of the course by SF and MK and the one given by MA
- We may update contents of the course (MK, GL)
- To prevent participants who are not sufficiently motivated from getting bored and leaving mid-course, we may improve the contents in either of the following ways.
- State at the beginning that participants are expected to attend through to the end.
- Improve the Linux basics practice to keep participants motivated, while making the command-line introduction more concise.
Note
To prepare
No call for participants!?
- Can we fix the number of participants? Max 30, because of the number of notebooks in the CMCC practice room.
- Can we simply select the first 15 who apply?
- Applicants who attach a motivation letter or statement will be given priority.
- bowtie
- tutorial data download
- wifi at the room?
What computer will we use?
- Linux server, soroban
- How accounts are created
- How account info is distributed to participants
- For the R tutorial (3rd day), use the CMCC 1st floor room, with 30 notebooks, where Andrés installed RStudio in 2019/07.
Questions MK200103
- If the network goes down, what is our alternative?
Required total hours
- 64hr = 8 days x 8hr
- Instead of 8 days, we will run 9 days: the first and last days are half days (4hr each) and the other 7 days are full days (8hr each).
- Lab work e.g. 5hr
Time course of 1 day
- Morning: 9:00-13:00, 4hr
- Afternoon: 14:30-18:30, 4hr
- In total, 8hr / day
- Each 4hr block: 1h45min, 30min break, 1h45min
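The hour counts above can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic in these notes. The variable and function names are illustrative and not part of any course material.

```python
# Schedule figures are taken from the notes; all names are illustrative.
FULL_DAY_HR = 8
HALF_DAY_HR = 4

original_plan_hr = 8 * FULL_DAY_HR                    # 8 days x 8hr
revised_plan_hr = 2 * HALF_DAY_HR + 7 * FULL_DAY_HR   # 2 half days + 7 full days

# One 4hr block: 1h45min session, 30min break, 1h45min session.
block_min = (60 + 45) + 30 + (60 + 45)


def minutes_between(start: str, end: str) -> int:
    """Minutes between two same-day "HH:MM" clock times."""
    start_h, start_m = map(int, start.split(":"))
    end_h, end_m = map(int, end.split(":"))
    return (end_h * 60 + end_m) - (start_h * 60 + start_m)


morning_min = minutes_between("9:00", "13:00")
afternoon_min = minutes_between("14:30", "18:30")

print(original_plan_hr, revised_plan_hr)      # 64 64
print(block_min, morning_min, afternoon_min)  # 240 240 240
```

Both plans give 64hr, and each 4hr block contains 3h30min of sessions once the 30min break is excluded.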
Room reservation
- Depends on the number of participants
- Windows computer at CMCC computer room
About document
- Share documents: MK to Giovanni
Program / Programación
Day 1.
Day 1. Monday 6 14:30-18:30 (4hr)
KAWAI, FUJIYOSHI
Lecture 1-1. (60-min) MK
- Opening remarks
- Outline of this course
- Brief review of introduction to Genomic, Metagenomic and Transcriptomic Analyses MK*
Lecture 1-2. (60-min) SF
- Microbial community analysis (Amplicon analysis, Metagenomics) SF*
Lecture 2. (120-min) MK
- Statistical mindset for finding more concrete associations with phenomena
- What is your question / How to test your question / Importance of experimental design
Day 2.
Day 2. Tuesday 7 9:00 ÁVILA
Lecture 3. (120-min) AA
- Sequencers, sequencing platforms (Illumina, long read)
- Basic data formats for sequence data (FASTA, GenBank)
- Basic data format for 'next-generation' sequencing (NGS) data (FASTQ)
- Per-position quality information of reads
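As a possible illustration for the FASTQ and quality topics above, the sketch below decodes the quality line of one FASTQ record into per-base Phred scores. It assumes the Phred+33 encoding used by current Illumina output. The record itself is made up for illustration, and the function names are hypothetical.

```python
from typing import List


def phred33_scores(quality_line: str) -> List[int]:
    """Convert a Phred+33 quality string into integer quality scores."""
    return [ord(ch) - 33 for ch in quality_line]


def error_probability(q: int) -> float:
    """Estimated base-call error probability for Phred score q."""
    return 10 ** (-q / 10)


# A FASTQ record has 4 lines: header, sequence, separator, quality.
# This record is a made-up example, not real sequencing data.
record = [
    "@read1",
    "GATTACA",
    "+",
    "IIII##5",
]
header, sequence, separator, quality = record

scores = phred33_scores(quality)
for base, q in zip(sequence, scores):
    print(base, q, f"{error_probability(q):.4f}")
```

Under this encoding, `I` corresponds to Q40 (error probability 0.0001), `5` to Q20 (0.01) and `#` to Q2, so the last three bases of the example read are the least reliable.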
Lab session 1. (120-min)
Lecture 4. 14:30 (120-min) AA
- Sequence alignment, sequence similarity search
- Two basic approaches, mapping and assembly
Lab session 2. (120-min)
Day 3.
Day 3. Wednesday 8 9:00 KAWAI, LARAMA
Practice 1. (120-min) GL
- Basic UNIX commands for looking around sequence files (cd, pwd, ls, etc.)
- Examine the contents of MiSeq output files
- File organization, concept of Path
Practice 2. (120-min) GL
- Text processing (grep, less, tail, head, ...)
- To get familiar with NGS sequences #1 (examine NGS sequence files)
- Check the contents of a FASTQ file
Practice 3. (120-min) MK
- Text processing (wc, cut, sed, awk, redirection (>, >>), etc.), gzip
- To get familiar with NGS sequences #2 (chaining commands with pipes)
- Count the number of reads in a FASTQ file
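For reference, the read-counting exercise can also be sketched in Python, as a cross-check for the command-line answer. The sketch assumes the usual 4-line FASTQ records (no wrapped sequence lines). The function name and the demo data are hypothetical.

```python
import gzip
import os
import tempfile


def count_fastq_reads(path: str) -> int:
    """Count reads in a plain or gzip-compressed 4-line-per-record FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


# Made-up two-read FASTQ content, written to temporary files for the demo.
demo = "@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nII#I\n"
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "demo.fastq")
    gz_path = plain_path + ".gz"
    with open(plain_path, "w") as out:
        out.write(demo)
    with gzip.open(gz_path, "wt") as out:
        out.write(demo)
    n_plain = count_fastq_reads(plain_path)
    n_gz = count_fastq_reads(gz_path)

print(n_plain, n_gz)  # 2 2
```

Dividing the line count by 4 is used here rather than counting lines that start with `@`, because a quality line may also begin with `@`.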
Practice 4. (120-min) MK
- Shell script (Batch processing)
- Editor (vim)
Day 4.
Day 4. Thursday 9 9:00 KAWAI, LARAMA
Practice 6. (120-min) GL
- R statistics and graphics language #1
Practice 7. (120-min) GL
- R statistics and graphics language #2
- Goal: do a (small) analysis and write a report of the statistical test (APA format)
Lab session 3. (240-min)
Day 5.
Day 5. Friday 10 9:00 KAWAI, LARAMA, ÁVILA
Lecture 6. (120-min) GL
- Quality control of raw sequence data, basic tools for sequence and NGS data
- Commands for NGS analyses
- Commands for sequence analyses
- Brief review of online resources / web services
- NGSToolkit (instead of FastQC)
Practice 8. (120-min) MK
- To get familiar with NGS sequences #3
- Make your own pipeline for custom analysis and keep a record of the analysis
- screen command
- Sequence similarity search against a genome at hand (do a BLAST search locally)
Lab XX (240-min)
Day 6.
Day 6. Monday 13 9:00 ÁVILA, (LARAMA)
Lecture 7 and Practice 9. (240-min)
- Public sequence database
- Local mirror of databases
- Usage of information of public database
- e.g. Mapping (bowtie2)
Lecture 8 and Practice 10. (240-min)
- Genome assembly
Day 7.
Day 7. Tuesday 14 9:00 KAWAI
Lecture 9 and Practice 11. (120-min) MK
- Protein-coding gene prediction, RNA-coding gene prediction, gene annotation → MK
- Multiple sequence alignment
- Protein domain search
Lecture 10 and Practice 12. (120-min) MK
- Gene contents of genome
- Concept of ortholog
- Concept of conserved single-copy genes
Lab session 2. (240-min)
Day 8.
Day 8. Wednesday 15 9:00-13:00 ABANTO
Lecture 10 and Practice 12. (240-min)
- Introduction of analyses
- Pangenome
- Phylogenetic tree
Lab session 2. (240-min)
Day 9.
Day 9. Thursday 16 9:00 KAWAI, LARAMA
Lab session 2. (240-min)
- Discuss plans
- Group work
- Proposal
- Paper introduction
- Apply what they learned
Internal meeting (120-min)