====== Review after the course ======

**MK20200117**

  * **Misunderstanding of technical term**
    * Metagenome != microbiome analysis by targeted amplicon sequencing (amplicon analysis of rRNA gene; meta16S/18S; community analysis)
    * Metagenome = 'meta' + 'genome'; Genome = 'gen' + 'ome'
    * Amplicon of rRNA gene = part of a single gene, not a genome!
    * Amplicon of rRNA gene of environment = 'meta' + rRNA gene, not 'meta' + 'genome'
  * **We need to improve students' basic knowledge of molecular biology and gene analysis (before genome analysis).**
    * Course on basics of gene and genome
    * Course on basics of web tools
  * It is appropriate to hold the same course, "Introduction to Microbial Genomic and Metagenomic Analyses", annually, to update knowledge of genomics and metagenomics
    * A comprehensive course of this type is more desirable than a more specific course on microbiome analysis, which relates directly to participants' own research.
    * In this genomics/metagenomics course, participants may have learned from a broader perspective, and they can still apply what they learned to their amplicon studies.
  * In-depth courses might be held in the future (but probably not soon…)
    * Course of "Microbial Genomic Analyses" (note: without "Introduction to")
    * Course of "Metagenomic Analyses" (note: without "Introduction to")
  * Lecture organization was appropriate
    * Introduction of real studies (SF and MK)
      * We may need to emphasize the distinction between microbiome amplicon studies and genomic/metagenomic studies more clearly
    * Introduction of algorithm
    * Practices of basics (MK and GL)
    * Introduction of statistics (GL)
    * Introduction of R
    * Lecture on pangenome and related issues (MA)
    * Introduction of tools with graphics (MA)
  * Lectures on statistical analysis based on real research would make the course more effective
    * Such as those given at the beginning of the course by SF and MK, and the one given by MA
  * We may update contents of the course (MK, GL)
    * To keep participants who are not motivated enough from getting bored and leaving mid-course, we may improve the contents in either of the following ways.
      * State at the beginning that participants should attend through to the end.
      * Improve the Linux basics practice to keep their motivation, with a more concise introduction to the command line.

====== Note ======

**To prepare**

  * No call for participants!?
  * Can we fix the number of participants? Max 30, because of the number of notebooks in the CMCC practice room.
    * Can we simply select the first 15 who apply?
    * Applicants who attach a motivation letter or statement will be given priority.
  * bowtie
  * tutorial data download
  * wifi at the room?

**What computer will we use?**

  * Linux server, soroban
    * How accounts are created
    * How account info is distributed to participants
  * For the R tutorial (3rd day), use the CMCC 1st floor room, with 30 notebooks, where Andrés installed RStudio in 2019/07.
**Questions MK200103**

  * What is the alternative if the network goes down?

**Necessity of total hours**

  * 64hr = 8 days x 8hr
  * Instead of 8 days, we will run 9 days: the first and last days are half days (4hr each) and the 7 days in between are 8hr each (2 x 4hr + 7 x 8hr = 64hr).
  * Lab work e.g. 5hr

**Time course of 1 day**

  * Morning: 9:00-13:00, 4hr
  * Afternoon: 14:30-18:30, 4hr
  * In total, 8hr / day
  * Each 4hr block: 1h45min, 30min break, 1h45min

**Room reservation**

  * Depends on the number of participants
  * Windows computers in the CMCC computer room

**About document**

  * Share documents, MK to Giovanni

====== Program / Programación ======

===== Day 1. =====

Day 1. Monday 6 14:30-18:30 (4hr) KAWAI, FUJIYOSHI

Lecture 1-1. (60-min) MK
  * Opening remarks
  * Outline of this course
  * Brief review of introduction to Genomic, Metagenomic and Transcriptomic Analyses

Lecture 1-2. (60-min) SF
  * Microbial community analysis (Amplicon analysis, Metagenomics)

Lecture 2. (120-min) MK
  * Statistical mindset for finding more concrete associations with phenomena
  * What is your question / How to test your question / Importance of experimental design

===== Day 2. =====

Day 2. Tuesday 7 9:00 ÁVILA

Lecture 3. (120-min) AA
  * Sequencers, sequencing platforms (Illumina, long read)
  * Basic data formats of sequence data (FASTA, GenBank)
  * Basic data format of 'next-generation' sequencing (NGS) data (FASTQ)
  * Per-position quality information of reads

Lab session 1. (120-min)

Lecture 4. 14:30 (120-min) AA
  * Sequence alignment, sequence similarity search
  * Two basic approaches, mapping and assembly

Lab session 2. (120-min)

===== Day 3. =====

Day 3. Wednesday 8 9:00 KAWAI, LARAMA

Practice 1. (120-min) GL
  * Basic UNIX commands for looking around sequence files (cd, pwd, ls etc.)
  * Examine the contents of MiSeq files
  * File organization, concept of path

Practice 2. (120-min) GL
  * Text processing (grep, less, tail, head, ...)
  * To get familiar with NGS sequences #1 (examine NGS sequence files)
  * Check contents of fastq

Practice 3.
(120-min) MK
  * Text processing (wc, cut, sed, awk, redirect (>, >>) etc.), gzip
  * To get familiar with NGS sequences #2 (sequential commands by pipe)
  * Count the number of reads in a fastq file

Practice 4. (120-min) MK
  * Shell script (batch processing)
  * Editor (vim)

===== Day 4. =====

Day 4. Thursday 9 9:00 KAWAI, LARAMA

Practice 6. (120-min) GL
  * R statistics and graphics language #1

Practice 7. (120-min) GL
  * R statistics and graphics language #2
  * Goal:
    * Do a (small) analysis and write a report of the statistical test (APA format)

Lab session 3. (240-min)

===== Day 5. =====

Day 5. Friday 10 9:00 KAWAI, LARAMA, ÁVILA

Lecture 6. (120-min) GL
  * Quality control of raw sequence data, basic tools for sequence and NGS data
  * Commands for NGS analyses
  * Commands for sequence analyses
  * Brief review of online resources / web services
  * NGSToolkit (instead of FastQC)

Practice 8. (120-min) MK
  * To get familiar with NGS sequences #3
  * Make your own pipeline for custom analysis and keep a record of the analysis
  * screen command
  * Sequence similarity search against a genome at hand (run a BLAST search locally)

Lab XX (240-min)

===== Day 6. =====

Day 6. Monday 13 9:00 ÁVILA, (LARAMA)

Lecture 7 and Practice 9. (240-min)
  * Public sequence database
  * Local mirror of databases
  * Usage of information of public database
    * e.g. Mapping (bowtie2)

Lecture 8 and Practice 10. (240-min)
  * Genome assembly

===== Day 7. =====

Day 7. Tuesday 14 9:00 KAWAI

Lecture 9 and Practice 11. (120-min) MK
  * Protein-coding gene prediction, RNA-coding gene prediction, gene annotation → MK
  * Multiple sequence alignment
  * Protein domain search

Lecture 10 and Practice 12. (120-min) MK
  * Gene contents of genome
  * Concept of ortholog
  * Concept of conserved single-copy genes

Lab session 2. (240-min)

===== Day 8. =====

Day 8. Wednesday 15 9:00 - 13:00 ABANTO

Lecture 10 and Practice 12. (240-min)
  * Introduction of analyses
  * Pangenome
  * Phylogenetic tree

Lab session 2. (240-min)

===== Day 9. =====

Day 9.
Thursday 16 9:00 KAWAI, LARAMA

Lab session 2. (240-min)
  * Discuss plans
  * Group work
  * Proposal
  * Paper introduction
  * Apply what they learned

Internal meeting (120-min)
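
**Note for the Day 3 read-counting exercise**

For the exercise of counting the reads in a fastq file, a reference answer may help when checking participants' results. The following is only a minimal sketch in Python, assuming every FASTQ record occupies exactly four lines (no wrapped sequences). The function name ''count_fastq_reads'' and the ''.gz'' extension check are illustrative choices, not part of the course material.

```python
import gzip
import sys
from pathlib import Path


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming exactly four lines per record."""
    path = Path(path)
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    # A well-formed four-line FASTQ always has a line count divisible by 4.
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


if __name__ == "__main__":
    # Print one "filename count" pair per file given on the command line.
    for name in sys.argv[1:]:
        print(name, count_fastq_reads(name))
```

For an uncompressed file that ends with a newline, the result should match the line count from ''wc -l'' divided by 4, which is the form participants are likely to reach with the shell commands practiced on that day.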