
Review after the course

MK20200117

  • Misunderstanding of technical terms
    • Metagenome != microbiome analysis by targeted amplicon sequencing (amplicon analysis of rRNA gene; meta16S/18S; community analysis)
    • Metagenome = ‘meta’ + ‘genome’; Genome = ‘gen’ + ‘ome’
    • Amplicon of rRNA gene = part of a single gene, not a genome!
    • Amplicon of rRNA gene of environment = ‘meta’ + rRNA gene, not 'meta' + 'genome'
  • We need to improve students' basic knowledge of molecular biology and gene analysis (before genome analysis).
    • Course on the basics of genes and genomes
    • Course on the basics of web tools
  • It is appropriate to hold the same course “Introduction to Microbial Genomic and Metagenomic Analyses” annually, to keep knowledge of genomics and metagenomics up to date
    • This type of comprehensive course is more desirable than a more specific course on microbiome analysis, which is directly related to their own research.
      • During this genomics/metagenomics course, participants may have gained a broader perspective, and they can still apply what they learned to their amplicon studies.
    • In-depth courses might be held in the future (but this will probably not happen soon…)
      • Course of “Microbial Genomic Analyses” (note: without “Introduction to”)
      • Course of “Metagenomic Analyses” (note: without “Introduction to”)
  • Lecture organization was appropriate
    • Introduction of real studies (SF and MK)
      • We should distinguish microbiome amplicon studies from genomic/metagenomic studies more clearly
    • Introduction of algorithm
    • Practices of basics (MK and GL)
    • Introduction of statistics (GL)
    • Introduction of R
    • Lecture of knowledge of pangenome and related issues (MA)
    • Introduction of tools with graphics (MA)
  • Lectures on statistical analysis that are based on real research will make the course more effective
    • Such as the ones given at the beginning of the course by SF and MK and the one given by MA
  • We may update contents of the course (MK, GL)
  • To avoid participants (who are not motivated enough) getting bored and leaving the course midway, we may improve the contents in either of the following ways.
    • State at the beginning that they are expected to attend through to the end.
    • Improve the practice of Linux basics to keep them motivated, with a more concise introduction to the command line.

Note

To prepare

  • No call for participants!?
  • Can we fix the number of participants? Max 30, because of the number of notebooks in the CMCC practice room.
  • Can we simply select the first 15 who apply?
  • Applicants who attach a motivation letter or statement will be given priority.
  • bowtie
  • tutorial data download
  • wifi at the room?

What computer will we use?

  • Linux server, soroban
  • How accounts are created
  • How account information is distributed to participants
  • For the R tutorial (3rd day), use the CMCC 1st floor room, with 30 notebooks, where Andrés installed RStudio in 2019/07.

Questions MK200103

  • If the network goes down, what will we do as an alternative?

Necessity of total hours

  • 64hr = 8 days x 8hr
  • Instead of 8 full days, we will run 9 days: the first and the last days are half days (4hr each), and the other 7 days are 8hr each (2 x 4hr + 7 x 8hr = 64hr).
  • Lab work e.g. 5hr
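
The hour totals above can be checked with a short calculation. This is only a minimal Python sketch of the planned schedule (half days first and last, 7 full days between); the variable names are illustrative.

```python
# Planned schedule: half day (4hr) on the first and last days,
# full days (8hr) on the 7 days in between.
HALF_DAY_HOURS = 4
FULL_DAY_HOURS = 8

day_hours = [HALF_DAY_HOURS] + [FULL_DAY_HOURS] * 7 + [HALF_DAY_HOURS]
total_hours = sum(day_hours)

print(len(day_hours), "days,", total_hours, "hr in total")
```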

Time course of 1 day

  • Morning: 9:00-13:00, 4hr
  • Afternoon: 14:30-18:30, 4hr
  • In total, 8hr / day
  • Each 4hr block: 1h45min, 30min break, 1h45min

Room reservation

  • Depends on the number of participants
  • Windows computer at CMCC computer room

About document

  • Share documents, MK to Giovanni

Program / Programación

Day 1. Monday 6 14:30-18:30 (4hr)

KAWAI, FUJIYOSHI

Lecture 1-1. (60-min) MK

  • Opening remarks
  • Outline of this course
  • Brief review of introduction to Genomic, Metagenomic and Transcriptomic Analyses MK*

Lecture 1-2. (60-min) SF

  • Microbial community analysis (Amplicon analysis, Metagenomics) SF*

Lecture 2. (120-min) MK

  • Statistical mind to find more concrete association with phenomena
  • What is your question / How to test your question / Importance of experimental design

Day 2. Tuesday 7 9:00 ÁVILA

Lecture 3. (120-min) AA

  • Sequencers, sequencing platform (Illumina, long read)
  • Basic data formats for sequence data (FASTA, GenBank)
  • Basic data format for 'next-generation' sequencing (NGS) data (FASTQ)
  • Per-position quality information of reads
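
To illustrate the per-position quality information stored in FASTQ, the sketch below decodes a quality string into Phred scores. It assumes the Phred+33 encoding (Sanger / Illumina 1.8+); the record is made up for illustration, and `phred_scores` and `error_probability` are hypothetical helper names, not part of any course material.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string to Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


# A made-up FASTQ record: header, sequence, separator, quality string.
record = ["@read1", "ACGT", "+", "II5#"]
scores = phred_scores(record[3])

# Print each base with its Phred score and estimated error probability.
for base, q in zip(record[1], scores):
    print(base, q, round(error_probability(q), 4))
```

Older Illumina pipelines used a Phred+64 offset, so the `offset` argument would need to change for such files.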

Lab session 1. (120-min)

Lecture 4. 14:30 (120-min) AA

  • Sequence alignment, sequence similarity search
  • Two basic approaches, mapping and assembly

Lab session 2. (120-min)

Day 3. Wednesday 8 9:00 KAWAI, LARAMA

Practice 1. (120-min) GL

  • Basic UNIX commands to look around sequence files (cd, pwd, ls, etc.)
  • Examine file contents of MiSeq
  • File organization, concept of Path

Practice 2. (120-min) GL

  • Text processing (grep, less, tail, head,..)
  • To get familiar with NGS sequences #1 (examine NGS sequence files)
  • Check contents of fastq

Practice 3. (120-min) MK

  • Text processing (wc, cut, sed, awk, redirection (>, >>), etc.), gzip
  • To get familiar with NGS sequences #2 (sequential commands by pipe)
  • Count the number of reads in a FASTQ file
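
For comparison with the command-line approach used in this practice, here is a minimal Python sketch of the read count. It assumes a standard FASTQ with exactly four lines per record (no wrapped sequence lines) and picks gzip by file extension; `count_reads` and the demo file name are hypothetical.

```python
import gzip
import os
import tempfile


def count_reads(path):
    """Count reads assuming 4 lines per record; .gz files are read via gzip."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return n_lines // 4


# Demo on a small made-up gzipped FASTQ written to a temporary directory.
demo = "@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n"
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.fastq.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write(demo)
    n_reads = count_reads(demo_path)

print(n_reads)
```

The line count divided by four is the same idea participants would express with wc and a pipe; the explicit check on the remainder catches truncated files.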

Practice 4. (120-min) MK

  • Shell script (Batch processing)
  • Editor (vim)

Day 4. Thursday 9 9:00 KAWAI, LARAMA

Practice 6. (120-min) GL

  • R statistics and graphics language #1

Practice 7. (120-min) GL

  • R statistics and graphics language #2
  • Goal: do a (small) analysis and write a report of the statistical test (APA format)

Lab session 3. (240-min)

Day 5. Friday 10 9:00 KAWAI, LARAMA, ÁVILA

Lecture 6. (120-min) GL

  • Quality control of raw sequence data, basic tools for sequence and NGS data
  • Commands for NGS analyses
  • Commands for sequence analyses
  • Brief review of online resources / web services
  • NGSToolkit (instead of FastQC)

Practice 8. (120-min) MK

  • To get familiar with NGS sequences #3
  • Make your own pipeline for custom analysis and keep a record of the analysis
  • screen command
  • Sequence similarity search against a genome at hand (do a BLAST search locally)

Lab XX (240-min)

Day 6. Monday 13 9:00 ÁVILA , (LARAMA)

Lecture 7 and Practice 9. (240-min)

  • Public sequence database
  • Local mirror of databases
  • Usage of information of public database
  • e.g. Mapping (bowtie2)

Lecture 8 and Practice 10. (240-min)

  • Genome assembly

Day 7. Tuesday 14 9:00 KAWAI

Lecture 9 and Practice 11. (120-min) MK

  • Protein-coding gene prediction, RNA-coding gene prediction, gene annotation → MK
  • Multiple sequence alignment
  • protein domain search

Lecture 10 and Practice 12. (120-min) MK

  • Gene contents of genome
  • Concept of ortholog
  • Concept of conserved single-copy genes

Lab session 2. (240-min)

Day 8. Wednesday 15 9:00 - 13:00 ABANTO

Lecture 10 and Practice 12. (240-min)

  • Introduction of analyses
  • Pangenome
  • Phylogenetic tree

Lab session 2. (240-min)

Day 9. Thursday 16 9:00 KAWAI, LARAMA

Lab session 2. (240-min)

  • Discuss plans
  • Group work
  • Proposal
  • Paper introduction
  • Apply what they learned

Internal meeting (120-min)

  • namespace/doc-mach/preparation-course-bioinfo2020.txt
  • Last modified: 2020/01/21 07:35
  • by mickey