Changes between Version 2 and Version 3 of SOPs/variant_calling_GATK


Timestamp:
01/16/14 15:47:47 (11 years ago)
Author:
gbell
Comment:

--

  • SOPs/variant_calling_GATK

    v2 v3  
    1010export PATH=/usr/lib/jvm/java-7-openjdk-amd64/bin:$PATH
    1111\\ \\
    12 '''Index the reference genome.''' [Need to do just once.]
     12'''Index the reference genome.''' [Need to do just once, with [[http://samtools.sourceforge.net/samtools.shtml|samtools]].]
    1313  * samtools faidx /path/to/genome/genome.fa
    1414\\
    15 '''Create a genome dictionary.''' [Need to do just once.]
     15'''Create a genome dictionary.''' [Need to do just once, with Picard's [[http://picard.sourceforge.net/command-line-overview.shtml#CreateSequenceDictionary|CreateSequenceDictionary]].]
    1616  * java -jar /usr/local/share/picard-tools/CreateSequenceDictionary.jar R=/path/to/genome/genome.fa O=/path/to/genome/genome.dict
    1717\\
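Because both reference-preparation steps are marked "Need to do just once", a small check can show which of them still has to run. The sketch below only prints the commands from the SOP and does not execute them. The function name `prep_cmds` is hypothetical, and it assumes the dictionary sits next to the FASTA file with `.fa` replaced by `.dict`, as in the example paths above.

```shell
#!/bin/sh
# Hypothetical helper: print the one-time reference preparation commands
# that still need to be run for a given FASTA file (dry run only).
prep_cmds() {
    fasta=$1                    # e.g. /path/to/genome/genome.fa
    dict="${fasta%.fa}.dict"    # assumed dictionary location, as in the SOP example
    # samtools faidx writes its index to <fasta>.fai next to the FASTA file.
    [ -e "$fasta.fai" ] || printf '%s\n' "samtools faidx $fasta"
    [ -e "$dict" ] || printf '%s\n' "java -jar /usr/local/share/picard-tools/CreateSequenceDictionary.jar R=$fasta O=$dict"
}

prep_cmds /path/to/genome/genome.fa
```

If both the `.fai` index and the `.dict` file already exist, the function prints nothing.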
    18 '''Validate VCF file or known variants''' (with GATK's ValidateVariants)
     18'''Validate VCF file or known variants''' (with GATK's [[http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_ValidateVariants.html|ValidateVariants]])
    1919  * java -jar /usr/local/gatk/GenomeAnalysisTK.jar -T ValidateVariants -R /path/to/genome/genome.fa --variant:VCF SNPs_from_NCBI.sorted.vcf \\
    2020Respond to errors (by correcting or removing problematic variants), run command again, etc., until validation is successful. \\
     
    2222
    2323\\
    24 '''Align reads to genome with bwa'''
     24'''Align reads to genome with [[http://bio-bwa.sourceforge.net/bwa.shtml|bwa]]'''
    2525  * bsub "bwa aln /path/to/genome/bwa/genome Reads_1.fq > Reads_1.sai"
    2626  * bsub "bwa samse /path/to/genome/bwa/genome Reads_1.sai  Reads_1.fq > Reads_1.bwa.sam"
    2727\\
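With several single-end FASTQ files, the two bwa commands above can be generated per sample. This is a minimal dry-run sketch that prints the `bsub` lines instead of submitting them. The function name `align_cmds` and the second sample `Reads_2` are made up for illustration, and the index prefix is the placeholder path from the SOP.

```shell
#!/bin/sh
# Placeholder bwa index prefix, copied from the SOP.
GENOME_INDEX=/path/to/genome/bwa/genome

# Hypothetical helper: print (do not run) the two bwa commands for one sample.
align_cmds() {
    sample=$1    # e.g. Reads_1 for the file Reads_1.fq
    printf '%s\n' "bsub \"bwa aln $GENOME_INDEX $sample.fq > $sample.sai\""
    printf '%s\n' "bsub \"bwa samse $GENOME_INDEX $sample.sai $sample.fq > $sample.bwa.sam\""
}

# Reads_2 is an invented second sample, for illustration only.
for s in Reads_1 Reads_2; do
    align_cmds "$s"
done
```

The `bwa samse` job reads the `.sai` file written by `bwa aln`, so it should be submitted only after the corresponding `bwa aln` job has finished.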
    28 '''Convert SAM to BAM, sort, and index'''
     28'''Convert SAM to BAM, sort, and index''' with [[http://samtools.sourceforge.net/samtools.shtml|samtools]]
    2929  * bsub /nfs/BaRC_Public/BaRC_code/Perl/SAM_to_BAM_sort_index/SAM_to_BAM_sort_index.pl Reads_1.bwa.sam
    3030\\
    3131'''Mark duplicates''' (multiple identical reads mapped to the same location). \\
    32 Run Picard Tools' MarkDuplicates on each sample \\
     32Run Picard Tools' [[http://picard.sourceforge.net/command-line-overview.shtml#MarkDuplicates|MarkDuplicates]] on each sample \\
    3333May need "VALIDATION_STRINGENCY=LENIENT" if you get \\
    3434Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: ... MAPQ should be 0 for unmapped read. \\
     
    3636\\
    3737'''Add Read Group header information to each BAM file''' (or GATK won't let you continue) \\
    38 Run Picard Tools' [[http://picard.sourceforge.net/command-line-overview.shtml#AddOrReplaceReadGroups|AddOrReplaceReadGroups] on each sample. \\
     38Run Picard Tools' [[http://picard.sourceforge.net/command-line-overview.shtml#AddOrReplaceReadGroups|AddOrReplaceReadGroups]] on each sample. \\
    3939Specify RGSM (Read Group sample), RGLB (Read Group Library), RGPL (Read Group platform), and RGPU (Read Group platform unit [e.g. run barcode])
    4040  * bsub java -jar /usr/local/share/picard-tools/AddOrReplaceReadGroups.jar I=Reads_1.bwa.dedup.bam O=Reads_1.bwa.dedup.good.bam RGSM=My_sample RGLB=My_project RGPL=illumina RGPU=none VALIDATION_STRINGENCY=LENIENT
    4141\\
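When the read-group step is repeated for each sample, the command can be built from the input file name. The sketch below prints the `AddOrReplaceReadGroups` command without running it. The function name `read_group_cmd` is hypothetical, the `.good.bam` output suffix follows the example above, and `RGPL=illumina` and `RGPU=none` are simply carried over from that example rather than recommended values.

```shell
#!/bin/sh
# Picard location as used in the SOP.
PICARD_DIR=/usr/local/share/picard-tools

# Hypothetical helper: print (do not run) the read-group command for one BAM file.
read_group_cmd() {
    in_bam=$1     # e.g. Reads_1.bwa.dedup.bam
    rgsm=$2       # Read Group sample name
    rglb=$3       # Read Group library
    out_bam="${in_bam%.bam}.good.bam"    # output naming taken from the SOP example
    printf '%s\n' "bsub java -jar $PICARD_DIR/AddOrReplaceReadGroups.jar I=$in_bam O=$out_bam RGSM=$rgsm RGLB=$rglb RGPL=illumina RGPU=none VALIDATION_STRINGENCY=LENIENT"
}

read_group_cmd Reads_1.bwa.dedup.bam My_sample My_project
```

Giving each sample its own RGSM value keeps the samples distinguishable if their BAM files are later analyzed together.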
    42 '''Index BAM file(s)''' (optional; for IGV viewing)
     42'''Index BAM file(s)''' with [[http://samtools.sourceforge.net/samtools.shtml|samtools]] (optional; for IGV viewing)
    4343  * bsub samtools index Reads_1.bwa.dedup.good.bam
    4444\\