Changes between Version 2 and Version 3 of SOPs/variant_calling_GATK
- Timestamp: 01/16/14 15:47:47 (11 years ago)
SOPs/variant_calling_GATK
v2 v3
10 10   export PATH=/usr/lib/jvm/java-7-openjdk-amd64/bin:$PATH
11 11   \\ \\
12    - '''Index the reference genome.''' [Need to do just once.]
   12 + '''Index the reference genome.''' [Need to do just once, with [[http://samtools.sourceforge.net/samtools.shtml|samtools]].]
13 13   * samtools faidx /path/to/genome/genome.fa
14 14   \\
15    - '''Create a genome dictionary.''' [Need to do just once.]
   15 + '''Create a genome dictionary.''' [Need to do just once, with Picard's [[http://picard.sourceforge.net/command-line-overview.shtml#CreateSequenceDictionary|CreateSequenceDictionary]].]
16 16   * java -jar /usr/local/share/picard-tools/CreateSequenceDictionary.jar R=/path/to/genome/genome.fa O=/path/to/genome/genome.dict
17 17   \\
18    - '''Validate VCF file or known variants''' (with GATK's ValidateVariants)
   18 + '''Validate VCF file or known variants''' (with GATK's [[http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_ValidateVariants.html|ValidateVariants]])
19 19   * java -jar /usr/local/gatk/GenomeAnalysisTK.jar -T ValidateVariants -R /path/to/genome/genome.fa --variant:VCF SNPs_from_NCBI.sorted.vcf \\
20 20   Respond to errors (by correcting or removing problematic variants), run command again, etc., until validation is successful. \\
…  …
22 22
23 23   \\
24    - '''Align reads to genome with bwa'''
   24 + '''Align reads to genome with [[http://bio-bwa.sourceforge.net/bwa.shtml|bwa]]'''
25 25   * bsub "bwa aln /path/to/genome/bwa/genome Reads_1.fq > Reads_1.sai"
26 26   * bsub "bwa samse /path/to/genome/bwa/genome Reads_1.sai Reads_1.fq > Reads_1.bwa.sam"
27 27   \\
28    - '''Convert SAM to BAM, sort, and index'''
   28 + '''Convert SAM to BAM, sort, and index''' with [[http://samtools.sourceforge.net/samtools.shtml|samtools]]
29 29   * bsub /nfs/BaRC_Public/BaRC_code/Perl/SAM_to_BAM_sort_index/SAM_to_BAM_sort_index.pl Reads_1.bwa.sam
30 30   \\
31 31   '''Mark duplicates''' (multiple identical reads mapped to the same location). \\
32    - Run Picard Tools' MarkDuplicates on each sample \\
   32 + Run Picard Tools' [[http://picard.sourceforge.net/command-line-overview.shtml#MarkDuplicates|MarkDuplicates]] on each sample \\
33 33   May Need "VALIDATION_STRINGENCY=LENIENT" if you get \\
34 34   Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: ... MAPQ should be 0 for unmapped read. \\
…  …
36 36   \\
37 37   '''Add Read Group header information to each BAM file''' (or GATK won't let you continue) \\
38    - Run Picard Tools' [[http://picard.sourceforge.net/command-line-overview.shtml#AddOrReplaceReadGroups|AddOrReplaceReadGroups] on each sample. \\
   38 + Run Picard Tools' [[http://picard.sourceforge.net/command-line-overview.shtml#AddOrReplaceReadGroups|AddOrReplaceReadGroups]] on each sample. \\
39 39   Specify RGSM (Read Group sample), RGLB (Read Group Library), RGPL (Read Group platform), and RGPU (Read Group platform unit [e.g. run barcode])
40 40   * bsub java -jar /usr/local/share/picard-tools/AddOrReplaceReadGroups.jar I=Reads_1.bwa.dedup.bam O=Reads_1.bwa.dedup.good.bam RGSM=My_sample RGLB=My_project RGPL=illumina RGPU=none VALIDATION_STRINGENCY=LENIENT
41 41   \\
42    - '''Index BAM file(s)''' (optional; for IGV viewing)
   42 + '''Index BAM file(s)''' with [[http://samtools.sourceforge.net/samtools.shtml|samtools]] (optional; for IGV viewing)
43 43   * bsub samtools index Reads_1.bwa.dedup.good.bam
44 44   \\
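
The per-sample commands in the v3 text all derive their file names from one read-file prefix (Reads_1), so the chain can be sketched as a dry run that only prints the commands for a given sample. This is a minimal sketch, not part of the SOP: the helper name `print_commands` is hypothetical, and the MarkDuplicates command itself is not shown in this diff, so the sketch only assumes it writes the `.bwa.dedup.bam` file that the AddOrReplaceReadGroups step reads. The read-group values are the placeholders from the SOP.

```shell
#!/bin/sh
# Dry run: print the SOP's per-sample commands; nothing is submitted or executed.

GENOME_INDEX=/path/to/genome/bwa/genome    # bwa index prefix, as written in the SOP
PICARD_DIR=/usr/local/share/picard-tools   # Picard location, as written in the SOP

# print_commands SAMPLE  (hypothetical helper, not part of the SOP)
# Emits the commands for SAMPLE.fq in the order the SOP lists them.
print_commands() {
    s=$1
    printf '%s\n' "bsub \"bwa aln $GENOME_INDEX $s.fq > $s.sai\""
    printf '%s\n' "bsub \"bwa samse $GENOME_INDEX $s.sai $s.fq > $s.bwa.sam\""
    printf '%s\n' "bsub /nfs/BaRC_Public/BaRC_code/Perl/SAM_to_BAM_sort_index/SAM_to_BAM_sort_index.pl $s.bwa.sam"
    # The MarkDuplicates command is elided in the diff; only its assumed output name is noted.
    printf '%s\n' "# MarkDuplicates step goes here; assumed to write $s.bwa.dedup.bam"
    printf '%s\n' "bsub java -jar $PICARD_DIR/AddOrReplaceReadGroups.jar I=$s.bwa.dedup.bam O=$s.bwa.dedup.good.bam RGSM=My_sample RGLB=My_project RGPL=illumina RGPU=none VALIDATION_STRINGENCY=LENIENT"
    printf '%s\n' "bsub samtools index $s.bwa.dedup.good.bam"
}

# Default to the sample name used throughout the SOP.
print_commands "${1:-Reads_1}"
```

Each bsub job depends on the output of the previous one, so the printed commands would still need to be submitted one at a time (or with scheduler dependencies) rather than all at once.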