Changes between Version 26 and Version 27 of SOPs/variant_calling_GATK
Timestamp: 10/31/25 14:35:16 (3 days ago)
SOPs/variant_calling_GATK
v26 v27 (lines marked "-" were removed from v26; lines marked "+" were added in v27)
  43 43  May Need "VALIDATION_STRINGENCY=LENIENT" if you get \\
  44 44  Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: ... MAPQ should be 0 for unmapped read. \\
- 45     * bsub java -jar /usr/local/share/picard-tools/picard.jar MarkDuplicates I=Reads_1.bwa.sorted.bam O=Reads_1.bwa.dedup.bam M=Reads_1.bwa.dedup.txt VALIDATION_STRINGENCY=LENIENT
+    45  * sbatch --job-name=MarkDuplicates --wrap="java -jar /usr/local/share/picard-tools/picard.jar MarkDuplicates I=Reads_1.bwa.sorted.bam O=Reads_1.bwa.dedup.bam M=Reads_1.bwa.dedup.txt VALIDATION_STRINGENCY=LENIENT"
  46 46  \\
  47 47  7 - '''Add Read Group header information to each BAM file''' (or GATK won't let you continue) \\
  48 48  Run Picard Tools' [[https://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups|AddOrReplaceReadGroups]] on each sample. \\
  49 49  Specify RGSM (Read Group sample), RGLB (Read Group Library), RGPL (Read Group platform), and RGPU (Read Group platform unit [e.g. run barcode])
- 50     * bsub java -jar /usr/local/share/picard-tools/picard.jar AddOrReplaceReadGroups I=Reads_1.bwa.dedup.bam O=Reads_1.bwa.dedup.good.bam RGSM=My_sample RGLB=My_project RGPL=illumina RGPU=none VALIDATION_STRINGENCY=LENIENT
+    50  * sbatch --job-name=AddRG --wrap="java -jar /usr/local/share/picard-tools/picard.jar AddOrReplaceReadGroups I=Reads_1.bwa.dedup.bam O=Reads_1.bwa.dedup.good.bam RGSM=My_sample RGLB=My_project RGPL=illumina RGPU=none VALIDATION_STRINGENCY=LENIENT"
  51 51  \\
  52 52  8 - '''Index BAM file(s)''' with [[http://samtools.sourceforge.net/samtools.shtml|samtools]] (optional; for IGV viewing)
- 53     * bsub samtools index Reads_1.bwa.dedup.good.bam
+    53  * sbatch --job-name=samtools_index --wrap="samtools index Reads_1.bwa.dedup.good.bam"
  54 54  \\
  55 55  9 - '''Run Indel Realignment''' (with [[https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_indels_RealignerTargetCreator.php|RealignerTargetCreator]] and [[https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_indels_IndelRealigner.php|IndelRealigner]]) \\
  …  …
  68 68  * Example 4: java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R human.fasta -before recal.table -after after_recal.table -plots recal_plots.pdf
  69 69  All applied to our sample data:
- 70     * bsub java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T BaseRecalibrator -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -o Reads_1.bwa.recal_data.txt -knownSites SNPs_from_NCBI.sorted.vcf
- 71     * bsub java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T PrintReads -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -BQSR Reads_1.bwa.recal_data.txt -o Reads_1.bwa.dedup.realigned.recal.bam
- 72     * bsub java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T BaseRecalibrator -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -knownSites SNPs_from_NCBI.sorted.vcf -BQSR Reads_1.bwa.recal_data.txt -o Reads_1.bwa.after_recal.txt
- 73     * bsub java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T AnalyzeCovariates -R /path/to/genome/genome.fa -before Reads_1.bwa.recal_data.txt -after Reads_1.bwa.after_recal.txt -plots Reads_1.bwa.recal_plots.pdf
+    70  * sbatch --job-name=GATK_BaseRecal --wrap="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T BaseRecalibrator -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -o Reads_1.bwa.recal_data.txt -knownSites SNPs_from_NCBI.sorted.vcf"
+    71  * sbatch --job-name=GATK_PrintReads --wrap="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T PrintReads -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -BQSR Reads_1.bwa.recal_data.txt -o Reads_1.bwa.dedup.realigned.recal.bam"
+    72  * sbatch --job-name=GATK_BaseRecal2 --wrap="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T BaseRecalibrator -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -knownSites SNPs_from_NCBI.sorted.vcf -BQSR Reads_1.bwa.recal_data.txt -o Reads_1.bwa.after_recal.txt"
+    73  * sbatch --job-name=GATK_AnalyzeCov --wrap="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T AnalyzeCovariates -R /path/to/genome/genome.fa -before Reads_1.bwa.recal_data.txt -after Reads_1.bwa.after_recal.txt -plots Reads_1.bwa.recal_plots.pdf"
  74 74
  75 75  \\
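Note on the four recalibration commands (v27 lines 70 to 73): sbatch returns as soon as a job is queued, so if these are submitted back to back, PrintReads, the second BaseRecalibrator pass, and AnalyzeCovariates could start before the files they read exist. The following is a minimal sketch of one way to chain them with Slurm job dependencies; it is not part of the SOP. The `submit` helper, the `DRY_RUN` switch, and the variable names are hypothetical, and the sketch defaults to a dry run that only prints the sbatch commands. The `--parsable` and `--dependency=afterok:` options are standard sbatch options.

```shell
#!/usr/bin/env bash
# Sketch only: chain the BQSR steps so each job waits for the job whose
# output it reads. The helper and variable names are illustrative assumptions.
set -euo pipefail

DRY_RUN=${DRY_RUN:-1}   # 1 = print the sbatch commands only; 0 = really submit

GATK="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar"
REF=/path/to/genome/genome.fa
KNOWN=SNPs_from_NCBI.sorted.vcf
BAM=Reads_1.bwa.dedup.realigned.bam
TABLE=Reads_1.bwa.recal_data.txt
AFTER=Reads_1.bwa.after_recal.txt

# submit NAME DEPENDS_ON COMMAND
# Prints the job ID on stdout. DEPENDS_ON may be empty (no dependency).
submit() {
  local name=$1 dep=$2 cmd=$3 id
  local args=(--parsable --job-name="$name")
  if [ -n "$dep" ]; then
    args+=(--dependency="afterok:$dep")   # start only if that job succeeded
  fi
  if [ "$DRY_RUN" = 1 ]; then
    echo "sbatch ${args[*]} --wrap=\"$cmd\"" >&2
    echo "dry_$name"                      # placeholder ID for the dry run
  else
    id=$(sbatch "${args[@]}" --wrap="$cmd")
    echo "${id%%;*}"                      # --parsable may append ";cluster"
  fi
}

# Pass 1 builds the recalibration table; both PrintReads and pass 2 read it.
j1=$(submit GATK_BaseRecal "" \
  "$GATK -T BaseRecalibrator -I $BAM -R $REF -o $TABLE -knownSites $KNOWN")
j2=$(submit GATK_PrintReads "$j1" \
  "$GATK -T PrintReads -I $BAM -R $REF -BQSR $TABLE -o Reads_1.bwa.dedup.realigned.recal.bam")
j3=$(submit GATK_BaseRecal2 "$j1" \
  "$GATK -T BaseRecalibrator -I $BAM -R $REF -knownSites $KNOWN -BQSR $TABLE -o $AFTER")
# AnalyzeCovariates needs both tables; waiting on pass 2 covers pass 1 as well.
j4=$(submit GATK_AnalyzeCov "$j3" \
  "$GATK -T AnalyzeCovariates -R $REF -before $TABLE -after $AFTER -plots Reads_1.bwa.recal_plots.pdf")

echo "job IDs: $j1 $j2 $j3 $j4"
```

Run it as is to review the generated commands, then with `DRY_RUN=0` on a Slurm login node to submit. With `afterok`, a dependent job will not start if the job it waits on fails.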
