Changes between Version 1 and Version 2 of SOP/Calling
- Timestamp: 10/31/25 14:45:31
SOP/Calling
v1  v2
33  33    \\
34  34    4 - '''Align reads to genome with [[http://bio-bwa.sourceforge.net/bwa.shtml|bwa]]'''
35      - * bsub "bwa aln /path/to/genome/bwa/genome Reads_1.fq > Reads_1.sai"
36      - * bsub "bwa samse /path/to/genome/bwa/genome Reads_1.sai Reads_1.fq > Reads_1.bwa.sam"
    35  + * sbatch --job-name=bwa_aln_1 --wrap="bwa aln /path/to/genome/bwa/genome Reads_1.fq > Reads_1.sai"
    36  + * sbatch --job-name=bwa_samse_1 --wrap="bwa samse /path/to/genome/bwa/genome Reads_1.sai Reads_1.fq > Reads_1.bwa.sam"
37  37    \\
38  38    5 - '''Convert SAM to BAM, sort, and index''' with BaRC's streamlined [[http://samtools.sourceforge.net/samtools.shtml|samtools]] commands
39      - * bsub /nfs/BaRC_Public/BaRC_code/Perl/SAM_to_BAM_sort_index/SAM_to_BAM_sort_index.pl Reads_1.bwa.sam
    39  + * sbatch --job-name=SAM2BAM --wrap="/nfs/BaRC_Public/BaRC_code/Perl/SAM_to_BAM_sort_index/SAM_to_BAM_sort_index.pl Reads_1.bwa.sam"
40  40    \\
41  41    6 - '''Mark duplicates''' (multiple identical reads mapped to the same location) \\
…   …
43  43    May Need "VALIDATION_STRINGENCY=LENIENT" if you get \\
44  44    Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: ... MAPQ should be 0 for unmapped read. \\
45      - * bsub java -jar /usr/local/share/picard-tools/picard.jar MarkDuplicates I=Reads_1.bwa.sorted.bam O=Reads_1.bwa.dedup.bam M=Reads_1.bwa.dedup.txt VALIDATION_STRINGENCY=LENIENT
    45  + * sbatch --job-name=MarkDuplicates --wrap="java -jar /usr/local/share/picard-tools/picard.jar MarkDuplicates I=Reads_1.bwa.sorted.bam O=Reads_1.bwa.dedup.bam M=Reads_1.bwa.dedup.txt VALIDATION_STRINGENCY=LENIENT"
46  46    \\
47  47    7 - '''Add Read Group header information to each BAM file''' (or GATK won't let you continue) \\
48  48    Run Picard Tools' [[https://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups|AddOrReplaceReadGroups]] on each sample. \\
49  49    Specify RGSM (Read Group sample), RGLB (Read Group Library), RGPL (Read Group platform), and RGPU (Read Group platform unit [e.g. run barcode])
50      - * bsub java -jar /usr/local/share/picard-tools/picard.jar AddOrReplaceReadGroups I=Reads_1.bwa.dedup.bam O=Reads_1.bwa.dedup.good.bam RGSM=My_sample RGLB=My_project RGPL=illumina RGPU=none VALIDATION_STRINGENCY=LENIENT
    50  + * sbatch --job-name=AddRG --wrap="java -jar /usr/local/share/picard-tools/picard.jar AddOrReplaceReadGroups I=Reads_1.bwa.dedup.bam O=Reads_1.bwa.dedup.good.bam RGSM=My_sample RGLB=My_project RGPL=illumina RGPU=none VALIDATION_STRINGENCY=LENIENT"
51  51    \\
52  52    8 - '''Index BAM file(s)''' with [[http://samtools.sourceforge.net/samtools.shtml|samtools]] (optional; for IGV viewing)
53      - * bsub samtools index Reads_1.bwa.dedup.good.bam
    53  + * sbatch --job-name=samtools_index --wrap="samtools index Reads_1.bwa.dedup.good.bam"
54  54    \\
55  55    9 - '''Run Indel Realignment''' (with [[https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_indels_RealignerTargetCreator.php|RealignerTargetCreator]] and [[https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_indels_IndelRealigner.php|IndelRealigner]]) \\
…   …
68  68    * Example 4: java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R human.fasta -before recal.table -after after_recal.table -plots recal_plots.pdf
69  69    All applied to our sample data:
70      - * bsub java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T BaseRecalibrator -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -o Reads_1.bwa.recal_data.txt -knownSites SNPs_from_NCBI.sorted.vcf
71      - * bsub java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T PrintReads -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -BQSR Reads_1.bwa.recal_data.txt -o Reads_1.bwa.dedup.realigned.recal.bam
72      - * bsub java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T BaseRecalibrator -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -knownSites SNPs_from_NCBI.sorted.vcf -BQSR Reads_1.bwa.recal_data.txt -o Reads_1.bwa.after_recal.txt
73      - * bsub java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T AnalyzeCovariates -R /path/to/genome/genome.fa -before Reads_1.bwa.recal_data.txt -after Reads_1.bwa.after_recal.txt -plots Reads_1.bwa.recal_plots.pdf
    70  + * sbatch --job-name=GATK_BaseRecal --wrap="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T BaseRecalibrator -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -o Reads_1.bwa.recal_data.txt -knownSites SNPs_from_NCBI.sorted.vcf"
    71  + * sbatch --job-name=GATK_PrintReads --wrap="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T PrintReads -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -BQSR Reads_1.bwa.recal_data.txt -o Reads_1.bwa.dedup.realigned.recal.bam"
    72  + * sbatch --job-name=GATK_BaseRecal2 --wrap="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T BaseRecalibrator -I Reads_1.bwa.dedup.realigned.bam -R /path/to/genome/genome.fa -knownSites SNPs_from_NCBI.sorted.vcf -BQSR Reads_1.bwa.recal_data.txt -o Reads_1.bwa.after_recal.txt"
    73  + * sbatch --job-name=GATK_AnalyzeCov --wrap="java -jar /usr/local/gatk3/GenomeAnalysisTK.jar -T AnalyzeCovariates -R /path/to/genome/genome.fa -before Reads_1.bwa.recal_data.txt -after Reads_1.bwa.after_recal.txt -plots Reads_1.bwa.recal_plots.pdf"
74  74
75  75    \\
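
Each sbatch command in this change reads the output file of the step before it, so submitting them all at once could start a job before its input exists. The sketch below is one possible way to chain the first four steps with Slurm job dependencies; it is not part of the SOP. The `submit` helper and the `SAMPLE`, `GENOME`, and `DRY_RUN` variables are hypothetical names introduced for illustration. `--parsable` and `--dependency=afterok:` are standard sbatch options. With `DRY_RUN=1` (the default here) nothing is submitted and the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch only: chain steps 4-6 so each job waits for the previous one to succeed.
# SAMPLE, GENOME, DRY_RUN and submit() are illustrative names, not part of the SOP.
set -euo pipefail

SAMPLE="${SAMPLE:-Reads_1}"
GENOME="${GENOME:-/path/to/genome/bwa/genome}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the sbatch commands instead of submitting

# submit NAME DEPENDENCY_JOB_ID COMMAND
# Writes the job ID to stdout (a placeholder ID in dry-run mode).
submit() {
  local name="$1" dep="$2" cmd="$3"
  local args=(--parsable --job-name="$name")
  if [[ -n "$dep" ]]; then
    # afterok: start only if the named job finished with exit code 0
    args+=(--dependency="afterok:$dep")
  fi
  if [[ "$DRY_RUN" == "1" ]]; then
    # Command goes to stderr so that stdout carries only the job ID
    echo "sbatch ${args[*]} --wrap=\"$cmd\"" >&2
    echo "DRYRUN_$name"
  else
    # --parsable prints "jobid" or "jobid;cluster"; keep only the job ID
    sbatch "${args[@]}" --wrap="$cmd" | cut -d';' -f1
  fi
}

aln=$(submit bwa_aln_1 "" \
  "bwa aln $GENOME $SAMPLE.fq > $SAMPLE.sai")
samse=$(submit bwa_samse_1 "$aln" \
  "bwa samse $GENOME $SAMPLE.sai $SAMPLE.fq > $SAMPLE.bwa.sam")
sam2bam=$(submit SAM2BAM "$samse" \
  "/nfs/BaRC_Public/BaRC_code/Perl/SAM_to_BAM_sort_index/SAM_to_BAM_sort_index.pl $SAMPLE.bwa.sam")
dedup=$(submit MarkDuplicates "$sam2bam" \
  "java -jar /usr/local/share/picard-tools/picard.jar MarkDuplicates I=$SAMPLE.bwa.sorted.bam O=$SAMPLE.bwa.dedup.bam M=$SAMPLE.bwa.dedup.txt VALIDATION_STRINGENCY=LENIENT")

echo "Last job in chain: $dedup"
```

Running the script with `DRY_RUN=0` on a Slurm login node would submit the jobs for real. The later steps (AddRG, samtools_index, the GATK jobs) could be appended the same way, each passing the previous job ID as its dependency.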
