Changes between Version 2 and Version 3 of SOPs/variant_calling


Timestamp:
10/15/13 14:40:17 (12 years ago)
Author:
gbell

  • SOPs/variant_calling

    v2 v3  
    55* mapping short reads
    66* calling raw variants
    7 * filtering (really more like 'tagging') variants
    8 * annotating variants
     7* adding filters (really more like 'tagging') and some annotations to variants
     8* annotating effect(s) of variants on genes
    99
    1010== Map short reads ==
     
    4343}}}
    4444
    45 == Call raw variants ==
     45== Call raw variants with mpileup+bcftools ==
     46
     47Call variants (one sample vs. reference) with samtools mpileup+bcftools (see the [http://samtools.sourceforge.net/mpileup.shtml samtools variant calling page] for more details):
     48{{{
     49samtools mpileup -d100000 -uf /nfs/genomes/sgd_2010/bwa/sacCer3.fa A_reads.bt2.sorted_unique.bam | bcftools view -bvcg - >| A_reads.bt2.sorted_unique.raw.bcf
     50}}}
     51Call variants (multiple samples vs. reference) using a set of BAM files:
     52{{{
     53samtools mpileup -d100000 -uf /nfs/genomes/sgd_2010/bwa/sacCer3.fa *_reads.bt2.sorted_unique.bam | bcftools view -bvcg - >| ALL_reads.bt2.sorted_unique.raw.bcf
     54}}}
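When each sample needs its own BCF rather than one joint call set, the single-sample command can be wrapped in a loop. The following is a minimal dry-run sketch, not part of the SOP itself: the helper `raw_bcf_name` and the second sample name `B_reads` are hypothetical, and the loop only prints the commands so they can be checked before anything is run.

```shell
#!/bin/sh
# Reference genome path as used in the commands above.
REF=/nfs/genomes/sgd_2010/bwa/sacCer3.fa

# Hypothetical helper: derive the raw BCF name from a BAM name
# (A_reads.bt2.sorted_unique.bam -> A_reads.bt2.sorted_unique.raw.bcf).
raw_bcf_name() {
    printf '%s\n' "${1%.bam}.raw.bcf"
}

# Dry run: print one mpileup+bcftools command per BAM file.
# B_reads is an assumed second sample, for illustration only.
for bam in A_reads.bt2.sorted_unique.bam B_reads.bt2.sorted_unique.bam; do
    echo "samtools mpileup -d100000 -uf $REF $bam | bcftools view -bvcg - >| $(raw_bcf_name "$bam")"
done
```

Printing the commands first makes it easy to confirm that each output name matches its input before committing compute time.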
     55
     56== Add filters and annotations to raw variants ==
     57
     58This step uses vcf-annotate from the [http://vcftools.sourceforge.net/docs.html VCFtools suite]
     59
     60Tag each variant using all default filters. These are "filters" in name only: every variant is kept, and the names of any failed filters are recorded as tags.
     61{{{
     62bcftools view -L -vcg A_reads.bt2.sorted_unique.raw.bcf | vcf-annotate -f + > A_reads.bt2.sorted_unique.withTags.vcf
     63}}}
     64
     65Prepare a file of known SNPs for use with vcf-annotate.
     66Start with a tab-delimited file (e.g., SNP137.bed) with lines like the following, then compress and index it:
     67
     68chr1    1360    1361    rs000000001
     69{{{
     70bgzip SNP137.bed
     71tabix -p bed SNP137.bed.gz
     72}}}
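tabix expects position-sorted input, so it can help to sanity-check the BED file before compressing it. A minimal sketch, assuming a four-column file like the example line above; `check_bed` is a hypothetical helper, not part of tabix or VCFtools, and the sorted output name in the usage comment is likewise an assumption.

```shell
# Hypothetical helper: verify that every line has 4 tab-delimited fields
# and that start < end; exits non-zero if any line fails.
check_bed() {
    awk -F'\t' '
        BEGIN { bad = 0 }
        NF != 4 { printf "line %d: expected 4 fields, found %d\n", NR, NF; bad = 1; next }
        ($2 + 0) >= ($3 + 0) { printf "line %d: start is not less than end\n", NR; bad = 1 }
        END { exit bad }
    ' "$1"
}

# Usage: check, then sort by chromosome and numerically by start:
#   check_bed SNP137.bed && sort -k1,1 -k2,2n SNP137.bed > SNP137.sorted.bed
```

A file delimited by spaces instead of tabs is the most common failure this catches, since it looks identical on screen.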
     73
     74Annotate variants by adding tags, additional statistics (HWE and variant type), and any overlaps with known SNPs (SNP137):
     75{{{
     76bcftools view -L -vcg A_reads.bt2.sorted_unique.raw.bcf | vcf-annotate -f +/d=10 --fill-HWE --fill-type -n -a SNP137.bed.gz -c CHROM,FROM,TO,INFO/SNP_ID -d key=INFO,ID=SNP_ID,Number=1,Type=String,Description='SNP137 sites' > A_reads.bt2.sorted_unique.filtered.vcf
     77}}}
     78
     79In the output VCF file, tags appear in the FILTER field and SNP137 overlaps appear in the INFO field (as SNP_ID).
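Because no variants are removed at this step, a quick tally of the FILTER column shows how many variants each tag affects. A minimal sketch using standard Unix tools; `count_filters` is a hypothetical helper, and a variant carrying several tags (joined by semicolons) is counted under that combined value.

```shell
# Hypothetical helper: tabulate the FILTER column (column 7) of a VCF,
# printing "<filter value><TAB><count>" sorted by filter value.
count_filters() {
    grep -v '^#' "$1" | cut -f7 | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

# Usage (file name from the command above):
#   count_filters A_reads.bt2.sorted_unique.filtered.vcf
```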
     80
     81Overlaps with annotations can also be identified with intersectBed, which needs VCF (text) input rather than BCF; the matching annotation is appended as extra columns on each line of the output file:
     82{{{
     83intersectBed -wao -split -a A_reads.bt2.sorted_unique.raw.vcf -b SNP137.bed > A_reads.bt2.sorted_unique.annotated.vcf
     84}}}
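With `-wao`, intersectBed reports every record from `-a`, appends the fields of any overlapping `-b` record, and ends each line with the number of overlapping bases (0 when nothing overlaps). A small sketch for keeping only the variants that hit a known SNP; `overlapping_only` is a hypothetical helper that relies solely on that last column.

```shell
# Hypothetical helper: keep only lines whose final column
# (the overlap size reported by -wao) is greater than zero.
overlapping_only() {
    awk -F'\t' '($NF + 0) > 0' "$1"
}

# Usage (file name from the command above):
#   overlapping_only A_reads.bt2.sorted_unique.annotated.vcf
```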
     85
     86== Annotate effect(s) of variants on genes ==
     87
     88Use snpEff (assuming snpEff has gene+protein annotations for your genome):
     89{{{
     90snpEff -c /usr/local/share/snpEff/snpEff.config -s A_snpEff.html SacCer_Apr2011.18 A_reads.bt2.sorted_unique.filtered.vcf > A_reads.bt2.sorted_unique.filtered.snpEff.vcf
     91}}}
     92
     93Get information only for variants that overlap protein-coding exons:
     94{{{
     95snpEff -c /usr/local/share/snpEff/snpEff.config -no-downstream -no-intergenic -no-intron -no-upstream -no-utr -s A_snpEff.html SacCer_Apr2011.18 A_reads.bt2.sorted_unique.filtered.vcf > A_reads.bt2.sorted_unique.filtered.snpEff.vcf
     96}}}