= Manipulating VCF files =

Create a VCF ([http://en.wikipedia.org/wiki/Variant_Call_Format variant call format]) file with almost any program that identifies variants, such as:

 * samtools' mpileup+bcftools:
 {{{
# One file of mapped reads
samtools mpileup -uf indexed_genome My_mapped_reads.bam | bcftools view -bvcg - >| My_mapped_reads.raw.bcf

# Multiple files of mapped reads
samtools mpileup -uf indexed_genome *.bam | bcftools view -bvcg - >| Multiple_samples.raw.bcf
 }}}

Convert from BCF (the binary version of VCF) to VCF:
{{{
bcftools view My_mapped_reads.raw.bcf > My_mapped_reads.raw.vcf
}}}

Convert from VCF to BCF:
{{{
bcftools view -bS -D chr_list.txt My_mapped_reads.raw.vcf > My_mapped_reads.raw.bcf
}}}

Merge multiple VCF files. This works on raw VCF files but apparently not on those processed by vcf-annotate. First compress and index each file:
{{{
# For each VCF file:
bgzip Variants_sample_A.raw.vcf
tabix -p vcf Variants_sample_A.raw.vcf.gz
}}}

Then merge the bgzipped, tabix-indexed files:
{{{
vcf-merge *.raw.vcf.gz >| Variants_all_samples.raw.vcf
}}}

Annotate a VCF file (applying all filters with default values):
{{{
cat Variants_all_samples.raw.vcf | vcf-annotate -f + > Variants_all_samples.withTags.vcf
}}}

Sort by chromosome and then by coordinate:
{{{
vcf-sort Variants.vcf > Variants.sorted.vcf
}}}

Validate a VCF file (for use with GATK, for example):
{{{
java -jar /usr/local/gatk/GenomeAnalysisTK.jar -T ValidateVariants -R /path/to/indexed/genome --variant:VCF SNPs.vcf
}}}
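
The compress, index, and merge steps above can be chained so that every raw VCF file in a directory is handled in one pass. The following is only a sketch: the function name `prepare_and_merge` and the `DRY_RUN` variable are made up for illustration and are not part of samtools, tabix, or VCFtools. It assumes `bgzip`, `tabix`, and `vcf-merge` are on the `PATH` and that the input files end in `.raw.vcf`.

```shell
#!/usr/bin/env bash
# Sketch: bgzip + tabix every *.raw.vcf in a directory, then merge them.
# prepare_and_merge and DRY_RUN are hypothetical names used for illustration.

prepare_and_merge() {
    local dir="$1" out="$2" vcf
    local -a gz=()

    for vcf in "$dir"/*.raw.vcf; do
        [ -e "$vcf" ] || continue              # glob matched nothing
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only print what would be executed
            echo "bgzip $vcf"
            echo "tabix -p vcf $vcf.gz"
        else
            # bgzip replaces the file with $vcf.gz; tabix then indexes it
            bgzip "$vcf" && tabix -p vcf "$vcf.gz" || return 1
        fi
        gz+=("$vcf.gz")
    done

    if [ "${#gz[@]}" -eq 0 ]; then
        echo "no *.raw.vcf files found in $dir" >&2
        return 1
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "vcf-merge ${gz[*]} > $out"
    else
        vcf-merge "${gz[@]}" > "$out"
    fi
}
```

Running `DRY_RUN=1 prepare_and_merge . Variants_all_samples.raw.vcf` first prints the commands without touching any files, which is a cheap way to check that the glob picks up the intended samples before the originals are replaced by their `.gz` versions.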