Changes between Version 5 and Version 6 of SOPs/qc_shortReads


Timestamp:
11/19/13 10:01:56
Author:
gbell
Comment:

--

  • SOPs/qc_shortReads

    v5 v6  
    1515
    1616It's installed on tak and LSF and can be run from the command line
    17   * Sample command: ''**fastqc s_1_sequence.txt s_2_sequence.txt**''
     17  * Sample command 1 (fastq inputs): ''**fastqc s_1_sequence.txt s_2_sequence.txt**''
     18  * Sample command 2 (fastq.gz inputs): ''**fastqc s_1_sequence.txt.gz s_2_sequence.txt.gz**''
    1819or interactively (with X Windows):
    1920  * ''**fastqc**''
     
    2829
    2930{{{
    30    bsub "perl /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastq.pl s_8_1_sequence.txt s_8_2_sequence.txt"
     31   bsub "perl /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastq.pl s_8_1_sequence.txt s_8_2_sequence.txt"  # fastq inputs
     32}}}
     33{{{
     34   bsub "perl /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastqgz.pl s_8_1_sequence.txt.gz s_8_2_sequence.txt.gz"  # fastq.gz inputs
    3135}}}
    3236
     
    8690{{{
    8791   # Sample commands:
    88    # quality_stats: Sampl Solexa reads file: s_1_1_sequence.txt
     92   # quality_stats: Sample Solexa reads file: s_1_1_sequence.txt or s_1_1_sequence.txt.gz
    8993   fastx_quality_stats -i s_1_1_sequence.txt -o s_1_1_sequence.stats
     94   gunzip -c s_1_1_sequence.txt.gz | fastx_quality_stats -o s_1_1_sequence.stats
    9095   # a figure for Nucleotide Distribution:
    9196   fastx_nucleotide_distribution_graph.sh -i s_1_1_sequence.stats  -o s_1_1_sequence.stats.nuc.png -t "s_1_1_sequence.stats Nucleotide Distribution"
     
    105110{{{
    106111  fastq_quality_filter -v -q 20 -p 75 -i myFile.fastq -o myFile.fastq.fastx_trim   
    107            version 0.0.6
     112  gunzip -c myFile.fastq.gz | fastq_quality_filter -v -q 20 -p 75 -o myFile.fastq.fastx_trim
     113
    108114           [-h]         = This helpful help screen.
    109115           [-q N]       = Minimum quality score to keep.
     
    127133{{{
    128134bsub "fastq_quality_trimmer -v -t 20 -l 25 -i input.fastq -o output.fastq"
    129      
     135bsub "gunzip -c input.fastq.gz | fastq_quality_trimmer -v -t 20 -l 25 -z -o output.fastq.gz"   
     136
    130137   [-t N]       = Quality threshold - nucleotides with lower
    131138                  quality will be trimmed (from the end of the sequence).
     
    158165{{{
    159166bsub "fastx_clipper -a CTGTAGGCACCATCAAT -i s2_sequence.txt -v -l 22 -o s2_sequence_noLinker.txt"
     167bsub "gunzip -c s2_sequence.txt.gz | fastx_clipper -a CTGTAGGCACCATCAAT -v -l 22 -z -o s2_sequence_noLinker.txt.gz"
     168
     169
    160170In the above command:
    161171   -a CTGTAGGCACCATCAAT is the linker sequence
     
    182192{{{
    183193bsub "fastx_trimmer -f 1 -l 22  -i s7_sequence_clipped.txt -o s7_sequence_clipped_trimmed.txt"
     194bsub "gunzip -c s7_sequence_clipped.txt.gz | fastx_trimmer -f 1 -l 22 -z -o s7_sequence_clipped_trimmed.txt.gz"
    184195     
    185196[-i INFILE]  = FASTA/Q input file. default is STDIN.
     
    206217
    207218{{{
    208 /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastq.pl sequence.1_1.filt.txt sequence.1_2.filt.txt
     219/nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastq.pl sequence.1_1.filt.txt sequence.1_2.filt.txt  # fastq inputs
     220/nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastqgz.pl sequence.1_1.filt.txt.gz sequence.1_2.filt.txt.gz  # fastq.gz inputs
    209221}}}
    210222
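
The version 6 commands above all follow one pattern: plain fastq files are passed with `-i`, while fastq.gz files are decompressed with `gunzip -c` and piped into the tool's STDIN. As a minimal sketch of that pattern (the helper name `read_fastq` is hypothetical and not part of the SOP or of the FASTX-Toolkit), a small shell function can pick the right reader from the file extension:

```shell
# Hypothetical helper (not part of the SOP): write a FASTQ file to stdout,
# decompressing it first when the file name ends in .gz.
read_fastq() {
    case "$1" in
        *.gz) gunzip -c "$1" ;;   # gzipped input: decompress to stdout
        *)    cat "$1" ;;         # plain-text input: pass through unchanged
    esac
}
```

Assuming the FASTX tools read from STDIN when `-i` is omitted, as the gzipped examples above rely on, the same call would then work for either input type, for example `read_fastq s_1_1_sequence.txt.gz | fastx_quality_stats -o s_1_1_sequence.stats`.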