Changes between Version 5 and Version 6 of SOPs/qc_shortReads
- Timestamp: 11/19/13 10:01:56 (11 years ago)
SOPs/qc_shortReads
v5   v6
15   15
16   16   It's installed on tak and LSF and can be run from the command line
17        * Sample command: ''**fastqc s_1_sequence.txt s_2_sequence.txt**''
     17   * Sample command 1 (fastq inputs): ''**fastqc s_1_sequence.txt s_2_sequence.txt**''
     18   * Sample command 2 (fastq.gz inputs): ''**fastqc s_1_sequence.txt.gz s_2_sequence.txt.gz**''
18   19   or interactively (with X Windows):
19   20   * ''**fastqc**''
…    …
28   29
29   30   {{{
30        bsub “perl /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastq.pl s_8_1_sequence.txt s_8_2_sequence.txt”
     31   bsub “perl /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastq.pl s_8_1_sequence.txt s_8_2_sequence.txt” # fastq inputs
     32   }}}
     33   {{{
     34   bsub “perl /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastqgz.pl s_8_1_sequence.txt.gz s_8_2_sequence.txt.gz” # fastq.gz inputs
31   35   }}}
32   36
…    …
86   90   {{{
87   91   # Sample commands:
88        # quality_stats: Sampl Solexa reads file: s_1_1_sequence.txt
     92   # quality_stats: Sample Solexa reads file: s_1_1_sequence.txt or s_1_1_sequence.txt.gz
89   93   fastx_quality_stats -i s_1_1_sequence.txt -o s_1_1_sequence.stats
     94   gunzip -c s_1_1_sequence.txt.gz | fastx_quality_stats -o s_1_1_sequence.stats
90   95   # a figure for Nucleotide Distribution:
91   96   fastx_nucleotide_distribution_graph.sh -i s_1_1_sequence.stats -o s_1_1_sequence.stats.nuc.png -t "s_1_1_sequence.stats Nucleotide Distribution"
…    …
105  110  {{{
106  111  fastq_quality_filter -v -q 20 -p 75 -i myFile.fastq -o myFile.fastq.fastx_trim
107       version 0.0.6
     112  gunzip -c myFile.fastq | fastq_quality_filter -v -q 20 -p 75 -o myFile.fastq.fastx_trim
     113
108  114  [-h] = This helpful help screen.
109  115  [-q N] = Minimum quality score to keep.
…    …
127  133  {{{
128  134  bsub "fastq_quality_trimmer -v -t 20 -l 25 -i input.fastq -o output.fastq"
129
     135  bsub "gunzip -c input.fastq.gz | fastq_quality_trimmer -v -t 20 -l 25 -z -o output.fastq.gz"
     136
130  137  [-t N] = Quality threshold - nucleotides with lower
131  138  quality will be trimmed (from the end of the sequence).
…    …
158  165  {{{
159  166  bsub "fastx_clipper -a CTGTAGGCACCATCAAT -i s2_sequence.txt -v -l 22 -o s2_sequence_noLinker.txt"
     167  bsub "gunzip -c s2_sequence.txt | fastx_clipper -a CTGTAGGCACCATCAAT -v -l 22 -z -o s2_sequence_noLinker.txt.gz"
     168
     169
160  170  In the above command:
161  171  -a CTGTAGGCACCATCAAT is the linker sequence
…    …
182  192  {{{
183  193  bsub "fastx_trimmer -f 1 -l 22 -i s7_sequence_clipped.txt -o s7_sequence_clipped_trimmed.txt"
     194  bsub "gunzip -c s7_sequence_clipped.txt | fastx_trimmer -f 1 -l 22 -z -o s7_sequence_clipped_trimmed.txt.gz"
184  195
185  196  [-i INFILE] = FASTA/Q input file. default is STDIN.
…    …
206  217
207  218  {{{
208       /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastq.pl sequence.1_1.filt.txt sequence.1_2.filt.txt
     219  /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastq.pl sequence.1_1.filt.txt sequence.1_2.filt.txt # fastq inputs
     220  /nfs/BaRC_Public/BaRC_code/Perl/cmpfastq/cmpfastqgz.pl sequence.1_1.filt.txt.gz sequence.1_2.filt.txt.gz # fastq.gz inputs
209  221  }}}
210  222
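
The v6 additions feed gzipped reads to the FASTX tools by piping `gunzip -c` into them, so the reads are never unpacked on disk. Three of the added lines pass a name without a `.gz` suffix to `gunzip -c` (`myFile.fastq`, `s2_sequence.txt`, `s7_sequence_clipped.txt`); presumably the gzipped files were meant. As a minimal sketch, a small wrapper can make the same pipelines work for either kind of input. The name `stream_fastq` is hypothetical: it is not part of the SOP, of FASTX-Toolkit, or of the BaRC scripts.

```shell
# stream_fastq: hypothetical helper (an assumption, not part of the SOP).
# Writes a FASTQ file to stdout, decompressing only when the file
# passes gzip's integrity test.
stream_fastq() {
    if gzip -t < "$1" 2>/dev/null; then
        gunzip -c < "$1"    # valid gzip stream: decompress to stdout
    else
        cat "$1"            # anything else: pass through unchanged
    fi
}

# Example use with a command from the SOP (needs FASTX-Toolkit on the PATH):
#   stream_fastq s_1_1_sequence.txt.gz | fastx_quality_stats -o s_1_1_sequence.stats
```

Reading through stdin keeps the check independent of the file's suffix, which matters here because the SOP's inputs are named `.txt` and `.txt.gz` rather than `.fastq`.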