Changes between Version 25 and Version 26 of SOPs/mapping

-              v25
+              v26
+== Preprocessing read files from NCBI SRA ==
+**SRA** (Sequence Read Archive) is an NCBI binary format for short reads.
+It is described in detail in the [[http://www.ncbi.nlm.nih.gov/books/NBK47528/|SRA Handbook]].
+Processing SRA files requires the [[https://tak.wi.mit.edu/trac/wiki/sra-toolkit|NCBI SRA Toolkit]], which is installed on our systems.
+The main command is **fastq-dump <SRA archive file>**, for example
+''**fastq-dump SRR060751.sra**''
+If your reads are paired, by default the #1 and #2 reads will end up concatenated in the same file.  To write them to separate files, use a command like
+''**fastq-dump --split-files SRR060751.sra**''
+See [[http://www.ncbi.nlm.nih.gov/books/NBK47540/#SRA_Download_Guid_B.5_Converting_SRA_for|Converting SRA format data into FASTQ]] for all program options.
+Note that a FASTQ file is about 4-5x larger than its corresponding SRA file.
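+To convert several archives in one go, the two commands above can be wrapped in a small shell loop. The following is only a sketch: the function names **fastq_dump_cmd** and **convert_all** and the **DRY_RUN** variable are made up for illustration and are not part of the SRA Toolkit. It assumes file names without spaces and uses only the **fastq-dump** invocations shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the SRA Toolkit).

# Print the fastq-dump command line for one SRA file.
# layout "paired" adds --split-files; anything else uses the default.
fastq_dump_cmd() {
    layout=$1
    sra_file=$2
    if [ "$layout" = "paired" ]; then
        echo "fastq-dump --split-files $sra_file"
    else
        echo "fastq-dump $sra_file"
    fi
}

# Convert every listed .sra file in turn.
# With DRY_RUN=1 the commands are printed instead of executed.
convert_all() {
    layout=$1
    shift
    for sra_file in "$@"; do
        cmd=$(fastq_dump_cmd "$layout" "$sra_file")
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            # Unquoted on purpose so the string splits into command and arguments;
            # this is why file names with spaces are not supported.
            $cmd
        fi
    done
}
```

+After sourcing these functions, setting DRY_RUN=1 and running ''**convert_all paired SRR060751.sra**'' prints the command that would be run, which is a cheap way to check the list before filling the disk with FASTQ output.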
 == Mapping short reads ==