Changes between Version 4 and Version 5 of SOPs/qc_SRA


Timestamp:
08/25/20 11:52:32 (4 years ago)
Author:
gbell
Comment:

--

  • SOPs/qc_SRA

    v4 v5  
    5454=== Downloading and processing multiple NCBI SRA samples ===
    5555
    56 To download a list of SRR files (such as for all of the samples of a data series) from NCBI, use prefetch.
     56To download a list of SRR files (such as for all of the samples of a data series) from NCBI, use NCBI's 'prefetch'.
    5757
    58 Given a set of SRA files listed in a single column in the text file "SraAccList.txt" (e.g. SRR7623010, SRR7623011, etc.), the following command will download the entire set:
     58Given a set of SRA files (by SRR ID) listed in a single column in the text file "SRR_Acc_List.txt" (e.g. SRR7623010, SRR7623011, etc.), the following command will download the entire set:
    5959
    6060{{{
    61 prefetch -O output_directory SRR_Acc_List.txt
     61prefetch -O output_directory --option-file SRR_Acc_List.txt
    6262}}}
    6363
    6464If you don't specify an output directory, the SRR files will be downloaded to ~/ncbi/ncbi_public/sra (or your configured "Import Path" as described above). 
    6565
    66 To get this list of SRR IDs, go the [[https://www.ncbi.nlm.nih.gov/Traces/study/|SRA Run Selector]] and enter a project accession.  Once on a [[https://www.ncbi.nlm.nih.gov/Traces/study/?acc=SRP000002|project page]], go to the "Select" section and click on "Accession List" to get 'SRR_Acc_List.txt' (or if you want a subset of these, click on "Metadata" in the "Select" section to get a comma separated file, 'SraRunTable.txt')
     66To get this list of SRR IDs, go to the [[https://www.ncbi.nlm.nih.gov/Traces/study/|SRA Run Selector]] and enter a project accession.  Once on a [[https://www.ncbi.nlm.nih.gov/Traces/study/?acc=SRP000002|project page]], go to the "Select" section and click on "Accession List" to get 'SRR_Acc_List.txt'. If you want only a subset of these runs, click on "Metadata" in the "Select" section to get a comma-separated file, 'SraRunTable.txt', and create your own 'SRR_Acc_List.txt' from it.
    6767
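As a rough illustration of building your own 'SRR_Acc_List.txt' from 'SraRunTable.txt', the sketch below keeps only the rows that match a pattern and prints their run IDs. The function name `make_acc_list` is hypothetical, and the sketch assumes that the file has a header row and that the run ID ("Run") is the first column; check your own 'SraRunTable.txt' before relying on it.

```shell
# Hypothetical helper (not part of the SRA Toolkit): print the run IDs of the
# rows of a run table that match a pattern.
# Assumptions: the table has one header row, and the run ID is the first
# comma-separated column.
make_acc_list() {
    pattern=$1
    run_table=${2:-SraRunTable.txt}
    # Skip the header (NR > 1), keep rows matching the pattern, print column 1.
    awk -F, -v pat="$pattern" 'NR > 1 && $0 ~ pat { print $1 }' "$run_table"
}
```

For example, `make_acc_list "RNA-Seq" > SRR_Acc_List.txt` would keep only the runs whose metadata row contains "RNA-Seq". The pattern is matched against the whole row, so choose one that is specific to the samples you want.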
     68The 'prefetch' command will provide you with a set of SRA files, which then need to be converted to fastq.gz.  One way to do this for the whole set of SRA files is:
     69
     70{{{
     71find -name \*.sra -exec bsub fastq-dump --split-3 --gzip {} \;
     72}}}
     73
     74Once you create the fastq.gz files, the *.sra files can be deleted.
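Because the conversion jobs above are submitted with bsub and run asynchronously, it may be safer to delete each *.sra file only after its fastq.gz output exists. The sketch below is one way to do that. The function name `remove_converted_sra` and the two directory arguments are hypothetical, and the check assumes the usual 'fastq-dump --split-3 --gzip' output names (SRR_1.fastq.gz and SRR_2.fastq.gz for paired reads, SRR.fastq.gz for single reads). It only tests that the files are non-empty, not that the conversion finished cleanly.

```shell
# Hypothetical helper: delete each .sra file under sra_dir only if matching,
# non-empty fastq.gz output is present in fastq_dir.
# Assumption: output is named <run>.fastq.gz (single-end) or
# <run>_1.fastq.gz plus <run>_2.fastq.gz (paired-end).
remove_converted_sra() {
    sra_dir=${1:-.}
    fastq_dir=${2:-.}
    find "$sra_dir" -name '*.sra' | while IFS= read -r sra; do
        run=$(basename "$sra" .sra)
        if [ -s "$fastq_dir/$run.fastq.gz" ] ||
           { [ -s "$fastq_dir/${run}_1.fastq.gz" ] && [ -s "$fastq_dir/${run}_2.fastq.gz" ]; }; then
            rm -- "$sra"
        else
            # Leave the file in place and report it, so it can be re-converted.
            echo "keeping $sra: no fastq.gz output found" >&2
        fi
    done
}
```

Run it only after all of the bsub jobs have finished, for example `remove_converted_sra output_directory .` if the fastq.gz files were written to the current directory.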