19 | | '''Index the reference genome for First Pass.''' |
20 | | |
21 | | |
22 | | |
23 | | Create a folder named "FirstPass" before running these commands. |
24 | | |
25 | | To generate genome index files for STAR: |
26 | | {{{ |
27 | | |
28 | | bsub STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirFirstPass --genomeFastaFiles /path/to/genome/fasta --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN 8 |
29 | | }}} |
30 | | |
31 | | |
32 | | To map your reads: |
33 | | '''Run this command within the FirstPass directory''' |
34 | | {{{ |
35 | | bsub STAR --genomeDir /path/to/GenomeDirFirstPass --readFilesIn /path/to/Reads_1.fastq --outFileNamePrefix whateverPrefix --runThreadN 8 |
36 | | }}} |
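The two first-pass commands share several paths, so it can help to keep those in variables. The following is a minimal dry-run sketch, not part of the original procedure: it only prints the commands and submits nothing. The variable and function names are hypothetical, and every path is the same placeholder used above.

```shell
# Dry-run sketch: builds and prints the first-pass commands; nothing is submitted.
# All paths are the placeholders used above; replace them with real locations.
set -eu

GENOME_DIR="/path/to/GenomeDirFirstPass"
GENOME_FASTA="/path/to/genome/fasta"
GTF="/path/to/GTF/FileName.gtf"
READS="/path/to/Reads_1.fastq"
PREFIX="whateverPrefix"
THREADS=8

# Index generation, with the same flags as the command above.
first_pass_index_cmd() {
  echo "bsub STAR --runMode genomeGenerate --genomeDir $GENOME_DIR" \
    "--genomeFastaFiles $GENOME_FASTA --sjdbGTFfile $GTF" \
    "--sjdbOverhang 100 --runThreadN $THREADS"
}

# Mapping, to be submitted from inside the FirstPass directory
# once the index job has finished.
first_pass_map_cmd() {
  echo "bsub STAR --genomeDir $GENOME_DIR --readFilesIn $READS" \
    "--outFileNamePrefix $PREFIX --runThreadN $THREADS"
}

first_pass_index_cmd
first_pass_map_cmd
```

Printing the commands first makes it easy to check the paths before anything is sent to the queue. Because bsub returns as soon as the job is queued, the mapping command should only be submitted after the index job has completed.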
37 | | |
38 | | '''Index the reference genome for Second Pass.''' |
39 | | Create a folder named "SecondPass" before running these commands. |
40 | | |
41 | | To generate the 2nd pass genome index files for STAR: |
42 | | {{{ |
43 | | |
44 | | bsub STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirSecondPass --genomeFastaFiles /path/to/genome/fasta --sjdbFileChrStartEnd /path/to/first/pass/directory/SJ.out.tab --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN 8 |
45 | | }}} |
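STAR prepends the --outFileNamePrefix value to its output file names, so a first pass run with the prefix "whateverPrefix" writes whateverPrefixSJ.out.tab rather than SJ.out.tab. The dry-run sketch below shows one way to derive that path and pass it to the second-pass index command. It assumes the mapping was run inside the first-pass directory and that the prefix is a plain file-name prefix with no directory part. The variable and function names are hypothetical, and nothing is submitted.

```shell
# Dry-run sketch: prints the second-pass index command; nothing is submitted.
# All paths are the placeholders used above; replace them with real locations.
set -eu

FIRST_PASS_DIR="/path/to/first/pass/directory"
PREFIX="whateverPrefix"
GENOME_DIR="/path/to/GenomeDirSecondPass"
GENOME_FASTA="/path/to/genome/fasta"
GTF="/path/to/GTF/FileName.gtf"
THREADS=8

# Assumption: the first-pass mapping ran inside FIRST_PASS_DIR and PREFIX has
# no directory part, so the junction file is <dir>/<prefix>SJ.out.tab.
first_pass_sj_file() {
  echo "$FIRST_PASS_DIR/${PREFIX}SJ.out.tab"
}

# Second-pass index generation, with the same flags as the command above.
second_pass_index_cmd() {
  echo "bsub STAR --runMode genomeGenerate --genomeDir $GENOME_DIR" \
    "--genomeFastaFiles $GENOME_FASTA" \
    "--sjdbFileChrStartEnd $(first_pass_sj_file)" \
    "--sjdbGTFfile $GTF --sjdbOverhang 100 --runThreadN $THREADS"
}

second_pass_index_cmd
```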
46 | | |
47 | | To map your reads: |
48 | | '''Run this command within the SecondPass directory''' |
49 | | {{{ |
50 | | # Input format: FASTQ; output format: SAM |
51 | | bsub STAR --genomeDir /path/to/GenomeDirSecondPass --readFilesIn /path/to/Reads_1.fastq --outFileNamePrefix whateverPrefix --runThreadN 8 |
52 | | }}} |
53 | | |
54 | | |
55 | | The parameters included in the above sample commands are: |
56 | | * '''--sjdbOverhang <n>''' Specifies the length of the genomic sequence around each annotated junction that is used to construct the splice junction database. For short reads (<50 bases), use readLength - 1; otherwise, a generic value of 100 works as well (see the STAR manual for more information). |
57 | | * '''--sjdbGTFfile <GTF_file.gtf>''' Supplies STAR with a GTF annotation file during the genomeGenerate step. Combined with the --sjdbScore <n> option during mapping, this biases the alignment toward annotated junctions and reduces alignment to pseudogenes. |
58 | | * '''--runMode <alignReads, genomeGenerate>''' "alignReads" does the actual mapping. "genomeGenerate" generates the genomeDir required for mapping (default = alignReads). |
59 | | * '''--genomeDir </path/to/GenomeDir>''' Specifies the path to the directory used for storing the genome information created in the genomeGenerate step. |
60 | | * '''--genomeFastaFiles <genome FASTA files>''' Specifies genome FASTA files to be used. |
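61 | | * '''--sjdbFileChrStartEnd <output from first pass>''' Specifies the path to the file with the genomic coordinates of introns, i.e. the SJ.out.tab file written by the first-pass mapping (STAR prepends the --outFileNamePrefix value to this file name). |
62 | | * '''--readFilesIn <Reads_1.fastq Reads_2.fastq>''' Specifies the FASTQ files containing the reads: one file for single-end reads, two for paired-end reads. |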
63 | | * '''--runThreadN <n> ''' Specifies the number of threads to use. |
64 | | |
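The --sjdbOverhang rule described above (readLength - 1 for reads shorter than 50 bases, otherwise 100) can be sketched as a small helper. This is an illustration only: the function names are hypothetical, and it assumes an uncompressed FASTQ file in which all reads have the same length as the first one.

```shell
# Length of the first read in an uncompressed FASTQ file
# (line 2 of the file holds the first sequence).
first_read_length() {
  awk 'NR == 2 { print length($0); exit }' "$1"
}

# Overhang rule from the parameter list: readLength - 1 for short
# reads (<50 bases), otherwise the generic value of 100.
sjdb_overhang() {
  if [ "$1" -lt 50 ]; then
    echo $(( $1 - 1 ))
  else
    echo 100
  fi
}
```

For example, `sjdb_overhang "$(first_read_length /path/to/Reads_1.fastq)"` prints the value to pass to --sjdbOverhang when generating the genome index.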
| 20 | For the STAR 2-pass mapping procedure, please see our mapping SOP, under STAR. |