| 19 | | '''Index the reference genome for First Pass.''' |
| 20 | | |
| 21 | | |
| 22 | | |
| 23 | | Create a folder named "FirstPass" before running these commands. |
| 24 | | |
| 25 | | To generate genome index files for STAR: |
| 26 | | {{{ |
| 27 | | |
| 28 | | bsub STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirFirstPass --genomeFastaFiles /path/to/genome/fasta --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN 8 |
| 29 | | }}} |
| 30 | | |
| 31 | | |
| 32 | | To map your reads: |
| 33 | | '''Run this command within the FirstPass directory''' |
| 34 | | {{{ |
| 35 | | bsub STAR --genomeDir /path/to/GenomeDirFirstPass --readFilesIn /path/to/Reads_1.fastq --outFileNamePrefix whateverPrefix --runThreadN 8 |
| 36 | | }}} |
| 37 | | |
| 38 | | '''Index the reference genome for Second Pass.''' |
| 39 | | Create a folder named "SecondPass" before running these commands. |
| 40 | | |
| 41 | | To generate the 2nd pass genome index files for STAR: |
| 42 | | {{{ |
| 43 | | |
| 44 | | bsub STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirSecondPass --genomeFastaFiles /path/to/genome/fasta --sjdbFileChrStartEnd /path/to/first/pass/directory/SJ.out.tab --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN 8 |
| 45 | | }}} |
| 46 | | |
| 47 | | To map your reads: |
| 48 | | '''Run this command within the SecondPass directory''' |
| 49 | | {{{ |
| 50 | | # Input format: FASTQ; output format: SAM |
| 51 | | bsub STAR --genomeDir /path/to/GenomeDirSecondPass --readFilesIn /path/to/Reads_1.fastq --outFileNamePrefix whateverPrefix --runThreadN 8 |
| 52 | | }}} |
| 53 | | |
| 54 | | |
| 55 | | The parameters included in the above sample commands are: |
| 56 | | * '''--sjdbOverhang <n>''' Specifies the length of the genomic sequence around each annotated junction used to construct the splice junction database. For short reads (<50 bases), use readLength - 1; otherwise the generic value of 100 works well (see the STAR manual for details). |
| 57 | | * '''--sjdbGTFfile <GTF_file.gtf>''' Supplies STAR with a GTF annotation file during the genomeGenerate step. Combined with the --sjdbScore <n> option during mapping, this biases alignment toward annotated junctions and reduces alignment to pseudogenes. |
| 58 | | * '''--runMode <alignReads, genomeGenerate>''' "alignReads" does the actual mapping. "genomeGenerate" generates the genomeDir required for mapping (default = alignReads). |
| 59 | | * '''--genomeDir </path/to/GenomeDir>''' Specifies the path to the directory used for storing the genome information created in the genomeGenerate step. |
| 60 | | * '''--genomeFastaFiles <genome FASTA files>''' Specifies genome FASTA files to be used. |
| 61 | | * '''--sjdbFileChrStartEnd <output from first pass>''' Specifies the path to the file with the genomic coordinates of introns (here, the SJ.out.tab file written by the first pass). |
| 62 | | * '''--readFilesIn <Reads_1.fastq Reads_2.fastq>''' Specifies the FASTQ files containing the reads: one file for single-end data, two for paired-end data. |
| 63 | | * '''--runThreadN <n> ''' Specifies the number of threads to use. |
| 64 | | |
| | 20 | For the STAR 2-pass mapping procedure, please see our mapping SOP, under STAR. |
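The two-pass commands above can be sketched as a single dry-run script that prints each STAR command instead of submitting it. This is only an illustration under stated assumptions: the helper names (`sjdb_overhang`, `index_cmd`, `map_cmd`) and the `READ_LENGTH` value are hypothetical, every path is the same placeholder used in the commands above, and the `bsub` prefix is omitted so that nothing is submitted to the cluster.

```shell
#!/bin/sh
# Dry run: prints the STAR commands for both passes without executing them.
set -eu

# Placeholder paths copied from the commands above; replace with real ones.
GENOME_FASTA=/path/to/genome/fasta
GTF=/path/to/GTF/FileName.gtf
READS=/path/to/Reads_1.fastq
THREADS=8
READ_LENGTH=100   # hypothetical read length, for illustration only

# Rule from the parameter notes: readLength - 1 for short reads (<50), else 100.
sjdb_overhang() {
    if [ "$1" -lt 50 ]; then
        echo $(( $1 - 1 ))
    else
        echo 100
    fi
}
OVERHANG=$(sjdb_overhang "$READ_LENGTH")

# Print the genomeGenerate command.
# $1 = genome directory, $2 = optional junction file from the first pass.
index_cmd() {
    sj_opt=""
    if [ -n "${2:-}" ]; then
        sj_opt=" --sjdbFileChrStartEnd $2"
    fi
    echo "STAR --runMode genomeGenerate --genomeDir $1 --genomeFastaFiles ${GENOME_FASTA}${sj_opt} --sjdbGTFfile $GTF --sjdbOverhang $OVERHANG --runThreadN $THREADS"
}

# Print the mapping command. $1 = genome directory, $2 = output prefix.
map_cmd() {
    echo "STAR --genomeDir $1 --readFilesIn $READS --outFileNamePrefix $2 --runThreadN $THREADS"
}

# First pass: index with annotation only, then map.
index_cmd /path/to/GenomeDirFirstPass
map_cmd /path/to/GenomeDirFirstPass whateverPrefix

# Second pass: re-index with the first-pass junctions, then map again.
index_cmd /path/to/GenomeDirSecondPass /path/to/first/pass/directory/SJ.out.tab
map_cmd /path/to/GenomeDirSecondPass whateverPrefix
```

Once the printed commands look right, each one can be prefixed with `bsub` and run from the matching FirstPass or SecondPass directory, as in the commands above.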