
Version 2 (modified by twhitfie, 4 years ago)

Using RNA-seq to quantify expression levels and assay for differential expression of transposable elements

Background

  • Transposable elements make up between 20% and 80% of the genome sequence in many eukaryotes, yet they are typically excluded from the analysis that follows transcriptomic profiling with RNA-seq. They are excluded because transposons are repetitive, so reads derived from them often map to multiple locations and cannot be assigned unambiguously.

Step by step analysis

  • Mapping
    • Use STAR or another spliced mapper to map short reads to the genome of choice.
    • See our mapping SOP for more details.
  • Quantification of raw counts
    • Is your sequencing library stranded or unstranded? Counting tools such as featureCounts need this information to assign reads to features accurately. Strandedness of some library prep methods:
      • TruSeq Stranded mRNA Kits ("TruSeqStrandedPolyA"): reads are reverse stranded (stranded in the reverse direction relative to the transcript orientation).
      • SMART-Seq v4 Ultra Low Input RNA Kit ("SMARTerUltra-lowPOLYA-V4"): reads are unstranded.
      • KAPA RNA HyperPrep Kits ("KAPAHyperPrepmRNA"): reads are reverse stranded.
      • The Whitehead Genome Core has some more Library Prep Descriptions.
    • See SAMBAMqc (and/or look at mapped reads in a genome browser) to determine or verify strandedness.
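As a rough illustration, the library-prep notes above can be condensed into a small lookup that returns the value for the featureCounts -s option (0 = unstranded, 1 = forward stranded, 2 = reverse stranded). The function name strand_flag is hypothetical, and the table covers only the three kits listed here; any other library should be checked with SAMBAMqc or a genome browser rather than guessed.

```shell
# Hypothetical helper (not part of featureCounts): map a library-prep label
# to the featureCounts -s value. 0 = unstranded, 1 = forward, 2 = reverse.
strand_flag() {
    case "$1" in
        TruSeqStrandedPolyA)      echo 2 ;;  # reverse stranded
        SMARTerUltra-lowPOLYA-V4) echo 0 ;;  # unstranded
        KAPAHyperPrepmRNA)        echo 2 ;;  # reverse stranded
        *)
            # Unknown kit: refuse to guess, so the caller verifies strandedness
            echo "Unknown library prep '$1'; verify strandedness first" >&2
            return 1
            ;;
    esac
}

# Example usage (not run here):
#   featureCounts -s "$(strand_flag TruSeqStrandedPolyA)" \
#       -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
```

Returning a non-zero status for unlisted kits is a deliberate choice in this sketch: silently defaulting to unstranded could miscount a stranded library.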
# single-end reads (unstranded)
featureCounts -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
# single-end reads (forward stranded)
featureCounts -s 1 -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
# single-end reads (reverse stranded)
featureCounts -s 2 -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam

# paired-end reads: with Subread 2.0.2 or later, also add --countReadPairs so that fragments (read pairs) are counted rather than individual reads
# paired-end reads (unstranded)
featureCounts -p -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
# paired-end reads (forward stranded)
featureCounts -p -s 1 -a gene_annotations.gtf -o MySamples.featureCounts.txt *sortedByName.bam
# paired-end reads (reverse stranded)
featureCounts -p -s 2 -a gene_annotations.gtf -o MySamples.featureCounts.txt *sortedByName.bam
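The featureCounts output file starts with a "#" comment line recording the command, followed by a header row. Columns 1 to 6 hold annotation (Geneid, Chr, Start, End, Strand, Length), and there is one count column per input BAM from column 7 onward. A minimal sketch for reducing that file to a plain count matrix might look like the following; the function name extract_counts is hypothetical, and the sketch assumes the standard tab-delimited layout just described.

```shell
# Hypothetical helper: reduce featureCounts output to "Geneid<TAB>count(s)".
# Assumes a leading "#" comment line, then a header row, with annotation in
# columns 1-6 and one count column per BAM file from column 7 onward.
extract_counts() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { next }                  # skip the program/command comment line
         {
             line = $1                  # Geneid (or the header label)
             for (i = 7; i <= NF; i++)  # append every per-BAM count column
                 line = line OFS $i
             print line
         }' "$1"
}

# Example usage (not run here):
#   extract_counts MySample.featureCounts.txt > MySample.counts.txt
```

The header row is kept, so the resulting table retains the BAM file names as column labels.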