Changes between Version 48 and Version 49 of SOPs/rna-seq-diff-expressions


Timestamp:
06/20/17 12:16:35 (8 years ago)
Author:
gbell
Comment:

--

  • SOPs/rna-seq-diff-expressions

    v48 v49  
    80 80
    81 81    * Method 1: Use [http://cole-trapnell-lab.github.io/cufflinks/manual/ cufflinks]
    82       * This is the traditional method.
    83 
    84       * To quantify transcripts and genes in a GTF file, use a command like
     82        * This is the traditional method.
     83        * To quantify transcripts and genes in a GTF file, use a command like
    85 84 {{{
    86 85 bsub cufflinks -G gene_models.gtf accepted_hits.bam
    87 86 }}}
    88 
    89       * To quantify transcripts, potentially novel, annotated by cufflinks, use a command like
     87        * To quantify transcripts, potentially novel, annotated by cufflinks, use a command like
    90 88 {{{
    91 89 bsub cufflinks accepted_hits.bam
    92 90 }}}
    93 
    94       * Gene-level FPKM values are calculated by taking the sum of all transcript FPKMs for a gene.  As a result, no "gene length" needs to be calculated.
    95       * NOTE: Some genes, although present in a GTF annotation file, may not get quantified by cufflinks.  This occurs for genes found in very long regions of overlapping genes (which exceed the default value for --max-bundle-length).  When this occurs, the standard error output of cufflinks (contained in the long LSF email when cufflinks is run via 'bsub') will contain the message "Warning: Skipping large bundle."  To correct this (or prevent it in the first place), add an argument like '--max-bundle-length 10000000' to the cufflinks command. You may want to compare the list of genes in the GTF file to that of the cufflinks output to verify that they match.
    96       * If you only want to quantify genes in your GTF file, use the -G option (instead of -g, which also reports transcripts found by Cufflinks and takes counts away from the transcripts in your GTF file).
     91        * Gene-level FPKM values are calculated by taking the sum of all transcript FPKMs for a gene.  As a result, no "gene length" needs to be calculated.
     92        * NOTE: Some genes, although present in a GTF annotation file, may not get quantified by cufflinks.  This occurs for genes found in very long regions of overlapping genes (which exceed the default value for --max-bundle-length).  When this occurs, the standard error output of cufflinks (contained in the long LSF email when cufflinks is run via 'bsub') will contain the message "Warning: Skipping large bundle."  To correct this (or prevent it in the first place), add an argument like '--max-bundle-length 10000000' to the cufflinks command. You may want to compare the list of genes in the GTF file to that of the cufflinks output to verify that they match.
     93        * If you only want to quantify genes in your GTF file, use the -G option (instead of -g, which also reports transcripts found by Cufflinks and takes counts away from the transcripts in your GTF file).
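To illustrate the note above that a gene-level FPKM is the sum of its transcript FPKMs, here is a minimal sketch using awk on a toy table. The file names (isoforms_example.txt, gene_fpkm_sums.txt) and the three transcripts are hypothetical. The column layout (gene_id in column 4, FPKM in column 10) is an assumption modeled on cufflinks' isoforms.fpkm_tracking output, so check the header of your own file before reusing it. Cufflinks already reports gene-level values in genes.fpkm_tracking, so this is only a way to see or spot-check the arithmetic.

```shell
# Build a hypothetical, tab-delimited miniature of isoforms.fpkm_tracking.
# Assumed layout: gene_id in column 4, FPKM in column 10.
printf 'tracking_id\tclass_code\tnearest_ref_id\tgene_id\tgene_short_name\ttss_id\tlocus\tlength\tcoverage\tFPKM\n' > isoforms_example.txt
printf 'tx1\t-\t-\tgeneA\tA\t-\tchr1:1-100\t100\t-\t2.5\n' >> isoforms_example.txt
printf 'tx2\t-\t-\tgeneA\tA\t-\tchr1:1-100\t80\t-\t1.5\n' >> isoforms_example.txt
printf 'tx3\t-\t-\tgeneB\tB\t-\tchr2:1-50\t50\t-\t7\n' >> isoforms_example.txt

# Sum the transcript FPKMs per gene_id, skipping the header line.
awk -F"\t" 'NR > 1 { sum[$4] += $10 } END { for (g in sum) print g "\t" sum[g] }' isoforms_example.txt | sort > gene_fpkm_sums.txt

cat gene_fpkm_sums.txt
```

For the toy input, geneA gets 2.5 + 1.5 = 4 and geneB gets 7. No transcript or gene length enters the calculation.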
    97 94 {{{
    98 95            awk -F"\t" '{print $9}' genes.gtf | awk '{print $2}' | perl -pe 's/\"//g;s/;//g' | sort -u > gtf_genes.txt