STRT-seq reveals the diversity of brain cell types (PMID: 25700174)

Paper: Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science. (2015)
Organism: Mus musculus
Cell types: somatosensory cortex and hippocampal CA1 region
Sample number: 3,005
Library preparation: STRT-seq
DATA: GSE60361

  1. The STRT read-processing method is described in detail here
    a) Remove low-quality bases from the 3′ end, b) Extract the barcode from the 5′ end, c) Discard reads with <25 bases after polyA trimming, d) Discard reads whose UMI has a Phred score <17, e) Remove at most 9 Gs from the 5′ end (template switching), f) Discard reads with <6 non-A bases, or a dinucleotide repeat with <6 other bases
  2. Bowtie (allowing 3 mismatches and 24 alternative mappings) / Reference = genome + ERCC sequences / unaligned reads are remapped to known transcripts
  3. To annotate 5′ ends, the known 5′ ends are extended by 100 bases, but not beyond the 3′ end of another gene with the same orientation
  4. Uniquely mapped reads are counted per UMI; multireads with more than one repeat mapping are assigned randomly to one of them (30% of all are detected as singletons)
  5. UMI with <1/100 of average of the nonzero UMIs are excluded
  6. Raw UMI count is corrected for the UMI collision probability as described here
  7. Detect noisy genes by comparing CVs from samples and ERCC spike-ins, etc.
  8. Select genes for clustering: i) remove all genes with fewer than 25 molecules in total, ii) calculate the gene-gene correlation matrix and define a threshold at the 90th percentile; remove genes that have fewer than 5 other genes satisfying this threshold, iii) calculate mean and CV, fit a simple model log2(CV)=log2(mean^a+k), rank all genes by distance from the fitted line and use only the top 5000 genes
  9. BackSPIN = iterative and automated version of SPIN, used as a two-way unsupervised clustering method
  10. The 9 major clusters are visualized by tSNE
  11. t-test of each gene between every possible pair of groups, to exclude genes that are likely to be more specific to another group
  12. Standard hierarchical clustering with correlation distance and Ward’s linkage gives results similar to BackSPIN
  13. Compare clustering result with affinity propagation
  14. To detect residual variance after clustering, i) compute the variance explained by each PC, ii) apply the broken-stick criterion
  15. By BackSPIN, 9 groups are identified
  16. Identify the most specific markers for each class (most of them play a functional role)
  17. 47 molecularly distinct subclasses are identified by repeated biclustering
  18. Two types of immune cells are identified: microglia and perivascular macrophages
  19. Findings from RNA-seq are confirmed by smFISH and immunohistochemistry
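
Two of the numerical steps above (6 and 14) are compact enough to sketch in code. The following is a minimal illustration, not the authors' implementation. For step 6 it assumes the standard occupancy model, in which each molecule draws one of `n_umis` labels uniformly at random; the default of 4096 (a 6-base UMI) is an assumption and should be replaced by the actual UMI space. For step 14 it assumes the usual broken-stick rule: keep leading PCs while their share of variance exceeds the share expected for a stick broken at random into the same number of pieces. All function names are hypothetical.

```python
import math


def correct_umi_collisions(observed, n_umis=4096):
    """Estimate the number of molecules from the number of distinct UMIs seen.

    Assumes molecules pick UMIs uniformly at random (occupancy model);
    n_umis=4096 corresponds to a 6-base UMI and is an assumption here.
    """
    if not 0 <= observed < n_umis:
        raise ValueError("observed must satisfy 0 <= observed < n_umis")
    # Invert E[distinct] = K * (1 - (1 - 1/K)^m) for the molecule count m.
    return math.log1p(-observed / n_umis) / math.log1p(-1.0 / n_umis)


def broken_stick_expectations(p):
    """Expected variance share of the k-th largest of p pieces, k = 1..p."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def n_significant_components(variances):
    """Count leading PCs whose variance share exceeds the broken-stick share."""
    total = sum(variances)
    expected = broken_stick_expectations(len(variances))
    count = 0
    for var, exp in zip(sorted(variances, reverse=True), expected):
        if var / total <= exp:
            break  # stop at the first PC that does not beat the null share
        count += 1
    return count


if __name__ == "__main__":
    # Collisions matter little at low counts and increasingly at high counts.
    for n in (10, 1000, 3000):
        print(n, round(correct_umi_collisions(n), 1))
    # Toy eigenvalue spectrum (made-up numbers, for illustration only).
    print(n_significant_components([5.5, 3.5, 0.6, 0.4]))
```

In this reading, a cluster whose residual PCs all fall below the broken-stick share would be treated as having no remaining structure worth splitting, which is how step 14 is described in these notes.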