Single-cell RNA-seq on murine Th17 pathogenicity (PMID: 26607794)

Paper: Single-Cell Genomics Unveils Critical Regulators of Th17 Cell Pathogenicity. Cell. (2015)
Organism: Mus musculus
Cell types: CD4 T-cell
Sample number: 768
Library preparation: Smart-seq
DATA: GSE74833

  1. Th17 cells are harvested from the EAE model by sorting CD3+CD4+IL-17A/GFP+ cells
  2. In vitro Th17 cells are harvested 48 hr after activation with TGFb1+IL-6 or IL-1b+IL-6+IL-23
  3. 100PE/1-2M reads/sample, TopHat / Cufflinks / Log2(1+FPKM)
  4. QC: i)number of reads, ii)number of aligned, iii)ratio of aligned, iv)ratio of identified transcripts vs all identified, v)ratio of duplicates, vi)primer contamination, vii)mean insert size, viii)SD insert size, ix)complexity, x)ratio of ribosomal, xi)ratio of coding, xii)ratio of UTR, xiii)ratio of intronic, xiv)ratio of intergenic, xv)ratio of mRNA, xvi)CV of coverage, xvii)mean 5′ bias, xviii)mean 3′ bias, xix)mean ratio of 5′ to 3′
  5. To exclude poor samples, use the cutoff max{mean(x) - 1.645*SD(x), median(x) - 1.645*MAD(x)} for number of aligned reads, ratio of aligned reads, or ratio of identified transcripts (for the latter two, a Gaussian mixture model is also used)
  6. Hard lower bounds: aligned reads > 25,000, ratio of aligned reads > 20%, ratio of identified transcripts > 20%
  7. Normalization by QC i)-xix) using Risso et al., 2011 (global-scaling normalization to remove the effects of the top PCs until covering >90% of variance in QCs)
  8. Main PCs highly correlate with library quality scores
  9. Batch-effect reduction by ComBat
  10. For false negatives: an extended version of the probabilistic weighting strategy (Shalek et al. 2014) is used to down-weight the contribution of less reliably measured transcripts
  11. 7000 expressed genes with FPKM>10 in 20% of cells in vitro, 4000 in vivo
  12. Correlations between single cells in the same condition are 0.45-0.75
  13. RNA expression is confirmed by RNA Flow-FISH (QuantiGene® ViewRNA ISH Cell Assay kit from Affymetrix, ImageStream X MkII)
  14. Compare PCA from normalized data with known T cells states
  15. Identify TFs that may contribute to heterogeneity, detect factors whose targets are strongly enriched in genes that correlated with each PC (PC1 positively correlates with a signature of memory T-cells etc)
  16. Define Voronoi regions in the PCA space, identify genes characterizing each group, and assign a new label to each
  17. self-renewing-like state in LN -> pre-Th1 effector-like phenotype in the LN and CNS -> Th1-like effector state and a Th1-like memory state in the CNS -> less functional state in the CNS
  18. In-vitro-differentiated Th17 cells vary strongly in a key pathogenicity signature
  19. The signature from IL-23R / cells differentiated with IL-1b+IL-6+IL-23 correlates highly with the more pathogenic cells, confirming the role of the IL-23 pathway in pathogenicity
  20. Cells derived in the non-pathogenic conditions are similar to the Th17 self-renewing-like signature (p < 10^-10, KS test), whereas those derived in pathogenic conditions resemble the Th17/Th1-like memory phenotype identified in the CNS (p < 10^-3)
  21. In the TGFb1+IL-6 in vitro condition, 35% of the detected genes are expressed in >90% of the cells with a unimodal distribution; these include housekeeping genes
  22. Bimodally expressed genes with high expression in >20% cells include cytokines and receptors (figure x-Expression range vs y-Genes)
  23. Expression of key immune genes varies more than that of other transcripts
  24. Calculate significant co-variation (Spearman with FDR < 0.05) between bimodally expressed transcripts (expressed by less than 90% of cells), e.g. the modules [IL17A,CCL20] and [IL10,IL24,IL9]
  25. Co-variation modules identify novel putative regulators, some of which don’t appear in bulk samples
  26. Select 4 genes, Gpr65, Toso, Plzp and Cd5l, based on the ranking scheme and availability of KO mice. Confirm that Th17 differentiation is impaired without these 4 genes.
  27. I will add the scheme of ranking analysis in detail later.
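
The sample-exclusion rule in steps 5 and 6 can be sketched as follows. This is a minimal illustration, not the authors' code: the function and key names are hypothetical, the population SD and an unscaled MAD are assumed, the adaptive cutoff and the hard bound are assumed to be combined by taking the larger of the two, and the Gaussian mixture model step is left out.

```python
import statistics

Z = 1.645  # multiplier quoted in step 5

# Hard lower bounds from step 6 (key names are hypothetical).
HARD_MIN = {"aligned_reads": 25_000, "aligned_ratio": 0.20, "transcript_ratio": 0.20}


def adaptive_cutoff(values):
    """max(mean - Z*SD, median - Z*MAD) for one QC metric."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population SD
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])  # assumption: unscaled MAD
    return max(mean - Z * sd, med - Z * mad)


def passing_cells(cells):
    """cells maps a cell name to its QC metrics; returns the names that pass."""
    cutoffs = {}
    for metric, floor in HARD_MIN.items():
        values = [m[metric] for m in cells.values()]
        # Assumption: a cell must clear both the adaptive cutoff and the hard bound.
        cutoffs[metric] = max(adaptive_cutoff(values), floor)
    return sorted(
        name for name, m in cells.items()
        if all(m[k] > cutoffs[k] for k in HARD_MIN)
    )


# Made-up QC values for five cells; c5 is a deliberately poor library.
toy_cells = {
    "c1": {"aligned_reads": 1_000_000, "aligned_ratio": 0.80, "transcript_ratio": 0.60},
    "c2": {"aligned_reads": 1_100_000, "aligned_ratio": 0.82, "transcript_ratio": 0.62},
    "c3": {"aligned_reads": 900_000, "aligned_ratio": 0.78, "transcript_ratio": 0.58},
    "c4": {"aligned_reads": 1_050_000, "aligned_ratio": 0.81, "transcript_ratio": 0.61},
    "c5": {"aligned_reads": 10_000, "aligned_ratio": 0.10, "transcript_ratio": 0.05},
}
print(passing_cells(toy_cells))
```

Taking the maximum of the mean-based and median-based cutoffs keeps the threshold from being dragged down by a few very poor libraries, since the median and MAD are barely affected by outliers.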

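The Log2(1+FPKM) transform in step 3 and the expressed-gene filter in step 11 can be sketched in the same spirit. The names below are hypothetical, the FPKM values are made up, and reading "in 20% of cells" as "at least 20%" is an assumption.

```python
import math


def log_fpkm(fpkm):
    """Log2(1 + FPKM), the expression scale named in step 3."""
    return math.log2(1.0 + fpkm)


def expressed_genes(fpkm_by_gene, min_fpkm=10.0, min_fraction=0.20):
    """Genes with FPKM > min_fpkm in at least min_fraction of cells."""
    kept = []
    for gene, values in fpkm_by_gene.items():
        n_expressing = sum(1 for v in values if v > min_fpkm)
        if values and n_expressing / len(values) >= min_fraction:
            kept.append(gene)
    return sorted(kept)


# Made-up FPKM values for five cells per gene.
toy_fpkm = {
    "Il17a": [0.0, 50.0, 0.0, 0.0, 120.0],       # on in 2 of 5 cells
    "Actb": [500.0, 480.0, 510.0, 495.0, 505.0],  # on in every cell
    "GeneX": [0.0, 5.0, 9.0, 0.0, 3.0],           # never above the threshold
}
print(expressed_genes(toy_fpkm))
print([round(log_fpkm(v), 2) for v in toy_fpkm["Il17a"]])
```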