Y叔 2018-06-01

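First, here is a funny video of a street vendor hawking medicinal plasters. Have a laugh before reading on, because what I am about to do is peddle the Y叔 brand of "plaster"!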

https://v.qq.com/txp/iframe/player.html?vid=i05431hkegx&width=500&height=375&auto=0

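People like clusterProfiler. Beyond it being powerful and widely supported, I think another reason must be its visualization; after all, we are all visual creatures, and good looks win. However, that was all early code of mine, and I had long wanted to rewrite it, implementing everything with ggplot2 so that it would be easier to maintain, better looking, and more powerful. After finishing my PhD I finally found the time to rewrite it, adding some new plot types along the way. I repackaged this code as enrichplot, which is now on Bioconductor. With this package, you will be even more hooked on the clusterProfiler family of packages, and other tools will pale in comparison.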

The enrichplot package implements several methods for visualizing enrichment results to help interpretation. It supports results from both the hypergeometric test and gene set enrichment analysis (GSEA), both of which are widely used to characterize pathway/function relationships and to elucidate molecular mechanisms from high-throughput genomic data.
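To make the hypergeometric test mentioned above concrete, here is a minimal base R sketch of an over-representation p-value for a single term. The counts are made up for illustration and are not taken from any real annotation; this is not the code the packages themselves run.

```r
# Toy counts, chosen only for illustration (assumptions, not real data).
N <- 20000  # genes in the background
M <- 200    # background genes annotated to the term
n <- 500    # genes in the list of interest
k <- 15     # genes of interest annotated to the term

# Over-representation p-value: probability of seeing k or more annotated
# genes when drawing n genes from the background without replacement.
p_over <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# The same quantity from a one-sided Fisher's exact test on the 2x2 table.
tab <- matrix(c(k, n - k, M - k, N - M - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value
```

With these counts about 5 annotated genes would be expected by chance, so observing 15 gives a small p-value.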

The enrichplot package supports visualizing enrichment results obtained from DOSE (Yu et al. 2015), clusterProfiler (Yu et al. 2012), ReactomePA (Yu and He 2016) and meshes.

Induced GO DAG graph

Gene Ontology (GO) is organized as a directed acyclic graph. An insightful way of looking at the results of the analysis is to investigate how the significant GO terms are distributed over the GO graph. The goplot function shows the subgraph induced by the most significant GO terms.

library(clusterProfiler)
data(geneList, package="DOSE")
de <- names(geneList)[abs(geneList) > 2]
ego <- enrichGO(de, OrgDb = "org.Hs.eg.db", ont="BP", readable=TRUE)
library(enrichplot)
goplot(ego)

Bar plot

The bar plot is the most widely used method to visualize enriched terms. It depicts the enrichment scores (e.g. p values) and gene count or ratio as bar height and color.

barplot(ego, showCategory=20)

Dot plot

The dot plot is similar to the bar plot, with the capability to encode another score as dot size. Both barplot and dotplot support faceting to visualize sub-ontologies simultaneously.

dotplot(ego, showCategory=30)

library(ggplot2)  # provides facet_grid()
go <- enrichGO(de, OrgDb = "org.Hs.eg.db", ont="all")
dotplot(go, split="ONTOLOGY") + facet_grid(ONTOLOGY~., scales="free")

Gene-Concept Network

Both the barplot and dotplot display only the most significant enriched terms, while users may want to know which genes are involved in these significant terms. The cnetplot depicts the linkages between genes and biological concepts (e.g. GO terms or KEGG pathways) as a network.

## remove redundant GO terms
ego2 <- simplify(ego)
cnetplot(ego2, foldChange=geneList)

cnetplot(ego2, foldChange=geneList, circular = TRUE, colorEdge = TRUE)

UpSet Plot

The upsetplot is an alternative to cnetplot for visualizing the complex associations between genes and gene sets. It emphasizes the gene overlap among different gene sets.

upsetplot(ego)

Heatmap-like functional classification

The heatplot is similar to cnetplot, but displays the relationships as a heatmap. The gene-concept network may become too complicated if the user wants to show a large number of significant terms. The heatplot can simplify the result and make expression patterns easier to identify.

heatplot(ego2)

heatplot(ego2, foldChange=geneList)
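The relationship between the network view and the heatmap view can be sketched in base R: both start from the same gene-to-term links, stored either as an edge list or as a 0/1 membership matrix. The gene sets below are hypothetical, and this only illustrates the data layout, not how cnetplot or heatplot are implemented.

```r
# Hypothetical gene sets (made-up names, for illustration only).
gene_sets <- list(
  termA = c("g1", "g2", "g3"),
  termB = c("g2", "g3", "g4"),
  termC = c("g5")
)

# Long format: one row per gene-term link (the edges of a gene-concept network).
edges <- stack(gene_sets)
names(edges) <- c("gene", "term")

# Wide format: term-by-gene 0/1 membership matrix (the cells of a heatmap).
membership <- unclass(table(edges$term, edges$gene))
```

Column sums of `membership` count how many terms each gene belongs to, which is the kind of overlap that becomes hard to read in a dense network.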

Enrichment Map

The enrichment map organizes enriched terms into a network with edges connecting overlapping gene sets. In this way, mutually overlapping gene sets tend to cluster together, making it easy to identify functional modules.

emapplot(ego2)
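As a rough illustration of "edges connecting overlapping gene sets", the base R sketch below scores each pair of hypothetical gene sets by Jaccard similarity and keeps the pairs that share at least one gene. Both the similarity measure and the cutoff are assumptions made for this example; the package may use a different measure or threshold.

```r
# Hypothetical gene sets (made-up names, for illustration only).
gene_sets <- list(
  termA = c("g1", "g2", "g3", "g4"),
  termB = c("g3", "g4", "g5"),
  termC = c("g6", "g7")
)

# Jaccard similarity: shared genes divided by all genes in either set.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Score every pair of terms.
pairs <- t(combn(names(gene_sets), 2))
edges <- data.frame(
  from = pairs[, 1],
  to = pairs[, 2],
  similarity = apply(pairs, 1, function(p) {
    jaccard(gene_sets[[p[1]]], gene_sets[[p[2]]])
  })
)

# Keep only pairs that share at least one gene (assumed cutoff).
edges <- edges[edges$similarity > 0, ]
```

Here only termA and termB end up connected, so termC would appear as an isolated node.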

Ridgeline plot for expression distribution of GSEA result

The ridgeplot visualizes the expression distributions of core enriched genes for GSEA enriched categories. It helps users interpret up/down-regulated pathways.

kk <- gseKEGG(geneList, nPerm=10000)
ridgeplot(kk)

Running score and preranked list of GSEA result

The running score and the preranked list are traditional methods for visualizing GSEA results. The enrichplot package supports both of them to visualize the distribution of the gene set and the enrichment score.

gseaplot(kk, geneSetID = 1, by = "runningScore", title = kk$Description[1])

gseaplot(kk, geneSetID = 1, by = "preranked", title = kk$Description[1])

gseaplot(kk, geneSetID = 1, title = kk$Description[1])
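For readers unfamiliar with the running score, the following base R sketch computes a weighted running-sum statistic of the kind commonly used in GSEA for one hypothetical gene set. The gene names, statistics, and the `running_score` helper are all made up for illustration; this is not the code that gseKEGG or gseaplot runs.

```r
# Hypothetical ranked statistics (e.g. fold changes) and gene set.
stats <- c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2, g6 = -3)
gene_set <- c("g1", "g2", "g5")

# Illustrative helper (an assumption, not a package function).
running_score <- function(stats, gene_set, exponent = 1) {
  stats <- sort(stats, decreasing = TRUE)
  hit <- names(stats) %in% gene_set
  # Hits step up in proportion to |statistic|^exponent; misses step down evenly.
  weights <- ifelse(hit, abs(stats)^exponent, 0)
  p_hit <- cumsum(weights) / sum(weights)
  p_miss <- cumsum(!hit) / sum(!hit)
  p_hit - p_miss
}

rs <- running_score(stats, gene_set)

# Enrichment score: the running score's largest deviation from zero.
es <- rs[which.max(abs(rs))]
```

The running score always returns to zero at the end of the ranked list; a peak near the top of the list, as in this toy example, indicates that the set's genes are concentrated among the most up-regulated genes.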

References

Yu, Guangchuang, and Qing-Yu He. 2016. “ReactomePA: An R/Bioconductor Package for Reactome Pathway Analysis and Visualization.” Molecular BioSystems 12 (2): 477–79. doi:10.1039/C5MB00663E.

Yu, Guangchuang, Li-Gen Wang, Yanyan Han, and Qing-Yu He. 2012. “clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters.” OMICS: A Journal of Integrative Biology 16 (5): 284–87. doi:10.1089/omi.2011.0118.

Yu, Guangchuang, Li-Gen Wang, Guang-Rong Yan, and Qing-Yu He. 2015. “DOSE: An R/Bioconductor Package for Disease Ontology Semantic and Enrichment Analysis.” Bioinformatics 31 (4): 608–9. doi:10.1093/bioinformatics/btu684.
