Universal concept signature analysis: Genome-wide quantification of new biological and pathological functions of genes and pathways
Xu Chi, Maureen A. Sartor, Sanghoon Lee, Meenakshi Anurag, Snehal Patil, Pelle Hall, Matthew Wexler, Xiaosong Wang. Universal concept signature analysis: Genome-wide quantification of new biological and pathological functions of genes and pathways. Briefings in Bioinformatics. 2019 October; 00(00). https://doi.org/10.1093/bib/bbz093
Identifying new gene functions and pathways underlying diseases and biological processes is a major challenge in genomics research. In particular, most methods for interpreting the pathways characteristic of an experimental gene list defined by genomic data are limited by their dependence on assessing overlapping genes or interactome topology, which cannot account for the variety of functional relations. This is especially problematic for pathway discovery from single-cell genomics data with low gene coverage, or for interpreting complex pathway changes such as those that occur during changes of cell state. Here, we exploited comprehensive sets of molecular concepts that combine ontologies, pathways, interactions and domains to help inform the functional relations. We first developed a universal concept signature (uniConSig) analysis for genome-wide quantification of new gene functions underlying biological or pathological processes, based on the signature molecular concepts computed from known functional gene lists. We then developed a novel concept signature enrichment analysis (CSEA) for deep functional assessment of the pathways enriched in an experimental gene list. This method is grounded in the framework of shared concept signatures between gene sets at multiple functional levels, thus overcoming the limitations of current methods. Through meta-analysis of transcriptomic data sets of cancer cell line models and single hematopoietic stem cells, we demonstrate the broad applications of CSEA to pathway discovery from gene expression and single-cell transcriptomic data sets for genetic perturbations and changes of cell state, which complements the current modalities.
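As a rough illustration of the idea of scoring genes by the molecular concepts they share with a known functional gene list, the toy sketch below may help. It is not the published uniConSig or CSEA algorithm: the Jaccard weighting, the averaging step, the function name `concept_signature_scores`, and the gene and concept names are all assumptions made up for this example.

```python
from collections import defaultdict


def concept_signature_scores(concepts, training_genes):
    """Toy concept-based gene scoring (illustrative assumption, not uniConSig).

    concepts: dict mapping a concept name to the set of genes it contains.
    training_genes: iterable of genes with a known function of interest.
    Returns a dict mapping each gene found in any concept to a score in [0, 1].
    """
    training = set(training_genes)

    # Weight each concept by its Jaccard similarity to the training gene list.
    weights = {}
    for name, members in concepts.items():
        union = members | training
        weights[name] = len(members & training) / len(union) if union else 0.0

    # Score each gene as the mean weight of the concepts it belongs to.
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, members in concepts.items():
        for gene in members:
            totals[gene] += weights[name]
            counts[gene] += 1
    return {gene: totals[gene] / counts[gene] for gene in totals}


# Hypothetical concepts and gene names, for illustration only.
example_concepts = {
    "concept_1": {"GENE_A", "GENE_B", "GENE_C"},
    "concept_2": {"GENE_C", "GENE_D"},
    "concept_3": {"GENE_E", "GENE_F"},
}
example_scores = concept_signature_scores(example_concepts, {"GENE_A", "GENE_B"})
```

In this toy data, GENE_C is not in the training list but shares a concept with it, so it scores above GENE_D and GENE_E, which share none. This is the sense in which a gene can be ranked without directly overlapping the known gene list.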