Tumors are heterogeneous mixtures of normal and cancerous cells with distinct genetic and transcriptional profiles. In this talk, I will present several computational approaches to quantify tumor heterogeneity and reconstruct tumor evolution using data from bulk, single-cell, and spatial sequencing technologies. For bulk and targeted single-cell DNA sequencing, I will describe methods that reconstruct tumor evolution using both somatic single-nucleotide mutations
Can AI, ML and Data Science help prevent children from getting lead poisoning? Can they help reduce police violence and misconduct? Can they increase vaccination rates? Can they help cities better prioritize limited resources to improve the lives of citizens and achieve equity? We’re all aware of the potential of ML and AI, but turning this potential into tangible social impact, and more importantly equitable social impact, is not straightforward.
Real-world data (RWD), meaning data generated by providers and payors during the delivery of patient care, are increasingly looked to as a source of truth for the development of practice-changing real-world evidence (RWE). Uses include the development of predictive models, observational cohort analyses and even individual-level data collection to support randomized clinical trials. RWE generated using RWD holds promise for being substantially more time and cos
Data scientists have been at the forefront of the response to the COVID-19 pandemic, and their work on numerous critical questions has been integral to finding solutions. Our team has responded on numerous fronts but has focused primarily on the design and analysis of clinical trials to establish optimal treatments for COVID-19 in the outpatient setting.
Since the publication of the FAIR principles in 2016, the scientific community has been awash with efforts to make its data findable, accessible, interoperable, and reusable. There is enormous peer pressure to assert that one’s online experimental data are already FAIR and that one’s data are more FAIR than those of the next person. Alas, most online data are not FAIR, and attempts to measure FAIRness in a systematic way have gone nowhere.
Nearly twenty methods have been developed to infer gene regulatory networks (GRNs) from single-cell RNA-seq data. An experimentalist seeking to analyze a new dataset faces a daunting task in selecting an appropriate inference method since there are no widely accepted ground-truth datasets for assessing algorithm accuracy and the criteria for evaluation and comparison of methods are varied.
Regulatory approval of a medical product considers both benefits and harms, which can be measured by multiple endpoints. The importance of these endpoints may vary among patients. Current approaches integrate multiple outcomes without reflecting the heterogeneity of patient preferences. In this paper, we propose a new composite desirability of outcome ranking (DOOR) to define a winning probability.
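As a rough illustration of what a DOOR-based winning probability measures, the sketch below computes the conventional pairwise estimate: the probability that a randomly chosen treated patient has a more desirable outcome than a randomly chosen control patient, plus half the probability of a tie. This is not the new composite DOOR proposed in the abstract. The function name, the convention that a lower rank is more desirable, and the sample ranks are all assumptions made for illustration.

```python
from itertools import product


def door_probability(treatment_ranks, control_ranks):
    """Pairwise DOOR probability: P(treatment better) + 0.5 * P(tie).

    Assumption: each patient's outcome is already summarized as an
    ordinal rank, where a LOWER rank is a MORE desirable outcome.
    """
    if not treatment_ranks or not control_ranks:
        raise ValueError("both arms need at least one patient")

    wins = 0
    ties = 0
    # Compare every treated patient with every control patient.
    for t, c in product(treatment_ranks, control_ranks):
        if t < c:
            wins += 1
        elif t == c:
            ties += 1

    total_pairs = len(treatment_ranks) * len(control_ranks)
    return (wins + 0.5 * ties) / total_pairs


# Hypothetical ranks, for illustration only (1 = best, 4 = worst).
treatment = [1, 2, 2, 3]
control = [2, 3, 3, 4]
print(door_probability(treatment, control))
```

A value of 0.5 indicates no difference between arms under this measure, and values above 0.5 favor treatment. A preference-weighted composite would presumably change how individual outcomes are ranked before this comparison step, but the abstract does not give those details.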
Long read sequencing is now well established for producing high quality reference genomes, including the first gap-free Telomere-to-Telomere assembly of a human genome. Thanks to substantial improvements in throughput, costs, and quality, long read sequencing is starting to be used for population-scale analysis of clinical genomes, especially to develop a detailed analysis of structural variants present. Here Dr.
Human microbiome research at population scale is now capable of achieving the molecular detail needed to integrate gut microbial community profiles with biochemical, environmental, and human regulatory and immunological responses.
Mobile element insertions (MEIs), including long interspersed element-1 (L1), Alu, and SVA (SINE-VNTR-Alu) retrotransposons, comprise approximately 46% of the human genome and have been shown to play an important role in human development and disease.