Population-scale analysis of structural variants

Michael Schatz, PhD
Johns Hopkins University

Long-read sequencing is now well established for producing high-quality reference genomes, including the first gap-free Telomere-to-Telomere assembly of a human genome. Thanks to substantial improvements in throughput, cost, and quality, long-read sequencing is starting to be used for population-scale analysis of clinical genomes, especially to characterize in detail the structural variants they carry. Here Dr. Schatz will discuss the bioinformatics software needed to take advantage of these data, focusing on the accurate identification and comparison of variants in tumor-normal samples, family pedigrees, and large populations. He will also discuss how he has used these approaches to improve the resolution of germline and somatic variants in cancer and other human genetic diseases. Finally, he will describe the future of genomic data analysis and sharing using the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-Space (AnVIL) platform.