Gregory Cooper, MD, PhD

Room 524
5607 Baum Boulevard
Pittsburgh, PA 15206

Research Interests

  • Application of decision theory, probability theory, Bayesian statistics, and artificial intelligence to biomedical informatics research problems 
  • Causal modeling and discovery from clinical and high-throughput molecular data
  • Computer-aided medical diagnosis and prediction
  • Machine-learning approaches to improving patient safety
  • Biosurveillance of disease outbreaks


Appointments and Positions

  • Professor of Biomedical Informatics
  • Secondary faculty appointments in Intelligent Systems, Computational and Systems Biology, Computer Science and Information Sciences
  • Vice Chair, Department of Biomedical Informatics


Current Research Projects and Collaborations

Dr. Cooper’s past and current research involves the application of decision theory, probability theory, machine learning, Bayesian statistics, and artificial intelligence to biomedical informatics research problems. He has been investigating those topic areas for the past 25 years and has published over 140 peer-reviewed papers. He is currently involved in the following research projects:

Probabilistic Disease Surveillance
This project is developing and evaluating a probabilistic approach to disease surveillance. The goal of the research is to improve the ability of public health officials and physicians to estimate the current incidence of influenza and other infectious diseases and to predict the future course of epidemics of those diseases. This improved information is expected to better support decisions made by health departments to control epidemics, which in turn is expected to reduce morbidity and mortality from epidemic diseases.

Identifying Multivariate Statistical Differences Between Groups
This project is investigating a novel approach to the problem of detecting multivariate statistical differences across groups of data, which arises in a wide variety of settings. Such circumstances occur naturally in observational studies, where, for example, a clinical researcher may observe a difference in the prevalence of a condition between two groups of patients and would like to explore the reasons behind the difference. Another example is comparative effectiveness research, where it is of interest to understand an observed difference between two clinical treatment approaches.

Bayesian Rule Learning Methods for Disease Prediction and Biomarker Discovery
This project is applying existing Bayesian rule learning methods to high-throughput molecular data (e.g., proteomic and genomic data) to perform disease prediction and biomarker discovery. Datasets being analyzed include those in the domains of lung cancer, breast cancer, amyotrophic lateral sclerosis, and other diseases.

Detecting Deviations in Clinical Care in ICU Data Streams
The goals of this project are to develop, implement, and evaluate computer-based methods that model usual clinical care and then apply those models to detect individual patient care that is anomalous. In the future, such a system may serve as a “safety net” that continuously monitors patient care, as documented in an EMR, and raises an alert when such care appears to be anomalous. A hypothesis of the project is that such anomalies correspond to medical errors often enough to make such alerting worthwhile. Within the ICU domain, the project is investigating the extent to which this hypothesis is supported.

Causal Modeling and Discovery of Biomedical Knowledge from Big Data
Much of science consists of discovering and modeling causal relationships that occur in nature. There is a pressing need for methods that can efficiently infer causal networks from large and diverse types of biomedical data and background knowledge. This project is developing, implementing, and evaluating an integrated set of tools that support the discovery and sharing of causal knowledge from very large and complex biomedical data, including both observational and experimental data. Areas of investigation include the discovery of the genomic drivers of cancers and of the cell-signaling pathways in those cancers.


Recent Publications

Sverchkov Y, Jiang X, Cooper GF. Spatial cluster detection using dynamic programming. BMC Medical Informatics & Decision Making 12 (2012). doi:10.1186/1472-6947-12-22. PMID: 22443103. PMCID: PMC3403878

Batal I, Cooper G, Hauskrecht M. A Bayesian scoring technique for mining predictive and non-spurious rules. Machine Learning and Knowledge Discovery in Databases 7524 (2012) 260-276.

Hennings-Yeomans PH, Cooper GF. Improving the prediction of clinical outcomes from genomic data using multiresolution analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics 9 (2012) 1442-1450. PMID: 22641708

Sverchkov Y, Visweswaran S, Clermont G, Hauskrecht M, Cooper GF. A multivariate probabilistic method for comparing two clinical datasets. In: Proceedings of the ACM International Health Informatics Symposium (2012) 795-800.

Batal I, Valizadegan H, Cooper GF, Hauskrecht M. A temporal pattern mining approach for classifying electronic health record data. ACM Transactions on Intelligent Systems and Technology 4 (2013) article 63.

Villamarin R, Cooper G, Wagner M, Tsui FC, Espino JU. A method for estimating from thermometer sales the incidence of diseases that are symptomatically similar to influenza. Journal of Biomedical Informatics 46 (2013) 444-457. PMID: 23501015

Hauskrecht M, Batal I, Valko M, Visweswaran S, Cooper GF, Clermont G. Outlier-detection for patient monitoring and alerting. Journal of Biomedical Informatics 46 (2013) 47-55. PMID: 22944172. PMCID: PMC3567774

Ferreira A, Cooper GF, Visweswaran S. Decision path models for patient-specific modeling of patient outcomes. In: Proceedings of the Annual Symposium of the American Medical Informatics Association (2013) 413-421. PMID: 24551347. PMCID: PMC3900188

Montefusco DJ, Chen L, Matmati N, Lu S, Newcomb B, Cooper GF, Hannun YA, Lu X. Distinct signaling roles of ceramide species in yeast revealed through systematic perturbation and systems biology analyses. Science Signaling 6(299) (2013) rs14. doi:10.1126/scisignal.2004515. PMID: 24170935. PMCID: PMC3974757