Scott Malec, BS, MLIS, MSIT, PhD

Research Interests: 

His research interests include causal modeling; causal inference from real-world data; knowledge representation; and data-knowledge integration (specifically, using computable knowledge from automated machine reading to inform graphical causal models).

Postdoctoral Scholar, Department of Biomedical Informatics

Research Advisor: 


BS (1996, Languages and Humanities) Edinboro University

MLIS (2003, Library and Information Sciences) University of Pittsburgh

MSIT (2010, Information Technology) Carnegie Mellon University

PhD (2018, Biomedical Informatics) University of Texas Health Science Center


Career objective: My work lies at the intersection of AI and epidemiology and focuses on using formal knowledge representations to improve causal inference (the estimation of causal parameters under assumptions) from real-world data, such as electronic health records (EHR). Formal knowledge representations have now advanced to encompass detailed understandings of the pathophysiological and etiological mechanisms underlying disease. One aim of this research is to evaluate the extent to which such formal knowledge can enhance the rigor of observational studies by helping to identify sources of systematic bias, such as confounding and selection bias. Detailed knowledge of the local causal structures that generate real-world empirical data should allow for causal inference at a higher level of resolution than previously possible. My previous work focused on computational narratology and drug safety applications; I am now extending my knowledge integration approach to study potential preventive and protective agents against Alzheimer's disease (AD). As part of a team funded partially by a UPitt MOMENTUM grant, I help construct an ontology-based Knowledge Graph that incorporates literature-derived information. This Knowledge Graph will eventually inform graphical causal models of possible preventive agents and AD risk factors using extensive longitudinal observational clinical data. Further research will eventually harmonize this work with neuroimaging and GWAS data. The immediate goal of this work is to help guide the validation and discovery of intervenable targets and more successful preventive strategies to mitigate AD's public health burden.



Malec, S. A., Taneja, S. B., Albert, S. M., Karim, H. T., Levine, A. S., Munro, P. W., Shaaban, C. E., Silverstein, J. S., Witonsky, K. F., Callahan, T. J., & Boyce, R. D. Modeling Alzheimer’s Disease by Combining Knowledge Extracted from the Biomedical Literature with Biomedical Ontologies. (in preparation, to be submitted to the Journal of Biomedical Informatics later in October)


Malec, S. A., Bernstam, E. V., Wei, P., Boyce, R. D., & Cohen, T. Using computable knowledge mined from the literature to elucidate confounders for EHR-based pharmacovigilance. medRxiv 2020.07.08.20113035; doi: (under major revision for the Journal of Biomedical Informatics; submitted August 15, accepted with major revisions as of September 27, revision deadline October 27)


Malec, S. A., & Boyce, R. D. (2020). Exploring Novel Computable Knowledge in Structured Drug Product Labels. AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science, 2020, 403–412.


Malec, S. A., Wei, P., Xu, H., Bernstam, E. V., Myneni, S., & Cohen, T. (2017). Literature-Based Discovery of Confounding in Observational Clinical Data. AMIA ... Annual Symposium proceedings. AMIA Symposium, 2016, 1920–1929.


Google Scholar