Utilizing the Wikidata system to improve the quality of medical content in Wikipedia in diverse languages: a pilot study

Pfundner A, Schönberg T, Horn J, Boyce RD, Samwald M. Utilizing the Wikidata system to improve the quality of medical content in Wikipedia in diverse languages: a pilot study. Journal of Medical Internet Research. 2015; 17(5). http://www.jmir.org/2015/5/e110/. PMCID: PMC4468594


Wikipedia is an important source of medical information for both patients and medical professionals. Given its wide reach, improving the quality, completeness, and accessibility of medical information on Wikipedia could have a positive impact on global health.


We created a prototype implementation of an automated system for keeping drug-drug interaction (DDI) information in Wikipedia up to date with current evidence about clinically significant drug interactions. Our work is based on Wikidata, a novel, graph-based database backend of Wikipedia that is currently in development.


We set up an automated process for integrating data from the Office of the National Coordinator for Health Information Technology (ONC) high priority DDI list into Wikidata. We then built example implementations demonstrating how the DDI data we introduced into Wikidata could be displayed in Wikipedia articles in diverse languages. Finally, we conducted a pilot analysis to explore whether adding the ONC high priority data would substantially enhance the information currently available on Wikipedia.
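The pilot analysis described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual method: the abstract does not say how a "mention" was detected, so the case-insensitive substring check, the function names `mentions` and `classify_pairs`, and the invented drug names and article texts are all assumptions made for the example.

```python
from collections import Counter


def mentions(article_text: str, partner_name: str) -> bool:
    """Return True if the article text names the partner (assumed: case-insensitive substring match)."""
    return partner_name.lower() in article_text.lower()


def classify_pairs(pairs, articles):
    """Count interaction pairs by how many of their two articles mention the partner (0, 1, or 2)."""
    counts = Counter({0: 0, 1: 0, 2: 0})
    for drug_a, drug_b in pairs:
        hits = int(mentions(articles[drug_a], drug_b)) + int(mentions(articles[drug_b], drug_a))
        counts[hits] += 1
    return counts


# Invented placeholder data; these are not real drugs or real article texts.
articles = {
    "alphadrug": "Alphadrug should not be combined with Betadrug.",
    "betadrug": "Betadrug levels rise when taken with alphadrug.",
    "gammadrug": "Gammadrug is taken once daily.",
    "deltadrug": "Deltadrug may interact with gammadrug.",
    "epsilondrug": "Epsilondrug is an example entry.",
    "zetadrug": "Zetadrug is another example entry.",
}
pairs = [
    ("alphadrug", "betadrug"),
    ("gammadrug", "deltadrug"),
    ("epsilondrug", "zetadrug"),
]

result = classify_pairs(pairs, articles)
for hits in (0, 1, 2):
    print(f"{hits} of 2 articles mention the partner: {result[hits]} pair(s)")
```

A plain substring check would miss synonyms, brand names, and drug-class mentions, so a real analysis would likely need a more careful matching step.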


We derived 1150 unique interactions from the ONC high priority list. Integration of the potential DDI data from Wikidata into Wikipedia articles proved to be straightforward and yielded useful results. We found that even though the majority of current English Wikipedia articles about pharmaceuticals contained sections detailing contraindications, only a small fraction of articles explicitly mentioned interaction partners from the ONC high priority list. For 91.30% (1050/1150) of the interaction pairs we tested, neither of the 2 articles corresponding to the interacting substances explicitly mentioned the interaction partner. For 7.22% (83/1150) of the pairs, only 1 of the 2 associated Wikipedia articles mentioned the interaction partner; for only 1.48% (17/1150) of the pairs, both articles contained explicit mentions of the interaction partner.
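The three groups reported above partition the 1150 pairs, so the shares can be recomputed directly from the counts. The following is a small sketch of that arithmetic; the label strings and variable names are illustrative choices, not terms from the study.

```python
# Counts taken from the results above; labels are illustrative.
total = 1150
counts = {
    "neither article mentions the partner": 1050,
    "one article mentions the partner": 83,
    "both articles mention the partner": 17,
}

# The three groups should account for every tested pair.
assert sum(counts.values()) == total

# Percentage share of each group, rounded to two decimal places.
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {counts[label]}/{total} = {share:.2f}%")
```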


Our prototype demonstrated that automated updating of medical content in Wikipedia through Wikidata is a viable option, although further refinements and community-wide consensus building are required before integration into public Wikipedia is possible. A long-term endeavor to improve the medical information in Wikipedia through structured data representation and automated workflows might lead to a significant improvement in the quality of medical information in one of the world’s most popular Web resources.

Keywords: Internet, Wikipedia, drug information services, semantic networks, medical informatics, drug interactions
Publication Year: 2015
Publication Credits: Pfundner A, Schönberg T, Horn J, Boyce RD, Samwald M