Department of Biomedical Informatics - University of Pittsburgh

Archived Talks

University of Pittsburgh Department of Biomedical Informatics Lecture Series

“Efficient Bayesian Discretization and its Application to Biomedical Datasets”

Jonathan Lustgarten, MS
Biomedical Informatics Doctoral Fellow

Abstract: Many machine learning algorithms that learn classifiers require discrete data. We introduce a new efficient Bayesian discretization (EBD) algorithm that uses dynamic programming and runs in O(N^2) time to convert continuous data to discrete data. We compared the performance of EBD with Fayyad and Irani’s Minimum Description Length Principle Criterion (MDLPC) discretization method, which is commonly used to discretize biomedical datasets. On 23 biomedical datasets obtained from high-throughput genomic and proteomic analyses, the classification performance of the Naive Bayes classifier was statistically significantly better when the attributes were discretized with EBD than with MDLPC.


“Wavelet-Based Spatial Clustering for Outbreak Detection”

Jialan Que, MS
Intelligent Systems Program/Biomedical Informatics Doctoral Fellow

Abstract: We propose a fast spatial clustering algorithm for the rapid prospective detection of emerging disease outbreaks. The method applies a wavelet transform to de-trend time series data, yielding stationary baseline values in which seasonal and other systematic effects are reduced or removed. To test significance, we compute a new time series by averaging the normalized time series of all areas within each cluster. We evaluate the algorithm's sensitivity and specificity in detecting emerging outbreaks by comparing it with three algorithms in current use.
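
As a rough illustration of the dynamic-programming idea mentioned in the Efficient Bayesian Discretization abstract, the sketch below finds the best-scoring partition of one continuous attribute into intervals in O(N^2) score evaluations. It is not the EBD algorithm itself: the interval score (a log marginal likelihood of the class counts under a uniform Dirichlet prior) and the per-interval `penalty` are assumptions chosen for illustration, and the names `interval_score` and `discretize` are hypothetical.

```python
from math import lgamma


def interval_score(counts, penalty):
    """Assumed score: log marginal likelihood of the class counts in one
    interval under a uniform Dirichlet prior, minus a per-interval penalty."""
    k = len(counts)
    n = sum(counts)
    score = lgamma(k) - lgamma(n + k)
    for c in counts:
        score += lgamma(c + 1)
    return score - penalty


def discretize(values, labels, penalty=1.0):
    """Return the sorted cut points of the best-scoring partition."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    classes = sorted(set(ys))
    index = {c: i for i, c in enumerate(classes)}
    n = len(xs)

    # prefix[i][c] = number of points of class c among the first i sorted points
    prefix = [[0] * len(classes)]
    for y in ys:
        row = prefix[-1][:]
        row[index[y]] += 1
        prefix.append(row)

    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)  # best[j] = best score for the first j points
    back = [0] * (n + 1)        # back[j] = start index of the last interval
    best[0] = 0.0
    for j in range(1, n + 1):
        # An interval may end at j only between two distinct values (or at the end).
        if j < n and xs[j - 1] == xs[j]:
            continue
        for i in range(j):
            if best[i] == neg_inf:
                continue
            counts = [prefix[j][c] - prefix[i][c] for c in range(len(classes))]
            s = best[i] + interval_score(counts, penalty)
            if s > best[j]:
                best[j] = s
                back[j] = i

    # Walk the back-pointers to recover cut points (midpoints between neighbours).
    cuts = []
    j = n
    while j > 0:
        i = back[j]
        if i > 0:
            cuts.append((xs[i - 1] + xs[i]) / 2)
        j = i
    return sorted(cuts)


print(discretize([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1]))  # [6.5]
```

With prefix counts, each candidate interval is scored in time proportional to the number of classes, so the double loop over interval end points dominates the cost. A larger `penalty` favors fewer intervals.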

Friday, February 8, 2008
11:00 AM to 12:00 NOON
Parkvale Building (200 Meyran Avenue), Classroom M-184 (on the mezzanine level)

For more information: jxc3@pitt.edu or 412.647.7113