A fast algorithm for learning epistatic genomic relationships
Genetic epidemiologists strive to determine the genetic profile of diseases. Epistasis is the interaction of two or more genes in affecting a phenotype. Because such interactions are often non-linear, statistical patterns of epistasis are difficult to detect. Combinatorial methods for detecting epistasis investigate a subset of combinations of genes without employing a search strategy. Therefore, they do not scale to the high-dimensional data found in genome-wide association studies (GWAS). We represent genome-phenome interactions using a Bayesian network rule, which is a specialized Bayesian network. We develop an efficient search algorithm that learns from data a high-scoring rule that may contain two or more interacting genes. Our experimental results on synthetic data indicate that this algorithm detects interacting genes as well as a Bayesian network combinatorial method while running much faster. Our results also indicate that the algorithm can successfully learn genome-phenome relationships from a real GWAS dataset.
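To make the scaling concern concrete, the following minimal sketch counts how many gene combinations a method would have to score if it examined every k-locus combination among n loci. The function name `count_combinations` and the figure of 500,000 loci are illustrative assumptions, not values taken from this work; the sketch shows only the combinatorial growth, not any particular detection method.

```python
from math import comb


def count_combinations(n_loci: int, k: int) -> int:
    """Number of distinct k-locus combinations among n_loci loci."""
    return comb(n_loci, k)


if __name__ == "__main__":
    # Hypothetical GWAS size, chosen only for illustration.
    n_loci = 500_000
    for k in (2, 3, 4):
        # Each combination would need its own score under exhaustive evaluation.
        print(f"{k}-locus combinations: {count_combinations(n_loci, k):,}")
```

Under these assumptions the count grows roughly as n^k / k!, which is why a guided search over candidate models can matter once more than two loci are allowed to interact.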