Applications in TCGA Data-Driven Annotation Filtering
One of the first stages of bioinformatics analysis is the elimination, or “filtering,” of poor-quality measurement points. Users of big data platforms for gene expression, microRNA target determination, and RNA-seq alignment can choose among numerous web services, feature selection algorithms, and biologically driven protocols to select the “correct” data points to represent a biological molecule. This presentation will demonstrate how these “filtering” methods disagree in gene expression and microRNA targeting. It will also present an expected-utility-driven model that selects the “best filtering” practices through a combinatorial greedy forward selection method, using TCGA and other data sources as a guide.
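The greedy forward selection idea can be illustrated with a minimal Python sketch. It is not the presenter's model: the filter names, the per-filter gains, and the per-filter cost in toy_utility are hypothetical placeholders, and a real utility would be estimated from TCGA or other reference data. The sketch only shows the selection loop, which adds one filter at a time and stops when no remaining filter improves the score.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def greedy_forward_selection(
    candidates: Iterable[str],
    expected_utility: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Add one filter per round, keeping the one that raises utility most."""
    selected: List[str] = []
    remaining = list(candidates)
    best = expected_utility(frozenset())
    while remaining:
        # Score every one-step extension of the current selection.
        scored = [
            (expected_utility(frozenset(selected + [c])), c) for c in remaining
        ]
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no remaining filter improves the utility
        selected.append(top)
        remaining.remove(top)
        best = top_score
    return selected, best


# Hypothetical stand-in for an expected utility: invented gains per filter
# minus a fixed cost per filter (e.g. data discarded). Not real results.
TOY_GAINS = {"detection_pvalue": 0.30, "probe_remap": 0.25, "variance_cutoff": 0.10}
TOY_COST_PER_FILTER = 0.15


def toy_utility(filters: FrozenSet[str]) -> float:
    gain = sum(TOY_GAINS[name] for name in sorted(filters))
    return gain - TOY_COST_PER_FILTER * len(filters)


selected, best = greedy_forward_selection(TOY_GAINS, toy_utility)
print(selected, round(best, 3))
```

Greedy selection evaluates only one-step extensions, so it needs far fewer utility evaluations than scoring every combination of filters, at the price of possibly missing the best overall combination.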