Max Planck Institute for Molecular Genetics

 Department of Computational Molecular Biology

NOTE: We moved in August 2009 to http://bioinformatics.rutgers.edu.

Mixture model based group inference in fused genotype and phenotype data

B. Georgi, M. A. Spence, P. Flodman, and A. Schliep

Studies in Classification, Data Analysis, and Knowledge Organization, Springer, 2007

The analysis of genetic diseases has classically aimed to establish direct links between a cause, a genetic variation, and its effect, an observable deviation in phenotype. For complex diseases, which are caused by multiple factors and show a wide spread of phenotypic variation, this approach is unlikely to succeed. One example is attention deficit hyperactivity disorder (ADHD), where phenotypic variation is expected to arise from the overlapping effects of several distinct genetic mechanisms. The classical statistical models for overlapping subgroups are mixture models, essentially convex combinations of density functions, which allow both the inference of descriptive models from data and the deduction of groups. An extension of conventional mixtures with attractive properties for clustering is the context-specific independence (CSI) framework. CSI automatically adapts model complexity to avoid overfitting and yields a highly descriptive model.
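
To make "convex combinations of density functions" concrete, here is a minimal sketch of a two-component mixture and of the posterior group memberships it induces. It is an illustration only, not the method of the paper: the univariate Gaussian components, the weights, the parameter values, and the function names are all assumptions chosen for the example, and the CSI extension is not shown.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, components):
    """Mixture density: a convex combination of the component densities."""
    return sum(w * normal_pdf(x, m, s) for w, (m, s) in zip(weights, components))

def posteriors(x, weights, components):
    """Posterior probability of each component (group) given x, by Bayes' rule."""
    joint = [w * normal_pdf(x, m, s) for w, (m, s) in zip(weights, components)]
    total = sum(joint)
    return [j / total for j in joint]

def assign_group(x, weights, components):
    """Index of the component with the highest posterior probability."""
    post = posteriors(x, weights, components)
    return max(range(len(post)), key=post.__getitem__)

# Hypothetical parameters: the weights are non-negative and sum to one.
weights = [0.6, 0.4]
components = [(0.0, 1.0), (3.0, 1.0)]  # (mean, standard deviation) per group

post = posteriors(1.0, weights, components)
group_near_first = assign_group(1.0, weights, components)
group_near_second = assign_group(3.0, weights, components)
```

Because the weights sum to one, the mixture is itself a valid density, and the posteriors give a soft assignment of each sample to the overlapping subgroups. Taking the most probable component turns that into a hard grouping.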