Max Planck Institute for Molecular Genetics

 Department of Computational Molecular Biology

Home page

NOTE: We moved in August 2009 to http://bioinformatics.rutgers.edu.


Software under development

ClusterViz: Cluster Visualisation (http://clusterviz.sourceforge.net/)

ClusterViz is software for visualizing the clustering process of the k-means family of algorithms.

GHMM: General Hidden Markov Model library (http://ghmm.org/)

The General Hidden Markov Model library (GHMM) is a freely available, LGPL-licensed C library implementing efficient data structures and algorithms for basic and extended HMMs. Development is hosted on SourceForge (http://sourceforge.net/projects/ghmm/), where you have access to the Subversion repository, mailing lists, and forums.

Gato: Graph Animation Toolbox (http://gato.sf.net)

Gato, the Graph Animation Toolbox, is software that visualizes algorithms on graphs. Graphs are mathematical objects consisting of vertices and edges connecting pairs of vertices: think of cities as vertices and interstates as edges connecting two cities. An algorithm might find a shortest path (the fastest route) or a minimum spanning tree, or solve one of several other interesting problems on graphs: maximum flow, weighted and unweighted matching, and min-cost flow. Visualization means linking cause (the statements of an algorithm) immediately to effect (changes to the graph the algorithm takes as its input) by means of blinking, color changes, and other visual effects.
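To make the cause-and-effect idea concrete, here is a minimal sketch in Python. It is not Gato's code or API: the function name, the event list, and the small road network are all hypothetical. It runs Dijkstra's shortest-path algorithm and records every distance update, which is the kind of change a viewer could turn into a color change on the graph.

```python
import heapq


def shortest_path_events(graph, source):
    """Dijkstra's algorithm that also records each change it makes.

    graph: dict mapping vertex -> dict of neighbor -> non-negative edge weight.
    Returns (dist, events); events is a list of (vertex, new_distance) updates.
    """
    dist = {source: 0}
    events = [(source, 0)]
    queue = [(0, source)]
    done = set()
    while queue:
        d, v = heapq.heappop(queue)
        if v in done:
            continue  # stale queue entry; v was already settled
        done.add(v)
        for w, weight in graph[v].items():
            candidate = d + weight
            if candidate < dist.get(w, float("inf")):
                dist[w] = candidate
                events.append((w, candidate))  # a viewer could recolor w here
                heapq.heappush(queue, (candidate, w))
    return dist, events


# Hypothetical road network: cities as vertices, travel times as edge weights.
roads = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
}
dist, events = shortest_path_events(roads, "A")
```

Replaying `events` in order shows each tentative distance being improved, for example the distance to B dropping from 4 to 3 once the route through C is found.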

GQL: Graphical Query Language (http://ghmm.org/gql)

GQL is a suite of tools for analyzing time-course experiments. Currently, it is adapted to gene expression data. The two main tools are GQLQuery, for querying data sets, and GQLCluster, which computes groupings using a number of methods (model-based clustering with HMMs as cluster models, and estimation of a mixture of HMMs).

PyMix: The Python mixture package (http://www.pymix.org)

The Python Mixture Package is a freely available Python library implementing algorithms and data structures for a wide variety of data mining applications with basic and extended mixture models.

pGQL: probabilistic Graphical Query Language

pGQL is a software tool for analyzing gene expression time courses. It lets the user interactively define linear HMM queries on time course data using rectangular graphical widgets called probabilistic time boxes. The analysis is fully interactive, and the graphical display shows the time courses along with the graphical query. The results can be submitted to gPROF directly from pGQL.

Completed software projects

MCPD: Markov Chain Pooling Decoder

The Markov Chain Pooling Decoder (MCPD) is used in the analysis of pooling experiments for library screening. Pools are collections of clones, and screening a pool with a probe is a group test that determines whether any of these clones are positive for the probe. The results of the pool screenings are interpreted, or decoded, to infer which clones are candidates to be positive, using a Markov chain Monte Carlo (MCMC) approach. MCPD implements this MCMC approach to compute marginal probabilities of clones using a Bayesian model of the experiment.
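The group-test idea can be illustrated with a small Python sketch. This is not MCPD's method: it assumes error-free screenings and uses a naive deterministic rule (any clone that appears in a negative pool cannot be positive), whereas MCPD computes marginal probabilities with MCMC under a Bayesian model. The function names, pool design, and clone labels are hypothetical.

```python
def screen_pools(pools, positive_clones):
    """Simulate an error-free group test.

    A pool tests positive if it contains at least one positive clone.
    """
    return {
        name: any(clone in positive_clones for clone in clones)
        for name, clones in pools.items()
    }


def candidate_clones(pools, results):
    """Naive decoding: discard every clone that occurs in a negative pool."""
    all_clones = set().union(*pools.values())
    cleared = set()
    for name, clones in pools.items():
        if not results[name]:
            cleared |= set(clones)
    return all_clones - cleared


# Hypothetical design: six clones, each placed in exactly two of four pools.
pools = {
    "P1": {"c1", "c2", "c3"},
    "P2": {"c3", "c4", "c5"},
    "P3": {"c1", "c4", "c6"},
    "P4": {"c2", "c5", "c6"},
}
results = screen_pools(pools, positive_clones={"c3"})
candidates = candidate_clones(pools, results)
```

With real data, false positive and false negative screenings make this rule unreliable, which is one motivation for a probabilistic decoder.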

Proclust: Protein clustering by transitive homology

Proclust is a software package for clustering protein sequences with a graph-based approach that significantly increases the number of remote homologs detected. You can use the online server at the ZAIK, University of Cologne, or download the software. Proclust is released under the GPL.
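As a rough illustration of clustering by transitivity on a graph, the Python sketch below groups proteins into connected components of a similarity graph using union-find. This is only the simplest reading of "transitive homology" and is an assumption, not Proclust's actual algorithm; the function name, protein labels, and similarity pairs are hypothetical.

```python
def cluster_by_transitivity(proteins, similar_pairs):
    """Group proteins into connected components of the similarity graph.

    If a is similar to b and b is similar to c, then a, b, and c share a
    cluster even when a and c were never found to be directly similar.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the root, shortening the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    clusters = {}
    for p in proteins:
        clusters.setdefault(find(p), set()).add(p)
    # Sort clusters by their members so the result order is deterministic.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical pairwise hits, e.g. from a sequence similarity search.
proteins = ["p1", "p2", "p3", "p4", "p5"]
hits = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
clusters = cluster_by_transitivity(proteins, hits)
```

Here p1 and p3 end up together through p2, which is how a remote homolog can be reached without a direct hit. Plain connected components can also chain unrelated families together through a single spurious hit, so a practical tool needs stricter criteria than this sketch applies.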

PBQ: The Python Batch Queue

PBQ is a simple batch queue system whose goal is to complete a list of jobs on a set of machines with a shared file system, without interfering with interactive users or more important batch jobs. Most importantly, you do not need to be root to install or use it. PBQ is distributed under the GNU General Public License (GPL).