An Efficient Bayesian Method for Biological Pathway Discovery from High-Throughput Experimental Data

Wei Wang

The discovery of biological pathways is critical to understanding cell function. Such pathways can involve gene expression levels, protein activation levels, the concentrations of small molecules, external conditions (e.g., available nutrients), and other relevant biological processes and states, all of which are represented as variables. Ultimately, we want to develop a complete functional description of all the biological pathways in a given cell type.

This paper describes a method for discovering biological pathways from data in which one or more biological states are manipulated experimentally. The approach builds on previous work by Wagner, who described a deterministic pathway discovery algorithm. We first describe the assumptions of that approach. We then generalize it into a Bayesian algorithm that combines experimental data with prior beliefs about biological pathways to produce a probability distribution over the causal relationships between each pair of variables. The computational time complexity of the algorithm is quadratic in the number of variables, which makes it feasible to apply to high-throughput data containing thousands of variables. We report the results of applying the Bayesian pathway discovery algorithm to gene expression data from a study published by Ideker et al. The results indicate that the algorithm is promising as a tool for pathway discovery.
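
The pairwise structure described above can be illustrated with a minimal sketch. It is not the algorithm of this paper: the hypothesis labels, the shared prior, and the `likelihood` callable are all assumptions introduced only for illustration. The sketch shows how Bayes' rule applied independently to each pair of variables yields a posterior distribution per pair, and why the number of pair evaluations grows quadratically with the number of variables.

```python
from itertools import combinations

# Hypothetical hypothesis labels for a pair (x, y); not taken from the paper.
HYPOTHESES = ("x_causes_y", "y_causes_x", "no_causal_link")


def pairwise_posteriors(variables, prior, likelihood):
    """Return {(x, y): {hypothesis: posterior}} for every unordered pair.

    prior: dict mapping hypothesis -> prior probability (assumed here to be
        shared by all pairs).
    likelihood: callable (x, y, hypothesis) -> P(data | hypothesis). This is
        a placeholder for whatever score a real method derives from the
        experimental data.
    """
    posteriors = {}
    # combinations(..., 2) yields n*(n-1)/2 pairs, so the loop is quadratic in n.
    for x, y in combinations(variables, 2):
        unnormalized = {h: prior[h] * likelihood(x, y, h) for h in HYPOTHESES}
        total = sum(unnormalized.values())
        if total == 0:
            raise ValueError(f"all hypotheses have zero probability for {(x, y)}")
        # Bayes' rule: posterior = prior * likelihood / evidence.
        posteriors[(x, y)] = {h: v / total for h, v in unnormalized.items()}
    return posteriors


def toy_likelihood(x, y, hypothesis):
    # Toy stand-in: the data are twice as likely under "x_causes_y".
    return 2.0 if hypothesis == "x_causes_y" else 1.0


uniform_prior = {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}
result = pairwise_posteriors(["a", "b", "c"], uniform_prior, toy_likelihood)
```

With three variables the sketch evaluates three pairs; with thousands of variables it would evaluate on the order of millions, which remains tractable as long as each per-pair likelihood is cheap to compute.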