Gene-environment interactions and disease


Bayesian network modeling of gene-environment interactions and cancer susceptibility

Duration and funding source: 5 years (September 2011 – August 2016); National Institutes of Health

Personnel: M. Borsuk (PI), C. Su, A. Andrew, M. Karagas, J. Moore (Dartmouth)

Synopsis: Studies of the relation between genetic traits and cancer susceptibility are often inconclusive or conflicting. This is likely due to the difficulty of accommodating multiple genetic and environmental risk factors in traditional analytic models. Each risk factor is likely to contribute to susceptibility through a combination of additive and non-additive interactions with other risk factors, and conventional methods seldom address such interactions. Additionally, data from a single study rarely allow conclusive identification of causal relationships in such complex systems. Yet there is often a wealth of knowledge available from previous studies that could be brought to bear on the task of model building. In this project, we explore the applicability of Bayesian networks for combining this prior knowledge with observational data to infer causal relations.
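
As a rough illustration of how prior knowledge and observational data can be combined, the sketch below scores a single candidate relation (a genetic variant influencing case status) by weighing a prior belief in that relation against case-control counts. It is a minimal sketch under stated assumptions, not the project's method: the function names, the uniform Beta prior, and the counts are all hypothetical and chosen only for illustration.

```python
from math import exp, lgamma, log

def log_marginal(cases, controls, alpha=1.0):
    """Log marginal likelihood of binary outcomes under a Beta(alpha, alpha) prior."""
    n = cases + controls
    return (lgamma(2 * alpha) - lgamma(2 * alpha + n)
            + lgamma(alpha + cases) + lgamma(alpha + controls)
            - 2 * lgamma(alpha))

def posterior_edge_prob(table, prior_edge):
    """Posterior probability that an edge (factor -> disease) exists.

    table maps each state of the risk factor to (cases, controls).
    prior_edge is the prior probability of the edge, e.g. from earlier studies.
    """
    # With the edge: disease risk is modeled separately for each factor state.
    log_with = sum(log_marginal(c, k) for c, k in table.values())
    # Without the edge: one shared disease risk, so counts are pooled.
    pooled_cases = sum(c for c, _ in table.values())
    pooled_controls = sum(k for _, k in table.values())
    log_without = log_marginal(pooled_cases, pooled_controls)
    # Posterior log-odds = prior log-odds + log Bayes factor.
    log_odds = log(prior_edge) - log(1 - prior_edge) + log_with - log_without
    return 1 / (1 + exp(-log_odds))

# Hypothetical counts, invented purely for illustration.
associated = {"variant": (40, 10), "wild_type": (10, 40)}
unassociated = {"variant": (25, 25), "wild_type": (25, 25)}

p_assoc = posterior_edge_prob(associated, prior_edge=0.1)
p_null = posterior_edge_prob(unassociated, prior_edge=0.5)
```

In this toy setting, strongly differing case proportions raise the edge probability well above a skeptical prior, while identical proportions pull it below an even prior. A full Bayesian network analysis would apply the same kind of prior-times-likelihood scoring across many candidate structures at once.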

Our first step is to develop the causal Bayesian network approach for application to epidemiological data on genes, the environment, and cancer incidence. Our previous work suggests that Bayesian networks are appropriate and powerful tools for analyzing such data. We will then apply the developed algorithms and prior-knowledge encoding techniques to existing data from a large population-based case-control study of bladder cancer. This will allow us to test the practicality, usefulness, and efficacy of the approach on real-world data. Next, we will assess the biological plausibility of the identified relations and quantify them in detail using mechanistically based interaction models. As a final validation step, the impact of influential polymorphisms on targeted biological processes will be confirmed using assays for expression and function. We anticipate that by combining recent advances in the use of Bayesian networks for statistical causal inference with a rich genetic epidemiological data set and focused laboratory experiments, the proposed project will yield new methods for revealing how genes and the environment interact to determine cancer risk.