Dr Ian Overton: Biomedical Systems Analysis
Integrative Network Biology
Summary
We are applying data integration and machine learning approaches to develop probabilistic networks of gene function. Key areas of interest are cell invasiveness, migration and 'stemness', including systems-wide modelling of transitions between epithelial and mesenchymal morphologies (EMT, MET). EMT and MET are fundamental in development (e.g. neural crest migration), and are implicated in cancer progression as well as stem-like properties. A network biology approach is fundamental to understanding the control of cell behaviour, and important for developing more effective and personalised medicine, including identification of therapeutic targets and indicators of response to treatment.
Therapeutic Targets and Diagnostic Biomarkers for Cancer Medicine
We are working towards discovery of candidate diagnostics and selective therapeutic targets by exploiting the overlapping biochemistry of cancers and development. Cell invasiveness and metastasis are a major focus. Indeed, metastasis is a crucial factor in the majority of cancer-related deaths. Epithelial-Mesenchymal Transition (EMT) is a cellular programme where epithelial cells acquire mesenchymal characteristics, and is linked to the metastasis of common cancers (e.g. breast, colorectal). Epithelial remodelling is fundamental for developmental processes (e.g. gastrulation, neural crest migration), and is implicated in cell survival, invasiveness and migration. Stem cells have been important for studying EMT, while developmental signalling pathways (e.g. Hedgehog, Wnt/β-catenin, TGFβ) are known to play key roles. EMT is canonically characterised by activation of Snail-family transcription factors. The interplay of various pathways effecting Snail activation and other EMT characteristics (e.g. loss of adhesion) is not fully understood, but pathway interactions are fundamental to EMT signalling. We are employing machine learning over functional genomics datasets to generate systems-wide networks of EMT and MET, which are applied to interrogate cancer datasets. This approach aims to identify novel players and pathway interactions in EMT, MET and related processes, invigorating the discovery of new clinical tools for cancer medicine.
Key Publications
- *Overton, I.; Graham, S.; Gould, K.; Hinds, J.; Botting, C.; Shirran, S.; Barton, G. and *Coote, P. Global Network Analysis of Drug Tolerance, Mode of Action and Virulence in Methicillin-Resistant S. aureus. BMC Systems Biology (in press), 2011. *Corresponding authors.
- Overton, I.; Padovani, G.; Girolami, M. and Barton, G. ParCrys: A Parzen window density estimation approach to protein crystallisation propensity prediction. Bioinformatics 24:901-907, 2008
- Overton, I.; van Niekerk, C.; Carter, L.; Dawson, A.; Martin, D.; Cameron, S.; McMahon, S.; White, M.; Hunter, W.; Naismith, J. and Barton, G. TarO: A Target Optimisation System for Structural Biology. Nucleic Acids Research 36:W190-W196, 2008
- Overton, I. and Barton, G. A normalised scale for structural genomics target ranking: The OB-Score. FEBS Letters 580, 4005-4009, 2006
- Hubbard, S.; Grafham, D.; Beattie, K.; Overton, I.; McLaren, S.; Croning, M.; Boardman, P.; Bonfield, J.K.; Burnside, J.; Davies, R.M.; Farrell, E.R.; Francis, M.D.; Griffiths-Jones, S.; Humphray, S.J.; Hyland, C.; Scott, C.E.; Tang, H.; Taylor, R.G.; Tickle, C.; Brown, W.R.; Birney, E.; Rogers, J. and Wilson, S.A. Transcriptome analysis for the chicken based on 19,626 finished cDNA sequences and 485,337 expressed sequence tags. Genome Research 15, 174-183, 2005
- Wong, G.K.; … Overton, I.; … and Yang, H. A genetic-variation map for chicken with 2.8 million single-nucleotide polymorphisms. Nature 432, 717-722, 2004
- Hillier, L.W.; … Overton, I.; … and Wilson, R.K. Sequence and comparative analysis of the chicken genome provides unique perspectives on vertebrate evolution. Nature 432, 695-716, 2004
Lab Members
Current lab members involved in this work are:
- Dr Ian Overton
- Dr Lel Eory
- Mr Alex Lubbock (PhD Student)
Approach and Future Work
Broadly, our work combines molecular biology and computer science to deepen understanding of cell behaviour in health and disease. There is a huge quantity of information available to biological scientists, including the DNA sequences of hundreds of organisms. However, making sense of the billions of DNA bases is not trivial; on top of that, DNA sequences represent only a fraction of the available biological data (which also include, for example, transcriptome and proteome profiling). This energises our research, which applies the computational approaches needed to draw together and make sense of the information held in these datasets, in order to generate scientific advances for the benefit of human health. Biological processes are organised and controlled by complex interactions between many individual components, and so inherently involve intricate networks. The properties of these networks underlie virtually all aspects of cell function and can, for example, predict disease outcome and responses to treatment. Integrative network biology is therefore fundamental to understanding cell fate regulation, and important for developing the next generation of clinical tools.
Fig 1. A Sketch of Computational Biology Research
This figure offers a general perspective on computational biology research, outlining the relationships of biological information, in silico analysis and the generation of understanding. The cycle of knowledge, experimentation and modelling is synonymous with scientific inquiry; computational approaches are vital to modern biology, which interprets large datasets to gain insight into complex systems.
Systems-wide Functional Association Networks
A considerable proportion of genes (20-50%) do not have a confidently assigned function in current genome annotations. Indeed, many genes of unknown function are coordinately regulated in cellular processes (e.g. differentiation). Even for the set of genes with assigned functions, a substantial fraction of the total pleiotropy likely remains undiscovered. Genome-scale datasets afford an unprecedented opportunity for probabilistic mapping of gene functional relationships in order to address this knowledge gap. We are applying machine learning over large datasets to produce networks of gene functional association and transcriptional regulation. These networks are interrogated by graph theoretic approaches in order to gain insights into signalling and metabolism in development and diseases (e.g. cancers, neurodegeneration, MRSA).
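As a rough illustration of the general idea, and not the group's actual method, the sketch below builds a toy association network by linking genes whose expression profiles are strongly correlated. It then applies two basic graph-theoretic measures: node degree and connected components. Simple correlation thresholding stands in here for the machine learning step. The gene names, expression values and the 0.9 threshold are all hypothetical, chosen only to make the example run.

```python
from collections import deque
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-variation signal
    return cov / sqrt(var_x * var_y)


def build_network(profiles, threshold=0.9):
    """Link gene pairs whose absolute correlation meets the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    network = {gene: set() for gene in profiles}
    for g1, g2 in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold:
            network[g1].add(g2)
            network[g2].add(g1)
    return network


def connected_components(network):
    """Group genes into components using breadth-first search."""
    seen, components = set(), []
    for start in sorted(network):
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in network[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical expression profiles (four conditions per gene).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.1, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}

network = build_network(profiles)
degrees = {gene: len(neighbours) for gene, neighbours in network.items()}
components = connected_components(network)
```

In this toy data, geneA, geneB and geneC form one component (geneC is anti-correlated with the other two), while geneD remains unconnected. A gene of unknown function that falls in a component with annotated genes becomes a candidate for sharing their function, which is the kind of inference that functional association networks support.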
