Dr Martin Taylor: Biomedical Systems Analysis
Evolutionary Genetics and Genomics
Summary
We are endeavouring to understand the processes of selection and mutation that are acting to shape genomes. It is well known that selection shapes the pattern of genomic changes that accumulate as populations diverge from a common ancestor, but it is becoming increasingly clear that variation in the pattern of new mutations can also generate superficially similar signals. Improved separation of these confounding evolutionary signatures is crucial if we are to understand how organisms evolve, and to relate contemporary genetic changes to human biology and disease.
Approach and Future Work
We work at the intersection of population genetics, evolutionary genomics and functional genomics, integrating and processing large (genome scale) datasets. Genetic changes that have accumulated between species tell us about the combined effects of both mutation pattern and selection. In contrast, mutations that are still segregating in populations (as rare variants or polymorphisms) show the same impact of mutation but the consequences of selection are more subtle. We leverage the differences of between-species and within-species variation to separate the patterns of mutation from those imposed by selection. Doing this in the context of functional annotation allows us to relate the packing, regulation and functions of the genome to both mutation patterns and the action of selection.
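As an illustration of how between-species and within-species variation can be contrasted, the sketch below sets up a 2x2 table in the spirit of the classical McDonald-Kreitman test. This is an assumption made for illustration, not a description of the pipeline used in the work described here. The function names, the "test" and "reference" site classes, and the counts are all hypothetical.

```python
from math import comb


def divergence_polymorphism_ratio(div_test, div_ref, poly_test, poly_ref):
    """Ratio of (test/reference) among fixed differences to the same ratio
    among polymorphisms. Returns None when a denominator would be zero."""
    if div_ref == 0 or poly_ref == 0 or poly_test == 0:
        return None
    return (div_test / div_ref) / (poly_test / poly_ref)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Illustrative counts only: fixed differences and polymorphisms in a
    # putatively functional ("test") and putatively neutral ("reference") class.
    div_test, div_ref, poly_test, poly_ref = 7, 17, 2, 42
    print(divergence_polymorphism_ratio(div_test, div_ref, poly_test, poly_ref))
    print(fisher_exact_two_sided(div_test, div_ref, poly_test, poly_ref))
```

Because both rows of the table are exposed to the same mutational input, a shared mutation bias tends to cancel in the ratio, which is why a contrast of this kind is informative about selection. A ratio far from 1 with a small p-value would only be suggestive, since the result depends on how the two site classes are chosen.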
Our experiments are performed in silico, though we collaborate closely with a number of laboratory-based and clinical research groups. We have been using high-throughput 2nd/next generation sequencing datasets for functional genomic measures for some time, but increasingly we are utilising these data for population genetic and evolutionary analyses. Our work tends to be computationally intensive, necessitating access to the fastest computers we can get our hands on. The integration and processing of large datasets means almost all of our analysis is managed through relational database systems (Taylor et al. Bioinformatics 2007).
In the future we will be applying evolutionary insights and methodology to understand the phenotypic consequences of genetic changes. In particular we are applying these methods to identify genes that harbour rare genetic variants that contribute to human disease, and to identify the driver mutations that lead to the progression of cancer (Talavera et al. Proteins 2010).
Key Publications
- Talavera, D.; Taylor, M.S. and Thornton, J.M. The (non)-malignancy of cancerous amino acid substitutions. Proteins 78:518-529, 2010
- Semple, C.A.M. and Taylor, M.S. Molecular biology: The structure of change. Science 323:347-348, 2009
- FANTOM Consortium: Suzuki, H.;...Semple, C.A.;...Taylor, M.S...and Hayashizaki, Y. The transcriptional network that controls growth arrest and differentiation in a human myeloid leukemia cell line. Nature Genetics 41(5):553-562, 2009
- Taylor, M.S.; Massingham, T.; Carninci, P.; Hayashizaki, Y.; Goldman, N. and Semple, C.A.M. Rapidly evolving human promoter regions. Nature Genetics 40:1262-1263, 2008
- Taylor, M.S.; Valdar, W.; Kumar, A.; Flint, J. and Mott, R. Management, presentation and interpretation of genome scans using GSCANDB. Bioinformatics 23:1545-1549, 2007
- Carninci, P.;...Semple, C.A.; Taylor, M.S.... and Hayashizaki, Y. Genome-wide analysis of mammalian promoter architecture and evolution. Nature Genetics 38(6):626-635, 2006
- Taylor, M.S.; Kai, C.; Kawai, J.; Carninci, P.; Hayashizaki, Y. and Semple, C.A.M. Heterotachy in Mammalian Promoter Evolution. PLoS Genetics 2(4):e30, 2006
- Valdar, W.; Solberg, L.C.; Gauguier, D.; Burnett, S.; Klenerman, P.; Cookson, W.O.; Taylor, M.S.; Rawlins, J.N.P.; Mott, R. and Flint, J. Genome-wide genetic association of complex traits in heterogeneous stock mice. Nature Genetics 38:879-887, 2006
- Carninci, P.;...Taylor, M.S....and Hayashizaki, Y. The transcriptional landscape of the mammalian genome. Science 309:1559-1563, 2005
- Rat Genome Sequencing Project Consortium. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428:493-521, 2004
- Taylor, M.S.; Ponting, C.P. and Copley, R.R. Occurrence and consequences of coding sequence insertions and deletions in Mammalian genomes. Genome Research 14:555-566, 2004
Collaborations within the Unit
External collaborators
- Dr Nick Goldman (EMBL EBI)
- Dr Matt Sweet (Institute for Molecular Bioscience, Australia)
- Dr Kate Schroder (University of Lausanne, Switzerland)
- Dr Piero Carninci (RIKEN Omics Science Center, Japan)
- Professor Yoshihide Hayashizaki (RIKEN Omics Science Center, Japan)
Lab Members
Current lab members involved in this work are:
- Dr Martin Taylor - Group Leader
- Dr Alison Meynert - Career Development Fellow
- Dr Rob Young - Career Development Fellow (Joint with Prof Wendy Bickmore, to start 2012)
- Sarah Baker - PhD Student
- Joannah Pethick - PhD Student
- Harriet Kemp - PhD Student
- Sara Perricone - PhD Student (2nd supervisor, primary is Richard Meehan)
- Niamh Ryan - PhD Student (2nd supervisor, primary is Kathy Evans)
The big picture
The overarching objective of this research is to understand the content of the human genome: where and how biological functions are encoded, and what the remainder of the genome is (or has been) doing.
Our studies are focussed on three complementary themes: mutational processes, detecting selection, and the genomic basis of disease.
This is intrinsically interesting "basic research" that in several senses seeks to address a question asked at some point by every inquisitive mind: "Where do I come from?" Beyond this, our work complements laboratory-based experiments to explore the regulation of gene expression and the interpretation of genetic studies in cancer and other common diseases. Such research underpins our understanding of human biology and disease processes, ultimately translating into the rational design of diagnostics and therapies for human disease.
1. Mutational Processes
Ultimately, genetic mutations provide the raw material for evolution and are responsible for heritable disease. We are interested in the mechanisms that generate and repair new mutations and, perhaps more importantly, in the influence they have on the resulting pattern of nucleotide sequence changes. These patterns can be regional fluctuations in mutation rate or shifts in the spectrum of mutations, such as strand asymmetries or changes in the ratio of transitions to transversions. Our work has already shown that mutation rate can vary at a very fine scale and can specifically correlate with DNA function (Taylor et al. 2006) and packing (Figure 1). If not accounted for, the resulting patterns can readily be misinterpreted as evidence for selection (Taylor et al. 2008), confounding our interpretation of the genome.
Figure 1. Competing models of mutation and selection to explain substitution rate periodicity around nucleosomes (Semple and Taylor, Science 2009).
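To make the idea of a shift in the mutation spectrum concrete, the following minimal sketch counts transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse) between two aligned sequences. The function names and the example sequences are hypothetical, and the sketch assumes an existing gap-marked pairwise alignment; it is not the analysis code behind the results described here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Return 'transition', 'transversion', or None when the pair is not a
    clean single-base difference (identical bases, gaps, ambiguity codes)."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        return None
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def ts_tv_counts(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for a, b in zip(seq_a, seq_b):
        kind = classify_substitution(a, b)
        if kind == "transition":
            ts += 1
        elif kind == "transversion":
            tv += 1
    return ts, tv


def ts_tv_ratio(seq_a, seq_b):
    """Transition:transversion ratio, or None if there are no transversions."""
    ts, tv = ts_tv_counts(seq_a, seq_b)
    return ts / tv if tv else None


if __name__ == "__main__":
    # Toy alignment, for illustration only.
    print(ts_tv_counts("ACGTAC", "GCTTAT"))
    print(ts_tv_ratio("ACGTAC", "GCTTAT"))
```

Computing the ratio separately for different classes of region (for example, by annotation or by position relative to nucleosomes) is one simple way to look for regional shifts in the spectrum, although a raw count of this kind does not correct for multiple substitutions at the same site.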
Clusters of nucleotide changes and complex mutations encompassing multiple sites are of particular interest. They are generated by non-homologous recombination, gene conversion, error-prone repair and probably other processes that have not yet been discovered. Again, these can be misinterpreted as a signature of selection, but the imperfect copying and pasting of DNA sequence could be a major force in genome evolution, enabling complex multi-site changes to occur in a single mutational step.
2. Detecting Selection
The underlying principle of comparative genomics is to use the signal of past selection as an assay for biological function in genomic sequence. Natural selection can only act on genetic variation that manifests as phenotypic differences between individual organisms of a population. It is a stringent filter: even a 0.001% reduction in reproductive success will lead to a polymorphism being reliably removed from most mammalian populations (Piganeau and Eyre-Walker, 2003). We apply advanced evolutionary models and population genetic principles to understand the processes of selection that have shaped genomic sequences, allowing us to detect and interpret the encoded functions.
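The stringency of this filter can be illustrated with Kimura's classical diffusion approximation for the fixation probability of a new mutation. The sketch below is an assumption-laden illustration rather than the calculation behind the cited figure: it assumes a diploid population whose effective size equals its census size, additive (genic) selection, and a hypothetical effective population size of 100,000. The function names are also hypothetical.

```python
from math import expm1


def fixation_probability(s, n_e):
    """Kimura's approximation for a single new mutation (starting frequency
    1/(2*n_e)) with selection coefficient s in a diploid population.

    P_fix = (1 - exp(-2s)) / (1 - exp(-4 * n_e * s))
    """
    if s == 0:
        return 1.0 / (2 * n_e)  # neutral expectation
    exponent = -4 * n_e * s
    if exponent > 700:
        # Strongly deleterious: exp() would overflow and P_fix is effectively 0.
        return 0.0
    return expm1(-2 * s) / expm1(exponent)


def relative_to_neutral(s, n_e):
    """Fixation probability expressed as a fraction of the neutral value."""
    return fixation_probability(s, n_e) / fixation_probability(0, n_e)


if __name__ == "__main__":
    s = -0.00001       # a 0.001% reduction in reproductive success
    n_e = 100_000      # hypothetical effective population size
    print(relative_to_neutral(s, n_e))
```

Under these assumptions the deleterious variant fixes at roughly 7% of the neutral rate. The outcome depends strongly on the assumed effective population size, so the number should be read as an illustration of the principle and not as an estimate for any particular species.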
Evaluating the evolutionary trends of like-annotated regions (Figure 2) en masse across the genome is a strategy that lends itself well to integration with a wide diversity of functional genomic data. The general applicability of this approach to compare any meaningfully constructed sets of genome sequence makes this aspect of the work an ideal interface for collaboration with research groups based in the laboratory or clinic.
Figure 2. An overwhelming signal of purifying selection is evident in promoter regions relative to repetitive elements.
3. The genomic basis of disease
A major focus of research at the HGU and IGMM is the identification of loci, and ultimately specific polymorphisms, that contribute to disease risk. Current genotyping-based association studies depend on just a few alleles being responsible for most of the population risk contributed by a locus. In such studies it is widely believed that comparative genomics can help with the identification of functional variation, and that the detection of selection can add power to those analyses by removing the confounding influence of mutation rate variation. However, it is becoming increasingly apparent that much of the genetic risk for the most common human diseases such as heart disease, diabetes, cancer and mental illness lies not in common variants of individually small effect, but in rarer variants of greater effect.
We are adapting commonly used statistical frameworks from the field of molecular evolution to identify genes harbouring these rare variants based on case-control and case-only sequencing projects. With some modifications we are also adapting these approaches to study the selection pressures acting during the development and progression of cancer (Talavera et al. 2010).


