Causal Foundations of Biological Information

Big Question: Is biological information a substantive causal factor in living systems?

What is the source of order and purpose in living systems? This has been the key question at the boundary of biology and philosophy since the eighteenth century. Today most people would answer that organisms are driven by information, much of which has accumulated during evolution, and much of which is genetically transmitted. But this idea is surprisingly hard to cash out in terms of actual science, and many leading philosophers of biology have suggested that ‘biological information’ in this sense is merely a metaphor.

We set out to rectify this situation by providing a rigorous, general account of biological information as a causal factor in the operation of living systems. Our approach was inspired by Francis Crick, co-discoverer of the structure of DNA, who suggested that genetic information is the precise determination of the structure of gene products. We developed a measure of precise determination by combining classical information theory and causal graph theory. The measure of ‘Crick information’ that we devised is an example of a broader set of ‘causal information theory’ measures that are starting to be used in complex systems science. It is also a measure of biological specificity, one of the key organising concepts of 20th century biology.

Our measure is a substantial step towards a general theory of biological information. It provides a common measure for the contribution to the structure of a gene product made by both coding and non-coding sequences of DNA, by epigenetic modifications of DNA, and by the genetic or environmental causes of those epigenetic modifications. It allows us to compare the relative importance of these factors in the production of a single biomolecule. With it, we were able to restate more clearly many claims about the role of epigenetics and environment in development and evolution.

We have used our measure of information as precise determination to address a range of specific biological questions. First, we have taken existing models of the evolution of signals in biological networks that use classical information theory and shown that they can be improved by using a measure of causal information. Second, we have used our measure to restate in precise, quantitative terms the distinction between permissive and instructive causes in developmental biology. This work produced a fully implemented computer model that restates the ideas embodied in C.H. Waddington’s famous ‘developmental landscape’ in a simpler and more tractable form. Finally, we used our measure to model the evolution of ‘master control genes’ in gene regulatory networks. This work suggests that a measure of causal information is more appropriate for analysing the structure of gene regulatory networks than the topological measures that are standardly used. There is great potential to take this work further.

Our measure can in principle be applied to other phenotypes, but in practice this would require a great deal more data than is available, and it also runs up against the technical limitation that information theory can only be applied to discrete variables. We showed in principle that the same fundamental approach to measuring causation could be extended to continuous variables, which is a substantial contribution to the philosophy of causation. We made other significant contributions of this kind, taking a number of contested, qualitative ideas about causation, such as the ‘stability’ of a causal relationship, and restating them in precise terms.

Another aspect of Crick’s ideas about information was that it can be ‘transferred’ from one molecule to another. This idea can be explored using another branch of information theory, algorithmic information theory. We showed that developing an actual measure of this aspect of biological information would require devising a ‘language of the cell’ that captures the complexity of a problem for a cell, rather than for a human programmer. This idea raises exciting prospects for future research.

Our work has been presented and published in both philosophical and biological venues and some of our results can be put to immediate work in biology.