The 64 question that’s central to life
The structure of DNA indicates that every species on Earth is descended from a single common ancestor. But what if there are exceptions? Paul Davies explores in his latest column.
Since ancient Greece it has been appreciated that pattern and form pervade the living world, from the arrangements of leaves to the spiral shapes of shells. But it was only in the mid-twentieth century that scientists discovered mathematics at the very core of life.
In 1953 Francis Crick and James Watson discovered the famous double helix structure of DNA, in which the instructions for all known life are inscribed. The vital information is stored as sequences of four molecules, called bases, best known by their letters A, C, G and T. Rather than words, the letters spell out a mathematical code.
A key function of DNA is to specify the manufacture of proteins, the workhorses of biology. Proteins are made from small molecules, amino acids, strung end to end, typically several hundred in total. There are dozens of possible types of amino acids, but life uses only 20. The properties of a given protein depend on the precise sequence of amino acids that form it, and that sequence is specified by a segment of DNA. Translation between the four-letter alphabet of DNA and the 20-letter alphabet of proteins requires a code. Known simply as the genetic code, it was cracked in the 1960s.
If life used only four amino acids the arithmetic would be simple: each base could stand for a different amino acid. But to accommodate 20, the bases are grouped into triplets: ACT, GCA and so on. Each triplet is called a ‘codon’. There are 64 possible codons (four choices for each of three positions), which is more than enough to specify 20 amino acids. As a result, there are plenty of spare codons. Some are used for punctuation, but the rest code for the same 20 amino acids, implying a lot of redundancy. For example, the amino acid arginine is specified by six different codons: CGT, CGC, CGA, CGG, AGG, and AGA.
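The codon arithmetic is easy to check for yourself. The short Python sketch below simply enumerates every triplet of the four bases; the names `BASES`, `all_codons` and `ARGININE` are made up for this illustration and are not part of any standard library.

```python
from itertools import product

BASES = "ACGT"  # the four DNA letters

def all_codons(length=3):
    """Return every possible string of `length` bases, in alphabetical order."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]

codons = all_codons()
print(len(codons))   # 64, i.e. 4 * 4 * 4
print(codons[:4])    # ['AAA', 'AAC', 'AAG', 'AAT']

# The six arginine codons quoted in the text are all among the 64.
ARGININE = ["CGT", "CGC", "CGA", "CGG", "AGG", "AGA"]
print(all(c in codons for c in ARGININE))  # True
```

With 64 codons shared among 20 amino acids plus punctuation, some amino acids are bound to be spelled in more than one way, which is the redundancy described above.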
One of the most striking features of life on Earth is that the genetic code used by all known organisms is almost identical, with only a few differences across species. That implies it was used by the common ancestor of all life, billions of years ago. Given the combinatorial possibilities of specifying 20 amino acids from 64 codons, there are more potential codes than there are atoms in the universe, which prompts the question of whether the actual codes in use are special in some way.
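A rough count shows how large the space of possible codes is. The sketch below rests on two assumptions that are not in the column itself: that each of the 64 codons may be given any of 21 meanings (the 20 amino acids plus a single 'stop' signal), and that the number of atoms in the observable universe is of the order of 10^80, a commonly quoted estimate. It is a crude upper bound, since it also counts assignments that leave some amino acids out altogether.

```python
CODONS = 4 ** 3        # 64 triplets
MEANINGS = 20 + 1      # assumption: 20 amino acids plus one "stop" signal

# Each codon independently receives one of the meanings.
upper_bound = MEANINGS ** CODONS

ATOMS_ESTIMATE = 10 ** 80  # assumption: order-of-magnitude estimate only

print(len(str(upper_bound)))         # 85 (the count has 85 digits)
print(upper_bound > ATOMS_ESTIMATE)  # True
```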
Look again at the list of six codons that specify arginine – each differs from its neighbours by a single letter. And the other two codons starting with AG encode serine, an amino acid chemically similar to arginine. So if a mistake occurs in translation the protein is still likely to end up with arginine or a minimally disruptive substitution.
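The single-letter claim can be checked mechanically. The following Python sketch is only an illustration: the helper names are invented here, and it looks at the arginine codons alone rather than the whole code. It confirms that each codon in the list differs from the next by one letter, then counts how many single-letter errors would still yield arginine or one of the two AG serine codons.

```python
BASES = "ACGT"
ARGININE = ["CGT", "CGC", "CGA", "CGG", "AGG", "AGA"]
SERINE_AG = ["AGT", "AGC"]  # the other two codons starting with AG

def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))

def single_letter_neighbours(codon):
    """All codons reachable from `codon` by changing exactly one base."""
    return [codon[:i] + b + codon[i + 1:]
            for i in range(len(codon))
            for b in BASES if b != codon[i]]

# Consecutive codons in the list differ by exactly one letter.
steps = [hamming(a, b) for a, b in zip(ARGININE, ARGININE[1:])]
print(steps)  # [1, 1, 1, 1, 1]

# Of the 9 possible one-letter errors per codon, count the harmless ones.
still_arg = sum(n in ARGININE for c in ARGININE
                for n in single_letter_neighbours(c))
to_serine = sum(n in SERINE_AG for c in ARGININE
                for n in single_letter_neighbours(c))
print(still_arg, to_serine, 9 * len(ARGININE))  # 18 6 54
```

So, under this simple count, 18 of the 54 possible one-letter errors leave arginine unchanged and a further 6 swap it for serine.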
Looking across all the protein families, the code we ended up with seems to be particularly forgiving of mistakes.
If the code is optimised for robustness, then natural selection must have chosen it from among less efficient competitors. And that implies it hasn’t always been fixed, but co-evolved with primitive life. Which raises a fascinating possibility. Might there still exist microorganisms using an earlier, simpler version?
Such ‘living fossils’ could easily have been overlooked because searches for novel microbes are customised to spot the code as we know it and would not pick out any ‘aliens’. If a primitive form of life restricted its protein components to, say, 10 rather than 20 amino acids, then a doublet code involving pairs rather than triplets of bases would suffice, providing 16 codon possibilities. Finding such organisms would be the biggest advance in biology since Darwin.
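The doublet arithmetic follows the same rule as the triplet case: a codon of k bases can take 4^k forms. As a minimal sketch, the hypothetical helper `min_codon_length` below (a name invented for this example) finds the shortest codon that offers at least as many combinations as there are meanings to encode.

```python
def min_codon_length(meanings, alphabet_size=4):
    """Smallest k such that alphabet_size ** k >= meanings."""
    k = 1
    while alphabet_size ** k < meanings:
        k += 1
    return k

print(4 ** 2)                # 16 possible doublets
print(min_codon_length(10))  # 2: doublets are enough for 10 amino acids
print(min_codon_length(20))  # 3: 16 doublets fall short of 20, so triplets are needed
```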
An even more intriguing idea is that the code contains hidden mathematical patterns.
Peter Jarvis of the University of Tasmania presented evidence some years ago that a property known as supersymmetry – a particle physics concept for unifying particles of matter, like electrons, with particles that convey nature’s forces, such as photons – lies buried in the arrangement of the 64 coding assignments. Why such a pattern would be embedded in the code of life is a complete mystery.
Because the origin of the code, along with that of life, is lost, there is plenty of scope for speculation. Is there a code within the code, pointing to deep organisational principles yet to be uncovered?
Is 64 merely the tip of a numerical iceberg concealing a web of mathematical subtleties? The Greek philosophers would surely have agreed: to paraphrase Pythagoras, number is within all living things.