Is the Optimal Set of Protein Amino Acids Purposed by a Mind?


By Fazale Rana – October 9, 2019

To get our assays to work properly, we had to carefully design and optimize each test before executing it with exacting precision in the laboratory. Optimizing these assays was no easy feat. It could take weeks of painstaking effort to get the protocols just right.

My experiences working in the lab taught me some important lessons that I carry with me today as a Christian apologist. One of these lessons has to do with optimization. Optimized systems don’t just happen, whether they are laboratory procedures, manufacturing operations, or well-designed objects or devices. Instead, optimization results from the insights and efforts of intelligent agents, and therefore serves as a sure indicator of intelligent design.

As it turns out, nearly every biochemical system appears to be highly optimized. For me, this fact indicates that life stems from a Mind. And as life scientists continue to characterize biochemical systems, they keep discovering more and more examples of biochemical optimization, as recent work by a large team of collaborators working at the Earth-Life Science Institute (ELSI) in Tokyo, Japan, illustrates.1

These researchers uncovered more evidence that the twenty amino acids encoded by the genetic code possess the optimal set of physicochemical properties. If not for these properties, it would not be possible for the cell to build proteins that could support the wide range of activities required to sustain living systems. This insight gives us important perspective into the structure-function relationships of proteins. It also has theological significance, adding to the biochemical case for a Creator.

Before describing the ELSI team’s work and its theological implications, a little background might be helpful for some readers. Those who are familiar with basic biochemistry can skip ahead to “Why These Twenty Amino Acids?”

Background: Protein Structure

Proteins are large, complex molecules that play a key role in virtually all of the cell’s operations. Biochemists have long known that the three-dimensional structure of a protein dictates its function. Because proteins are such large, complex molecules, biochemists categorize protein structure into four different levels: primary, secondary, tertiary, and quaternary structures.


Figure 1: The Four Levels of Protein Structure. Image credit: Shutterstock

  • A protein’s primary structure is the linear sequence of amino acids that make up each of its polypeptide chains.
  • The secondary structure refers to short-range three-dimensional arrangements of the polypeptide chain’s backbone arising from the interactions between chemical groups that make up its backbone. Three of the most common secondary structures are the random coil, alpha (α) helix, and beta (β) pleated sheet.
  • Tertiary structure describes the overall shape of the entire polypeptide chain and the location of each of its atoms in three-dimensional space. The structure and spatial orientation of the chemical groups that extend from the protein backbone are also part of the tertiary structure.
  • Quaternary structure arises when several individual polypeptide chains interact to form a functional protein complex.

Background: Amino Acids

The building blocks of proteins are amino acids. These compounds are characterized by having both an amino group and a carboxylic acid bound to a central carbon atom. Also bound to this carbon are a hydrogen atom and a substituent that biochemists call an R group.


Figure 2: The Structure of a Typical Amino Acid. Image credit: Shutterstock

The R group determines the amino acid’s identity. For example, if the R group is hydrogen, the amino acid is called glycine. If the R group is a methyl group, the amino acid is called alanine.

Close to 150 amino acids are found in proteins. But only 19 amino acids (plus 1 imino acid, called proline) are specified by the genetic code. Biochemists refer to these 20 as the canonical set.


Figure 3: The Protein-Forming Amino Acids. Image credit: Shutterstock

A protein’s primary structure forms when amino acids react with each other to form a linear chain, with the amino group of one amino acid combining with the carboxylic acid of another to form an amide linkage. (Sometimes biochemists call the linkage a peptide bond.)


Figure 4: The Chemical Linkage between Amino Acids. Image credit: Shutterstock

The repeating amide linkages along the amino acid chain form the protein’s backbone. The amino acids’ R groups extend from the backbone, creating a distinct physicochemical profile along the protein chain for each unique amino acid sequence. To first approximation, this unique physicochemical profile dictates the protein’s higher-order structures and, hence, the protein’s function.

Why These Twenty Amino Acids?

Research has revealed that the set of amino acids used to build proteins is universal. In other words, the proteins found in every organism on Earth are made up of the same canonical set.

Biochemists have long wondered: Why these 20 amino acids?

In the early 1980s biochemists discovered that an exquisite molecular rationale undergirds the amino acid set used to make proteins.2 Every aspect of amino acid structure has to be precisely the way it is for life to be possible. On top of that, biochemists concluded that the set of 20 amino acids possesses the “just-right” physical and chemical properties that evenly and uniformly vary across a broad range of size, charge, and hydrophobicity (water resistance). In fact, the amino acids selected for proteins appear to form a uniquely optimal set of 20 when compared to random sets of amino acids.3

With these previous studies as a backdrop, the ELSI investigators wanted to develop a better understanding of the optimal nature of the universal set of amino acids used to build proteins. They also wanted to gain insight into the origin of the canonical set.

To do this they used a library of 1,913 amino acids (including the 20 amino acids that make up the canonical set) to construct random sets of amino acids. The researchers varied the set sizes from 3 to 20 amino acids and evaluated the performance of the random sets in terms of their capacity to support: (1) the folding of protein chains into three-dimensional structures; (2) protein catalytic activity; and (3) protein solubility.

They discovered that if a random set of amino acids included even a single amino acid from the canonical set, it dramatically outperformed random sets of the same size without any of the canonical amino acids. Based on these results, the researchers concluded that each of the 20 amino acids used to build proteins stands out, possessing highly unusual properties that make it ideally suited for its biochemical role, confirming the results of previous studies.
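The sampling design described above can be sketched in a few lines of Python. This is only an illustration of drawing random sets of sizes 3 to 20 with and without canonical members. The library is a list of placeholder labels, and the names `LIBRARY`, `CANONICAL`, and `draw_set` are hypothetical. The sketch does not attempt to reproduce the ELSI team's scoring of foldability, catalytic activity, or solubility.

```python
import random

# Placeholder labels standing in for the 1,913-member library; the first 20
# labels stand in for the canonical set. No real chemical data is used here.
LIBRARY = [f"aa{i:04d}" for i in range(1913)]
CANONICAL = set(LIBRARY[:20])
NON_CANONICAL = [aa for aa in LIBRARY if aa not in CANONICAL]


def draw_set(size, n_canonical, rng):
    """Draw a random set of `size` amino acids containing exactly
    `n_canonical` members of the canonical set."""
    if not 0 <= n_canonical <= min(size, len(CANONICAL)):
        raise ValueError("n_canonical must fit within the set size")
    chosen = rng.sample(sorted(CANONICAL), n_canonical)
    chosen += rng.sample(NON_CANONICAL, size - n_canonical)
    return set(chosen)


rng = random.Random(0)  # fixed seed so the draws are repeatable
for size in range(3, 21):  # set sizes 3 through 20, as in the study
    with_one = draw_set(size, 1, rng)
    with_none = draw_set(size, 0, rng)
    # A real analysis would now score both sets and compare them.
    assert len(with_one) == size and len(with_none) == size
```

Pairing each set that contains one canonical member with a same-sized set that contains none mirrors the comparison the researchers report. The scoring step is deliberately left out.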

An Evolutionary Origin for the Canonical Set?

The ELSI researchers believe that, from an evolutionary standpoint, these results also shed light on how the canonical set of amino acids emerged. Because of the unique adaptive properties of the canonical amino acids, the researchers speculate that “each time a CAA [canonical amino acid] was discovered and embedded during evolution, it provided an adaptive value unusual among many alternatives, and each selective step may have helped bootstrap the developing set to include still more CAAs.”4

In other words, the researchers offer the conjecture that whenever the evolutionary process stumbled upon one of the amino acids in the canonical set and incorporated it into nascent biochemical systems, the addition offered such a significant evolutionary advantage that it became instantiated into the biochemistry of the emerging cellular systems. Presumably, as this selection process occurred repeatedly over time, members of the canonical set would be added, one by one, to the evolving amino acid set, eventually culminating in the full canonical set.

Scientists find further support for this scenario in the following observation: some of the canonical amino acids seemingly play a more important role in optimizing smaller sets of amino acids, some play a more important role in optimizing intermediate size sets of amino acids, and others play a more prominent role in optimizing larger sets. They argue that this difference may reflect the sequence by which amino acids were added to the evolving set of amino acids as life emerged.

On the surface, this evolutionary explanation is not unreasonable. But more careful consideration of the idea raises concerns. For example, just because a canonical amino acid becomes incorporated into a set of amino acids and improves its adaptive value doesn’t mean that the resulting set of amino acids could produce the range of proteins with the solubility, foldability, and catalytic range needed to support life processes. Intuitively, it seems to me as a biochemist that there must be a threshold for the number of canonical amino acids in any set of amino acids for it to have the range of physicochemical properties required to build all the proteins needed to support minimal life.

I also question this evolutionary scenario because some of the amino acids that optimize smaller sets would not have been the ones present initially on the early Earth because they cannot be made by prebiotic reactions. Instead, many of the amino acids that optimize smaller sets can only be generated through biosynthetic routes that must have emerged much later in any evolutionary scenario for the origin of life.5 This limitation also means that the only way for some of the canonical amino acids to become incorporated into the canonical set is that multi-step biosynthetic routes for those amino acids evolved first. But if the full canonical set isn’t available, then it is questionable whether the proteins needed to catalyze the biosynthesis of these amino acids would exist, resulting in a chicken-and-egg dilemma.

In light of these concerns, is there a better explanation for the highly optimized canonical set of amino acids?

A Creator’s Role?

Optimality of the universal set of protein amino acids finds explanation if life stems from a Creator’s handiwork. As noted, optimization is an indicator of intelligent design, achieved through foresight and preplanning. Optimization requires inordinate attention to detail and careful craftsmanship. By analogy, the optimized biochemistry epitomized by the amino acid set that makes up proteins rationally points to the work of a Creator.

Is There a Biochemical Anthropic Principle?

This discovery also leads to another philosophical implication: It lends support to the existence of a biochemical anthropic principle.

The ELSI researchers speculate that no matter the starting point in the evolutionary process, the pathways will all converge at the canonical set of amino acids because of the acids’ unusual adaptive properties. In other words, the amino acids that make up the universal set of protein-coding amino acids are not the outworking of an historically contingent evolutionary process, but instead seem to be fundamentally prescribed by the laws of nature. To put it differently, it appears as if the canonical set of amino acids has been preordained in some way.6 One of the study’s authors, Rudrarup Bose, suggests that “Life may not be just a set of accidental events. Rather, there may be some universal laws governing the evolution of life.”7

Though I prefer to see the origin of life as a creation event, it is important to recognize that even if one were to adopt an evolutionary perspective on life’s origin, it looks as if a Mind is responsible for rigging the process to reach a predetermined endpoint. It looks as if a Mind purposed for life to be present in the universe and structured the laws of nature so that, in this case, the uniquely optimal canonical set of amino acids would inevitably emerge.

Along these lines, it is remarkable to think that the canonical set of amino acids has the precise properties needed for life to exist. This “coincidence” is eerie, to say the least. As a biochemist, I interpret this coincidence as evidence that our universe has been designed for a purpose. It is provocative to think that regardless of one’s perspective on the origin of life, the evidence converges toward a single conclusion: namely that life manifests from an intelligent agent—God.


Resources

The Optimality of Biochemical Systems

The Biochemical Anthropic Principle

  1. Melissa Ilardo et al., “Adaptive Properties of the Genetically Encoded Amino Acid Alphabet Are Inherited from Its Subsets,” Scientific Reports 9, no. 12468 (August 28, 2019), doi:10.1038/s41598-019-47574-x.
  2. Arthur L. Weber and Stanley L. Miller, “Reasons for the Occurrence of the Twenty Coded Protein Amino Acids,” Journal of Molecular Evolution 17, no. 5 (September 1981): 273–84, doi:10.1007/BF01795749; H. James Cleaves II, “The Origin of the Biologically Coded Amino Acids,” Journal of Theoretical Biology 263, no. 4 (April 2010): 490–98, doi:10.1016/j.jtbi.2009.12.014.
  3. Gayle K. Philip and Stephen J. Freeland, “Did Evolution Select a Nonrandom ‘Alphabet’ of Amino Acids?” Astrobiology 11, no. 3 (April 2011): 235–40, doi:10.1089/ast.2010.0567; Matthias Granold et al., “Modern Diversification of the Amino Acid Repertoire Driven by Oxygen,” Proceedings of the National Academy of Sciences, USA 115, no. 1 (January 2, 2018): 41–46, doi:10.1073/pnas.1717100115.
  4. Ilardo et al., “Adaptive Properties.”
  5. J. Tze-Fei Wong and Patricia M. Bronskill, “Inadequacy of Prebiotic Synthesis as Origin of Proteinous Amino Acids,” Journal of Molecular Evolution 13, no. 2 (June 1979): 115–25, doi:10.1007/BF01732867.
  6. Tokyo Institute of Technology, “Scientists Find Biology’s Optimal ‘Molecular Alphabet’ May Be Preordained,” ScienceDaily, September 10, 2019,
  7. Tokyo Institute, “Scientists Find.”

Reprinted with permission by the author

Original article at:

Simple Biological Rules Affirm Creation


By Fazale Rana – September 4, 2019

“Biology is the study of complicated things that give the appearance of having been designed for a purpose.”
–Richard Dawkins

To say that biological systems are complicated is an understatement.

When I was in college, I had some friends who avoided taking courses in the life sciences because of the complexity of biological systems. On the other hand, I found the complexity alluring. It’s what drew me to biochemistry. I love to immerse myself in the seemingly never-ending intricacies of biomolecular systems and try to make sense of them.

Perhaps nothing exemplifies the daunting complexity of biochemistry more than intermediary metabolism.

Order in the Midst of Biochemical Complexity

I remember a conversation I had years ago with a first-year graduate student who worked in the same lab as me when I was a postdoc at the University of Virginia. He was complaining about all the memorization he had to do for the course he was taking on intermediary metabolism. How else was he going to become conversant with all the different metabolic routes in the cell?

I told him that he was approaching his classwork in the wrong way. Despite the complexity and chemical diversity of the metabolic pathways in the cell, a set of principles exists that dictates the architecture and operation of metabolic routes. I encouraged my lab mate to learn these principles because, once he did, he would be able to use them to write out all of the metabolic routes with minimal memorization.

These principles make sense of the complexity of intermediary metabolism. Are there similar rules that make sense of biological diversity and complexity?

Rules Govern Biological Systems

As it turns out, the insight I offered my lab mate may well have been prescient.

The idea that a simple set of principles—rules, if you will—accounts for the complexity and diversity of biological systems may be more widespread than life scientists fully appreciate. At least it appears this way based on work carried out recently by researchers from Duke University.1 These investigators discovered a simple rule that predicts the behavior of mutually beneficial symbiotic relationships (mutualism) in ecosystems. Mutualistic interactions play an important and dominant role in ecosystem stability.

The Duke University scientists’ accomplishment represents a significant milestone. Lingchong You, one of the study’s authors, points out the difficulty of finding rules that govern all biological systems:

“In a perfect world, you’d be able to follow a simple set of molecular rules to understand how every biological system operated. But, in reality, it’s difficult to establish rules that encompass the immense diversity and complexity of biological systems. Even when we do establish general rules, it’s still challenging to use them to explain and quantify various physical properties.”2

Yet, You and his collaborators have done just that for mutualism. Their insight moves biology closer to physics and chemistry where simple rules can account for the physical world. Their work holds the potential to open up new vistas in the life sciences that can lead to a deeper, more fundamental understanding of biological systems.

In fact, the researchers think that simple rules dictating the operation of biological systems may not be an unusual feature of mutualistic interactions but may apply more broadly. They write, “Beyond establishing another simple rule . . . we also demonstrated that one can purposefully seek an appropriate abstraction level where a simple unifying rule emerges over system diversity.”3

If the Duke University scientists’ insight generally applies to biological systems, it has interesting theological implications. If biological systems do, indeed, conform to a simple set of rules, it becomes more reasonable to think that a Creator played a role in the origin, history, and design of life.

I’ll explain how in a moment, but first let’s take a look at some details of the Duke University investigators’ work.

Mutualism and Ecosystem Stability

Biological organisms often form symbiotic relationships. When these relationships benefit all of the organisms involved, it is called mutualism. These mutualistic relationships are vital to ecosystems and they directly and indirectly benefit humanity. For example, coral reefs depend on mutualistic interactions between coral and algae. In turn, reefs provide habitats for a diverse ensemble of organisms that support human life and flourishing.

Unfortunately, mutualistic systems can collapse when one or more of the partners experiences stress or disappears from the ecosystem. A disruption in a relationship can lead to the loss of other members of the ecosystem, thereby altering the ecosystem’s composition and opening up niches for invading organisms. Sadly, this type of collapse is happening in coral reefs around the world today.

Mutualism Can Be Explained by a Simple Rule

To gain insight into the rules that dictate ecosystem stability and predict collapse (due to a loss of mutualistic relationships), the Duke University researchers sought to develop a framework that would allow them to determine the outcome of mutualistic interactions. For the predictive framework, the scientists wrote 52 mathematical equations, each one specifically describing one of the various forms of mutualism. These equations were based on a simple biological logic; namely, mutualism consists of two or more populations of organisms that produce, at a cost (C), a benefit (B) for all the organisms that reduces the stress (S) they experience.

Mathematical analysis of these equations allowed the researchers to discover a simple inequality that governs the transition from coexistence to collapse. As it turns out, mutualistic interactions remain stable when B > S, and they collapse when this inequality does not hold. Though intuitive, it is still remarkable that this simple relationship dictates the behavior of all types of mutualism.
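The rule can be written as a one-line check. The sketch below is a minimal illustration, assuming B and S have already been expressed on a common scale. The function name `predict_outcome` and the example numbers are hypothetical and are not taken from the Duke study.

```python
def predict_outcome(benefit, stress):
    """Apply the B > S rule: coexistence if benefit exceeds stress,
    otherwise collapse."""
    return "coexistence" if benefit > stress else "collapse"


# Illustrative values only; these are not measurements.
print(predict_outcome(benefit=1.5, stress=1.0))  # coexistence
print(predict_outcome(benefit=0.8, stress=1.0))  # collapse
```

The hard part in practice is estimating B, not applying the inequality, which is why the check itself is trivial.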

The researchers learned that determining the value of S is relatively straightforward. On the other hand, quantifying B proves to be a challenge due to the large number of variables such as temperature, nutrient availability, genetic variation, etc., that influence mutualistic interactions. To work around this problem, the researchers developed a machine-learning algorithm that could calculate B using the input of a large number of variables.

This work has obvious importance for ecologists as ecosystems all over the planet face collapse. Beyond that, it has important theological implications when we recognize that a simple mathematical equation governs the behavior of mutualistic relationships among organisms.

Let me explain.

The Case for a Creator

From my vantage point, one of the most intriguing aspects of our universe is its intelligibility and our capacity as human beings to make sense of the world around us—quite often, through the use of simple rules we have discovered. Along these lines, it is even more remarkable that the universe and its phenomena can be described using mathematical relationships, which reflects an underlying rationale to the universe itself.

For most of the history of science, the discovery and exploration of the mathematical nature of the universe has been confined to physics and, to a lesser extent, chemistry. Because of the complexity and diversity of biological systems, many people working in the life sciences have questioned if simple mathematical rules exist in biology and could ever be discovered.

But the discovery of a simple rule that predicts the behavior of mutualistic relationships in ecosystems suggests that mathematical relationships do describe and govern biological phenomena. And, as the researchers point out, their discovery may turn out to be the rule rather than the exception.

From my perspective, a universe governed by mathematical relationships suggests that a deep, underlying rationale undergirds nature, which is precisely what I would expect if a Mind were behind the universe. To put it differently, as a Christian, I would expect that if a Creator were responsible for the universe, mathematical relationships would define its structure and function. In like manner, if living systems originated from a Creator, it would make sense that biological systems would possess an underlying mathematical structure as well, though it might be hard for us to discern these relationships because of the systems’ complexity.


Figure: The Mathematical Universe. Image credit: Shutterstock.

The mathematical structure of the universe—and maybe even of biology—makes the world around us intelligible. And intelligibility is precisely what we would expect if the universe and everything in it were the products of a Creator—one who desired to make himself known to us through the creation (Romans 1:20). It is also what we would expect if human beings were made in God’s image (as Scripture describes), with the capacity to discern God’s handiwork in the world around us.

A Case against Materialism

But what if humans—including our minds—were cobbled together by evolutionary processes? Why would we expect human beings to be capable of making sense of the world around us? For that matter, why would we expect the universe—including the biological realm—to adhere to mathematical relationships?

In other words, the mathematical undergirding of nature fits better in a theistic conception of reality than one rooted in materialism. And toward that end, the discovery by the Duke University investigators points to God’s role in the origin and design of life.

Is There a Biological Anthropic Principle?

As the Duke University scientists show, the discovery of a simple mathematical relationship describing the behavior of mutualistic interactions in ecosystems suggests that these types of relationships may be more commonplace than most life scientists thought or imagined. (See Biochemical Anthropic Principle in the Resources section.)

This discovery also suggests that a cornerstone feature of ecosystems—mutualistic relationships—is not the haphazard product of evolutionary history. Instead, scientists observe a process fundamentally dictated and constrained by the laws of nature as revealed in the simple mathematical rule that describes the behavior of these systems. We can infer that mutualism within ecosystems may not be the outworking of chance events—the consequence of a historically contingent evolutionary process. Rather, these relationships appear to be fundamentally prescribed by the design of the universe. In other words, mutualism in ecosystems is inevitable in a universe like ours.

For me, it is eerie to think that mutualism, which appears to be specified by the laws of nature, is precisely what is needed to maintain stable ecosystems. The universe appears to be structured in a just-right way so that stable ecosystems result. If the universe were any other way, neither mutualism nor ecosystems would exist.

One way to interpret this “coincidence” is to view it as evidence that our universe has been designed for a purpose. And purpose must come from a Mind—namely, God.


Resources

The Argument from Math and Beauty

Designed for Discovery

The Biochemical Anthropic Principle

The Design of Intermediary Metabolism

  1. Feilun Wu et al., “A Unifying Framework for Interpreting and Predicting Mutualistic Systems,” Nature Communications 10 (2019): 242, doi:10.1038/s41467-018-08188-5.
  2. Duke University, “Simple Rules Predict and Explain Biological Mutualism,” ScienceDaily (January 16, 2019),
  3. Wu et al., “A Unifying Framework.”

Reprinted with permission by the author

Original article at:

Biochemical Grammar Communicates the Case for Creation


As I get older, I find myself forgetting things—a lot. But, thanks to smartphone technology, I have learned how to manage my forgetfulness by using the “Notes” app on my iPhone.


Figure 1: The Apple Notes app icon. Image credit: Wikipedia

This app makes it easy for me to:

  • Jot down ideas that suddenly come to me
  • List books I want to read and websites I want to visit
  • Make note of musical artists I want to check out
  • Record “to do” and grocery lists
  • Write down details I need to have at my fingertips when I travel
  • List new scientific discoveries with implications for the RTB creation model that I want to blog about, such as the recent discovery of a protein grammar calling attention to the elegant design of biochemical systems

And the list goes on. I will never forget again!

On top of that, I can use the Notes app to categorize and organize all my notes and house them in a single location. Thus, I don’t have to manage scraps of paper that invariably wind up getting scattered all over the place—and often lost.

And, as a bonus, the Notes app anticipates the next word I am going to use even before I type it. I find myself relying on this feature more and more. It is much easier to select a word than type it out. In fact, the more I use this feature, the better the app becomes at anticipating the next word I want to type.

Recently, a team of bioinformaticists from the University of Alabama, Birmingham (UAB) and the National Institutes of Health (NIH) used the same algorithm the Notes app uses to anticipate word usage to study protein architectures.1 Their analysis reveals new insight into the structural features of proteins and also highlights the analogy between the information housed in these biomolecules and human language. This analogy contributes to the revitalized Watchmaker argument presented in my book The Cell’s Design.

N-Gram Language Modeling

The algorithm used by the Notes app to anticipate the next word the user will likely type is called n-gram language modeling. This algorithm determines the probability of a word being used based on the previous word (or words) typed. (If the probability depends on the word alone, with no preceding context, it is called a unigram probability. If the calculation is based on the single previous word, it is called a bigram probability; if it is based on the previous two words, a trigram probability, and so on.) This algorithm “trains” the Notes app so that the more I use it, the more reliable the calculated probabilities and, hence, the better the word recommendations.
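A bigram model is simple enough to sketch directly. The example below is a minimal illustration of the general technique, not the Notes app's actual implementation. The function names `train_bigrams` and `next_word_probabilities` and the tiny training text are made up for this sketch.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, prev):
    """Return P(next word | previous word) estimated from the counts."""
    followers = counts.get(prev.lower(), Counter())
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()} if total else {}


corpus = "buy milk and buy eggs and buy milk"
model = train_bigrams(corpus)
print(next_word_probabilities(model, "buy"))
```

In this toy text, "milk" follows "buy" twice and "eggs" follows it once, so the model ranks "milk" as the likelier suggestion after "buy". Feeding the model more text refines the counts, which is the sense in which usage "trains" it.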

N-Gram Language Modeling and the Case for a Creator

To understand why the work of the research team from UAB and NIH provides evidence for a Creator’s role in the origin and design of life, a brief review of protein structure is in order.

Protein Structure

Proteins are large complex molecules that play a key role in virtually all of the cell’s operations. Biochemists have long known that the three-dimensional structure of a protein dictates its function.

Because proteins are such large complex molecules, biochemists categorize protein structure into four different levels: primary, secondary, tertiary, and quaternary structures. A protein’s primary structure is the linear sequence of amino acids that make up each of its polypeptide chains.

The secondary structure refers to short-range three-dimensional arrangements of the polypeptide chain’s backbone arising from the interactions between chemical groups that make up its backbone. Three of the most common secondary structures are the random coil, alpha (α) helix, and beta (β) pleated sheet.

Tertiary structure describes the overall shape of the entire polypeptide chain and the location of each of its atoms in three-dimensional space. The structure and spatial orientation of the chemical groups that extend from the protein backbone are also part of the tertiary structure.

Quaternary structure arises when several individual polypeptide chains interact to form a functional protein complex.



Figure 2: The four levels of protein structure. Image credit: Shutterstock

Protein Domains

Within the tertiary structure of proteins, biochemists have discovered compact, self-contained regions that fold independently. These three-dimensional regions of the protein’s structure are called domains. Some proteins consist of a single compact domain, but many proteins possess several domains. In effect, domains can be thought to be the fundamental units of a protein’s tertiary structure. Each domain possesses a unique biochemical function. Biochemists refer to the spatial arrangement of domains as a protein’s domain architecture.

Researchers have discovered several thousand distinct protein domains. Many of these domains recur in different proteins, with each protein’s tertiary structure comprised of a mix-and-match combination of protein domains. Biochemists have also learned that a relationship exists between the complexity of an organism and the number of unique domains found in its set of proteins and the number of multi-domain proteins encoded by its genome.


Figure 3: Pyruvate kinase, an example of a protein with three domains. Image credit: Wikipedia

The Key Question in Protein Chemistry

As much progress as biochemists have made characterizing protein structure over the last several decades, they still lack a fundamental understanding of the relationship between primary structure (the amino acid sequence) and tertiary structure and, hence, protein function. In order to develop this insight, they need to determine the “rules” that dictate the way proteins fold. Treating proteins as information systems can help determine some of these rules.

Proteins as Information Systems

Proteins are not only large, complex molecules but also information-harboring systems. The amino acid sequence that defines a protein’s primary structure is a type of information—biochemical information—with the individual amino acids analogous to the letters that make up an alphabet.

N-Gram Analysis of Proteins

To gain insight into the relationship between a protein’s primary structure and its tertiary structures, the researchers from UAB and NIH carried out an n-gram analysis on the 23 million protein domains found in the protein sets of 4,800 species found across all three domains of life.

These researchers point out that an individual amino acid in a protein’s primary structure doesn’t contain information, just as an individual letter in an alphabet doesn’t harbor any meaning. In human language, the most basic unit that conveys meaning is a word. And, in proteins, the most basic unit that conveys biochemical meaning is a domain.

To decipher the “grammar” used by proteins, the researchers treated adjacent pairs of protein domains in the tertiary structure of each protein in the sample set as a bigram (similar to two words together). Surveying the proteins found in their data set of 4,800 species, they discovered that 95% of all the possible domain combinations don’t exist!

This finding is key. It indicates that there are, indeed, rules that dictate the way domains interact. In other words, just like certain word combinations never occur in human languages because of the rules of grammar, there appears to be a protein “grammar” that constrains the domain combinations in proteins. This insight implies that physicochemical constraints (which define protein grammar) dictate a protein’s tertiary structure, preventing 95% of conceivable domain-domain interactions.
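For readers who want to see the mechanics, the bigram survey described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the researchers' actual pipeline: the toy `proteins` list, the single-letter domain names, and the helper functions are all hypothetical, and it assumes that "possible combinations" means every ordered pair of domains in the vocabulary, including a domain paired with itself.

```python
from collections import Counter

# Hypothetical toy data: each protein is written as an ordered list of
# domain names. Real domain architectures would come from a domain database.
proteins = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C", "C"],
    ["D"],
]

def domain_bigrams(architecture):
    """Return the adjacent (ordered) domain pairs in one protein."""
    return list(zip(architecture, architecture[1:]))

def bigram_counts(proteins):
    """Count how often each adjacent domain pair occurs across all proteins."""
    counts = Counter()
    for architecture in proteins:
        counts.update(domain_bigrams(architecture))
    return counts

def unobserved_fraction(proteins):
    """Fraction of conceivable ordered domain pairs that never occur.

    Assumption: the space of possible bigrams is every ordered pair drawn
    from the observed domain vocabulary (vocabulary size squared).
    """
    vocabulary = {domain for architecture in proteins for domain in architecture}
    possible = len(vocabulary) ** 2
    observed = len(bigram_counts(proteins))
    return 1 - observed / possible

print(bigram_counts(proteins))
print(unobserved_fraction(proteins))  # 0.8125 for this toy data set
```

In this toy data set, only 3 of the 16 conceivable ordered pairs actually appear, so about 81% of the combinations are "missing." The study's 95% figure reflects the same kind of tally carried out over a vastly larger collection of proteins.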

Entropy of Protein Grammar

In thermodynamics, entropy is often used as a measure of the disorder of a system. Information theorists borrow the concept of entropy and use it to measure the information content of a system. For information theorists, the entropy of a system is inversely related to the amount of information contained in a sequence of symbols. As the information content increases, the entropy of the sequence decreases, and vice versa. Using this concept, the UAB and NIH researchers calculated the entropy of the protein domain combinations.
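To make the idea of an entropy calculation concrete, here is a minimal sketch that computes the standard Shannon entropy, H = -Σ p·log2(p), over a table of domain-pair counts. It is offered under stated assumptions rather than as the study's method: the published analysis may well use a different or more elaborate entropy measure, and the two count tables below are invented purely for illustration.

```python
import math
from collections import Counter

def shannon_entropy(counts):
    """Shannon entropy (in bits) of the distribution implied by raw counts."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total)
        for c in counts.values()
        if c > 0
    )

# Hypothetical count tables for four domain pairs.
# "uniform": every pair is equally common, so usage is unconstrained.
uniform = Counter({("A", "B"): 1, ("B", "C"): 1, ("C", "A"): 1, ("A", "C"): 1})
# "skewed": one pair dominates, so usage is highly constrained.
skewed = Counter({("A", "B"): 97, ("B", "C"): 1, ("C", "A"): 1, ("A", "C"): 1})

print(shannon_entropy(uniform))  # 2.0 bits
print(shannon_entropy(skewed))   # lower than the uniform case
```

The comparison shows the general behavior of the measure: when a few combinations dominate, the entropy of the distribution drops, and when all combinations are equally likely, it reaches its maximum for that number of combinations.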

In human language, the entropy increases as the vocabulary increases. This makes sense because, as the number of words increases in a language, the likelihood that random word combinations would harbor meaning decreases. In like manner, the research team discovered that the entropy of the protein grammar increases as the number of domains increases. (This increase in entropy likely reflects the physicochemical constraints—the protein grammar, if you will—on domain interactions.)

Human languages all carry the same amount of information. That is to say, they all display the same entropy content. Information theorists interpret this observation as an indication that a universal grammar undergirds all human languages. It is intriguing that the researchers discovered that the protein “languages” across prokaryotes and eukaryotes all display the same level of entropy and, consequently, the same information content. This relationship holds despite the diversity and differences in complexity of the organisms in their data set. By analogy, this finding indicates that a universal grammar exists for proteins. Or to put it another way, the same set of physicochemical constraints dictates the way protein domains interact for all organisms.

At this point, the researchers don’t know what the grammatical rules are for proteins, but knowing that they exist paves the way for future studies. It also generates hope that one day biochemists might understand them and, in turn, use them to predict protein structure from amino acid sequences.

This study also illustrates how fruitful it can be to treat biochemical systems as information systems. The researchers conclude that “The similarities between natural languages and genomes are apparent when domains are treated as functional analogs of words in natural languages.”2

In my view, it is this relationship that points to a Creator’s role in the origin and design of life.

Protein Grammar and the Case for a Creator

As discussed in The Cell’s Design, the recognition that biochemical systems are information-based systems has interesting philosophical ramifications. Common, everyday experience teaches that information derives solely from the activity of human beings. So, by analogy, biochemical information systems, too, should come from a divine Mind. Or at least it is rational to hold that view.

But the case for a Creator strengthens when we recognize that it’s not merely the presence of information in biomolecules that contributes to this version of a revitalized Watchmaker analogy. Added vigor comes from the UAB and NIH researchers’ discovery that the mathematical structure of human languages and biochemical languages is identical.

Skeptics often dismiss the updated Watchmaker argument by arguing that biochemical information is not genuine information. Instead, they maintain that when scientists refer to biomolecules as harboring information, they are employing an illustrative analogy—a scientific metaphor—and nothing more. They accuse creationists and intelligent design proponents of misconstruing their use of analogical language to make the case for design.3

But the UAB and NIH scientists’ work questions the validity of this objection. Biochemical information has all of the properties of human language. It really is information, just like the information we conceive and use to communicate.

Is There a Biochemical Anthropic Principle?

This discovery also yields another interesting philosophical implication. It lends support to the existence of a biochemical anthropic principle. Discovery of a protein grammar means that there are physicochemical constraints on protein structure. It is remarkable to think that protein tertiary structures may be fundamentally dictated by the laws of nature, instead of being the outworking of a historically contingent evolutionary process. To put it differently, the discovery of a protein grammar reveals that the structure of biological systems may reflect some deep, underlying principles that arise from the very nature of the universe itself. And yet these structures are precisely the types of structures life needs to exist.

I interpret this “coincidence” as evidence that our universe has been designed for a purpose. And as a Christian, I find that notion to resonate powerfully with the idea that life manifests from an intelligent Agent—namely, God.

Resources to Dig Deeper

  1. Lijia Yu et al., “Grammar of Protein Domain Architectures,” Proceedings of the National Academy of Sciences, USA 116, no. 9 (February 26, 2019): 3636–45, doi:10.1073/pnas.1814684116.
  2. Yu et al., 3636–45.
  3. For example, see Massimo Pigliucci and Maarten Boudry, “Why Machine-Information Metaphors Are Bad for Science and Science Education,” Science and Education 20, no. 5–6 (May 2011): 453–71, doi:10.1007/s11191-010-9267-6.

Reprinted with permission by the author
Original article at: