Is 75% of the Human Genome Junk DNA?

BY FAZALE RANA – AUGUST 29, 2017

By the rude bridge that arched the flood,
Their flag to April’s breeze unfurled,
Here once the embattled farmers stood,
And fired the shot heard round the world.

–Ralph Waldo Emerson, Concord Hymn

Emerson referred to the Battles of Lexington and Concord, the first skirmishes of the Revolutionary War, as the “shot heard round the world.”

While not as loud as the gunfire that triggered the Revolutionary War, a recent article published in Genome Biology and Evolution by evolutionary biologist Dan Graur has garnered a lot of attention,1 serving as the latest salvo in the junk DNA wars—a conflict between genomics scientists and evolutionary biologists about the amount of functional DNA sequences in the human genome.

Clearly, this conflict has important scientific ramifications, as researchers strive to understand the human genome and seek to identify the genetic basis for diseases. The functional content of the human genome also has significant implications for creation-evolution skirmishes. If most of the human genome turns out to be junk after all, then the case for a Creator potentially suffers collateral damage.

According to Graur, no more than 25% of the human genome is functional—a much lower percentage than reported by the ENCODE Consortium. Released in September 2012, phase II results of the ENCODE project indicated that 80% of the human genome is functional, with the expectation that the percentage of functional DNA in the genome would rise toward 100% when phase III of the project reached completion.

If true, Graur’s claim would represent a serious blow to the validity of the ENCODE project conclusions and devastate the RTB human origins creation model. Intelligent design proponents and creationists (like me) have heralded the results of the ENCODE project as critical in our response to the junk DNA challenge.

Junk DNA and the Creation vs. Evolution Battle

Evolutionary biologists have long considered the presence of junk DNA in genomes as one of the most potent pieces of evidence for biological evolution. Skeptics ask, “Why would a Creator purposely introduce identical nonfunctional DNA sequences at the same locations in the genomes of different, though seemingly related, organisms?”

When the draft sequence was first published in 2000, researchers thought only around 2–5% of the human genome consisted of functional sequences, with the rest being junk. Numerous skeptics and evolutionary biologists claim that such a vast amount of junk DNA in the human genome is compelling evidence for evolution and the most potent challenge against intelligent design/creationism.

But these arguments evaporate in the wake of the ENCODE project. If valid, the ENCODE results would radically alter our view of the human genome. No longer could the human genome be regarded as a wasteland of junk; rather, the human genome would have to be recognized as an elegantly designed system that displays sophistication far beyond what most evolutionary biologists ever imagined.

ENCODE Skeptics

The findings of the ENCODE project have been criticized by some evolutionary biologists who have cited several technical problems with the study design and the interpretation of the results. (See articles listed under “Resources to Go Deeper” for a detailed description of these complaints and my responses.) But ultimately, their criticisms appear to be motivated by an overarching concern: if the ENCODE results stand, then it means key features of the evolutionary paradigm can’t be correct.

Calculating the Percentage of Functional DNA in the Human Genome

Graur (perhaps the foremost critic of the ENCODE project) has tried to discredit the ENCODE findings by demonstrating that they are incompatible with evolutionary theory. Toward this end, he has developed a mathematical model to calculate the percentage of functional DNA in the human genome based on mutational load—the amount of deleterious mutations harbored by the human genome.

Graur argues that junk DNA functions as a “sponge” absorbing deleterious mutations, thereby protecting functional regions of the genome. Considering this buffering effect, Graur wanted to know how much junk DNA must exist in the human genome to buffer against the loss of fitness—which would result from deleterious mutations in functional DNA—so that a constant population size can be maintained.

Historically, replacement-level fertility rates for human beings have been two to three children per couple. Based on Graur's modeling, maintaining a constant population size at this fertility rate requires 85–90% of the human genome to be junk DNA, available to absorb deleterious mutations; this caps the functional fraction of the genome at 25%.

Graur also calculated that, if 80% of the human genome were functional, a fertility rate of at least 15 children per couple would be required to maintain a constant population size. And if 100% of the human genome displayed function, the minimum replacement-level fertility rate would climb to 24 children per couple.
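To see the shape of this arithmetic, here is a toy back-of-the-envelope sketch in Python. It is not Graur's actual model; the parameter values (roughly 100 new mutations per genome per generation, with about 2.5% of mutations landing in functional DNA being deleterious) are illustrative choices that happen to reproduce the fertility figures cited above, using the classic mutation-load approximation that mean fitness falls as e^(-U) when each genome acquires U deleterious mutations per generation.

```python
import math

MUTATIONS_PER_GENERATION = 100   # illustrative: new mutations per genome per generation
DELETERIOUS_FRACTION = 0.025     # illustrative: share of functional-DNA mutations that are deleterious

def required_fertility(functional_fraction):
    """Minimum children per couple needed to hold population size constant,
    using the classic load approximation: mean fitness ~ exp(-U)."""
    u = MUTATIONS_PER_GENERATION * functional_fraction * DELETERIOUS_FRACTION
    return 2 * math.exp(u)  # each couple must replace itself (2) despite the fitness loss

for f in (0.16, 0.25, 0.80, 1.00):
    print(f"{f:.0%} functional -> ~{required_fertility(f):.1f} children per couple")
```

With these toy numbers, 80% functional yields roughly 15 children per couple and 100% roughly 24, matching the figures reported above, while a fertility ceiling of about 3 children per couple pushes the functional fraction down toward the 15–25% range.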

He argues that both conclusions are unreasonable. On this basis, therefore, he concludes that the ENCODE results cannot be correct.

Response to Graur

So, has Graur’s work invalidated the ENCODE project results? Hardly. Here are four reasons why I’m skeptical.

1. Graur’s estimate of the functional content of the human genome is based on mathematical modeling, not experimental results.

An adage I heard repeatedly in graduate school applies: “Theories guide, experiments decide.” Though the ENCODE project results don’t make theoretical sense in light of the evolutionary paradigm, that is no reason to consider them invalid. A growing number of studies provide independent experimental validation of the ENCODE conclusions.

To question experimental results because they don’t align with a theory’s predictions is a “Bizarro World” approach to science. Experimental results and observations determine a theory’s validity, not the other way around. Yet when it comes to the ENCODE project, its conclusions seem to be weighed based on their conformity to evolutionary theory. Simply put, ENCODE skeptics are doing science backwards.

While Graur and other evolutionary biologists argue that the ENCODE results don’t make sense from an evolutionary standpoint, I would argue as a biochemist that the high percentage of functional regions in the human genome makes perfect sense. The ENCODE project determined that a significant fraction of the human genome is transcribed. They also measured high levels of protein binding.

ENCODE skeptics argue that this biochemical activity is merely biochemical noise. But this assertion does not make sense because (1) biochemical noise costs energy and (2) random interactions between proteins and the genome would be harmful to the organism.

Transcription is an energy- and resource-intensive process. To believe that most transcripts are merely biochemical noise would be untenable. Such a view ignores cellular energetics. Transcribing a large percentage of the genome when most of the transcripts serve no useful function would routinely waste a significant amount of the organism’s energy and material stores. If such an inefficient practice existed, surely natural selection would eliminate it and streamline transcription to produce transcripts that contribute to the organism’s fitness.

Apart from energetics considerations, this argument ignores the fact that random protein binding would make a dire mess of genome operations. Without minimizing these disruptive interactions, biochemical processes in the cell would grind to a halt. It is reasonable to think that the same considerations would apply to transcription factor binding with DNA.

2. Graur’s model employs some questionable assumptions.

Graur uses an unrealistically high rate for deleterious mutations in his calculations.

Graur determined the deleterious mutation rate using protein-coding genes. These DNA sequences are highly sensitive to mutations. In contrast, other functional regions of the genome—such as those that (1) dictate the three-dimensional structure of chromosomes, (2) serve as transcription-factor binding sites, and (3) act as histone binding sites—are much more tolerant of mutations. Ignoring these sequences in the modeling work artificially inflates the amount of junk DNA required to maintain a constant population size.

3. The way Graur determines if DNA sequence elements are functional is questionable. 

Graur uses the selected-effect definition of function. According to this definition, a DNA sequence is only functional if it is undergoing negative selection. In other words, sequences in genomes can be deemed functional only if they evolved under evolutionary processes to perform a particular function. Once evolved, these sequences, if they are functional, will resist evolutionary change (due to natural selection) because any alteration would compromise the function of the sequence and endanger the organism. If deleterious, the sequence variations would be eliminated from the population due to the reduced survivability and reproductive success of organisms possessing those variants. Hence, functional sequences are those under the effects of selection.

In contrast, the ENCODE project employed a causal definition of function. Accordingly, function is ascribed to sequences that play some observationally or experimentally determined role in genome structure and/or function.

The ENCODE project focused on experimentally determining which sequences in the human genome displayed biochemical activity using assays that measured

  • transcription,
  • binding of transcription factors to DNA,
  • histone binding to DNA,
  • DNA binding by modified histones,
  • DNA methylation, and
  • three-dimensional interactions between enhancer sequences and genes.

In other words, if a sequence is involved in any of these processes—all of which play well-established roles in gene regulation—then that sequence must have functional utility. That is, if sequence Q performs function G, then sequence Q is functional.

So why does Graur insist on a selected-effect definition of function? For no other reason than a causal definition ignores the evolutionary framework when determining function. He insists that function be defined exclusively within the context of the evolutionary paradigm. In other words, his preference for defining function has more to do with philosophical concerns than scientific ones—and with a deep-seated commitment to the evolutionary paradigm.

As a biochemist, I am troubled by the selected-effect definition of function because it is theory-dependent. In science, cause-and-effect relationships (which include biological and biochemical function) need to be established experimentally and observationally, independent of any particular theory. Once these relationships are determined, they can then be used to evaluate the theories at hand. Do the theories predict (or at least accommodate) the established cause-and-effect relationships, or not?

Using a theory-dependent approach poses the very real danger that experimentally determined cause-and-effect relationships (or, in this case, biological functions) will be discarded if they don’t fit the theory. And, again, it should be the other way around. A theory should be discarded, or at least reevaluated, if its predictions don’t match these relationships.

What difference does it make which definition of function Graur uses in his model? A big difference. The selected-effect definition is more restrictive than the causal-role definition. This restrictiveness translates into overlooked function and increases the replacement level fertility rate.

4. Buffering against deleterious mutations is a function.

As part of his model, Graur argues that junk DNA is necessary in the human genome to buffer against deleterious mutations. By adopting this view, Graur has inadvertently identified function for junk DNA. In fact, he is not the first to argue along these lines. Biologist Claudiu Bandea has posited that high levels of junk DNA can make genomes resistant to the deleterious effects of transposon insertion events in the genome. If insertion events are random, then the offending DNA is much more likely to insert itself into “junk DNA” regions instead of coding and regulatory sequences, thus protecting information-harboring regions of the genome.
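Bandea's buffering argument is, at bottom, a simple probability claim: if insertions land uniformly at random along the genome, the chance of any one insertion striking a functional site equals the functional fraction of the genome. A short simulation (my own illustration, not from either paper; the 25% functional fraction is an assumed value) makes the point:

```python
import random

random.seed(42)

GENOME_SIZE = 3_000_000_000      # ~3 Gb, roughly the human genome
FUNCTIONAL_FRACTION = 0.25       # assumption: treat the first 25% of positions as functional
N_INSERTIONS = 100_000

# Count how many uniformly random insertion sites fall in the functional region.
hits = sum(
    1 for _ in range(N_INSERTIONS)
    if random.randrange(GENOME_SIZE) < FUNCTIONAL_FRACTION * GENOME_SIZE
)
print(f"{hits / N_INSERTIONS:.1%} of random insertions struck functional DNA")
```

As expected, about a quarter of the insertions hit functional DNA: the remaining three-quarters are absorbed by the "junk" regions, which is exactly the sponge effect Bandea describes.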

If the last decade of work in genomics has taught us anything, it is this: we are in our infancy when it comes to understanding the human genome. The more we learn about this amazingly complex biochemical system, the more elegant and sophisticated it becomes. Through this process of discovery, we continue to identify functional regions of the genome—DNA sequences long thought to be “junk.”

In short, the criticisms of the ENCODE project reflect a deep-seated commitment to the evolutionary paradigm and, bluntly, are at war with the experimental facts.

Bottom line: if the ENCODE results stand, it means that key aspects of the evolutionary paradigm can’t be correct.

Resources to Go Deeper

Endnotes

  1. Dan Graur, “An Upper Limit on the Functional Fraction of the Human Genome,” Genome Biology and Evolution 9 (July 2017): 1880–85, doi:10.1093/gbe/evx121.

DNA Replication Winds Up the Case for Intelligent Design

BY FAZALE RANA – AUGUST 8, 2017

One of my classmates and friends in high school was a kid we nicknamed “Radar.” He was a cool kid who had special needs. He was mentally challenged. He was also funny and as good-hearted as they come, never causing any real problems—other than playing hooky from school, for days on end. Radar hated going to school.

When he eventually showed up, he would be sent to the principal’s office to explain his unexcused absences to Mr. Reynolds. And each time, Radar would offer the same excuse: his grandmother died. But Mr. Reynolds didn’t buy it—for obvious reasons. It didn’t require much investigation on the principal’s part to know that Radar was lying.

Skeptics have something in common with my friend Radar. They use the same tired excuse when presented with compelling evidence for design from biochemistry. Inevitably, they dismiss the case for a Creator by pointing out all the “flawed” designs in biochemical systems. But this excuse never sticks. Upon further investigation, claimed instances of bad designs turn out to be elegant, in virtually every instance, as recent work by scientists from UC Davis illustrates.

These researchers accomplished an important scientific milestone by using single molecule techniques to observe the replication of a single molecule of DNA.1 Their unexpected insights have bearing on how we understand this key biochemical operation. The work also has important implications for the case for biochemical design.

For those familiar with DNA’s structure and replication process, you can skip the next two sections. But for those of you who are not, a little background information is necessary to appreciate the research team’s findings and their relevance to the creation-evolution debate.

DNA’s Structure

DNA consists of two molecular chains (called “polynucleotides”) aligned in an antiparallel fashion. (The two strands run parallel to one another, with the starting point of one strand of the polynucleotide duplex located next to the ending point of the other strand, and vice versa.) The paired molecular chains twist around each other, forming the well-known DNA double helix. The cell’s machinery generates the polynucleotide chains using four different nucleotides: adenosine, guanosine, cytidine, and thymidine, abbreviated as A, G, C, and T, respectively.

A special relationship exists between the nucleotide sequences of the two DNA strands. Biochemists say the DNA sequences of the two strands are complementary. When the DNA strands align, the adenine (A) side chains of one strand always pair with thymine (T) side chains from the other strand. Likewise, the guanine (G) side chains from one DNA strand always pair with cytosine (C) side chains from the other strand. Biochemists refer to these relationships as “base-pairing rules.” Consequently, if biochemists know the sequence of one DNA strand, they can readily determine the sequence of the other strand. Base-pairing plays a critical role in DNA replication.
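The base-pairing rules are simple enough to state as code. Here is a minimal sketch (my own, not from the article) that derives one strand's sequence from the other:

```python
# Watson-Crick base-pairing rules: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(sequence: str) -> str:
    """Return the complementary sequence for one DNA strand.
    The result is reversed because the two strands are antiparallel."""
    return "".join(PAIRING[base] for base in reversed(sequence))

print(complementary_strand("ATGC"))  # -> GCAT
```

Note that applying the function twice returns the original sequence, which is just the base-pairing rules read in the other direction.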

 

Image 1: DNA’s Structure

DNA Replication

Biochemists refer to DNA replication as a “template-directed, semiconservative process.” By “template-directed,” biochemists mean that the nucleotide sequences of the “parent” DNA molecule function as a template, directing the assembly of the DNA strands of the two “daughter” molecules using the base-pairing rules. By “semiconservative,” biochemists mean that after replication, each daughter DNA molecule contains one newly formed DNA strand and one strand from the parent molecule.

 

Image 2: Semiconservative DNA Replication

Conceptually, template-directed, semiconservative DNA replication entails the separation of the parent DNA double helix into two single strands. By using the base-pairing rules, each strand serves as a template for the cell’s machinery to use when it forms a new DNA strand with a nucleotide sequence complementary to the parent strand. Because each strand of the parent DNA molecule directs the production of a new DNA strand, two daughter molecules result. Each one possesses an original strand from the parent molecule and a newly formed DNA strand produced by a template-directed synthetic process.
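The logic of that paragraph can be sketched directly: each parent strand templates a new complementary strand, so each daughter duplex keeps exactly one parental strand. This is a toy model of my own, ignoring origins, forks, enzymes, and strand orientation entirely:

```python
# Watson-Crick base-pairing rules: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def new_strand(template: str) -> str:
    """Template-directed synthesis: apply the base-pairing rules
    position by position (antiparallel alignment simplified away)."""
    return "".join(PAIRING[base] for base in template)

def replicate(duplex):
    """Semiconservative replication: the parent duplex (strand1, strand2)
    yields two daughters, each holding one parental strand."""
    strand1, strand2 = duplex
    return (strand1, new_strand(strand1)), (strand2, new_strand(strand2))

parent = ("ATGC", "TACG")  # a complementary pair
daughter1, daughter2 = replicate(parent)
print(daughter1, daughter2)
```

Each daughter duplex contains one strand inherited from the parent and one newly synthesized strand, which is precisely what "semiconservative" means.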

DNA replication begins at specific sites along the DNA double helix, called “replication origins.” Typically, prokaryotic cells have only a single origin of replication. More complex eukaryotic cells have multiple origins of replication.

The DNA double helix unwinds locally at the origin of replication to produce what biochemists call a “replication bubble.” During the course of replication, the bubble expands in both directions from the origin. Once the individual strands of the DNA double helix unwind and are exposed within the replication bubble, they are available to direct the production of the daughter strand. The site where the DNA double helix continuously unwinds is called the “replication fork.” Because DNA replication proceeds in both directions away from the origin, there are two replication forks within each bubble.

 

Image 3: DNA Replication Bubble

DNA replication can only proceed in a single direction, from the top of the DNA strand to the bottom. Because the strands that form the DNA double helix align in an antiparallel fashion with the top of one strand juxtaposed with the bottom of the other strand, only one strand at each replication fork has the proper orientation (bottom-to-top) to direct the assembly of a new strand, in the top-to-bottom direction. For this strand—referred to as the “leading strand”—DNA replication proceeds rapidly and continuously in the direction of the advancing replication fork.

DNA replication cannot proceed along the strand with the top-to-bottom orientation until the replication bubble has expanded enough to expose a sizable stretch of DNA. When this happens, DNA replication moves away from the advancing replication fork. DNA replication can only proceed a short distance for the top-to-bottom-oriented strand before the replication process has to stop and wait for more of the parent DNA strand to be exposed. When a sufficient length of the parent DNA template is exposed a second time, DNA replication can proceed again, but only briefly before it has to stop again and wait for more DNA to be exposed. The process of discontinuous DNA replication takes place repeatedly until the entire strand is replicated. Each time DNA replication starts and stops, a small fragment of DNA is produced.

Biochemists refer to these pieces of DNA (that will eventually compose the daughter strand) as “Okazaki fragments”—after the biochemist who discovered them. Biochemists call the strand produced discontinuously the “lagging strand” because DNA replication for this strand lags behind the more rapidly produced leading strand. One additional point: the leading strand at one replication fork is the lagging strand at the other replication fork since the replication forks at the two ends of the replication bubble advance in opposite directions.

An ensemble of proteins is needed to carry out DNA replication. Once the origin recognition complex (which consists of several different proteins) identifies the replication origin, a protein called “helicase” unwinds the DNA double helix to form the replication fork.

 

Image 4: DNA Replication Proteins

Once the replication fork is established and stabilized, DNA replication can begin. Before the newly formed daughter strands can be produced, a small RNA primer must be produced. The protein that synthesizes new DNA by reading the parent DNA template strand—DNA polymerase—can’t start production from scratch. It must be primed. A massive protein complex, called the “primosome,” which consists of over 15 different proteins, produces the RNA primer needed by DNA polymerase.

Once primed, DNA polymerase will continuously produce DNA along the leading strand. However, for the lagging strand, DNA polymerase can only generate DNA in spurts to produce Okazaki fragments. Each time DNA polymerase generates an Okazaki fragment, the primosome complex must produce a new RNA primer.

Once DNA replication is completed, the RNA primers are removed from the continuous DNA of the leading strand and from the Okazaki fragments that make up the lagging strand. A protein with 5’-to-3’ exonuclease activity removes the RNA primers. A different DNA polymerase fills in the gaps created by the removal of the RNA primers. Finally, a protein called a “ligase” connects all the Okazaki fragments together, forming a continuous piece of DNA out of the lagging strand.

Are Leading and Lagging Strand Polymerases Coordinated?

Biochemists had long assumed that the activities of the leading and lagging strand DNA polymerase enzymes were coordinated. If not, then DNA replication of one strand would get too far ahead of the other, increasing the likelihood of mutations.

As it turns out, the research team from UC Davis discovered that the activities of the two polymerases are not coordinated. Instead, the leading and lagging strand DNA polymerase enzymes replicate DNA autonomously. To the researchers’ surprise, they learned that the leading strand DNA polymerase replicated DNA in bursts, suddenly stopping and starting. And when it did replicate DNA, the rate of production varied by a factor of ten. On the other hand, the researchers discovered that the rate of DNA replication on the lagging strand depended on the rate of RNA primer formation.

The researchers point out that if not for single molecule techniques—in which replication is characterized for individual DNA molecules—the autonomous behavior of leading and lagging strand DNA polymerases would not have been detected. Up to this point, biochemists have studied the replication process using a relatively large number of DNA molecules. These samples yield average replication rates for leading and lagging strand replication, giving the sense that replication of both strands is coordinated.

According to the researchers, this discovery is a “real paradigm shift, and undermines a great deal of what’s in the textbooks.”2 Because the DNA polymerase activity is not coordinated but autonomous, they conclude that the DNA replication process is a flawed design, driven by stochastic (random) events. The lack of coordination between the leading and lagging strands also means that leading-strand replication can run ahead of the lagging strand, yielding long stretches of vulnerable single-stranded DNA.

Diminished Design or Displaced Design?

Even though this latest insight appears to undermine the elegance of the DNA replication process, other observations made by the UC Davis research team indicate that the evidence for design isn’t diminished, just displaced.

These investigators discovered that the activity of helicase—the enzyme that unwinds the double helix at the replication fork—somehow senses the activity of the DNA polymerase on the leading strand. When the DNA polymerase stalls, the activity of the helicase slows down by a factor of five until the DNA polymerase catches up. The researchers believe that another protein (called the “tau protein”) mediates the interaction between the helicase and DNA polymerase molecules. In other words, the interaction between DNA polymerase and the helicase compensates for the stochastic behavior of the leading strand polymerase, pointing to a well-designed process.

As already noted, the research team also learned that the rate of lagging strand replication depends on primer production. They determined that the rate of primer production exceeds the rate of DNA replication on the leading strand. This fortuitous coincidence ensures that as soon as enough of the bubble opens for lagging strand replication to continue, the primase can immediately lay down the RNA primer, restarting the process. It turns out that the rate of primer production is controlled by the primosome concentration in the cell, with primer production increasing as the number of primosome copies increases. The primosome concentration appears to be fine-tuned. If the concentration of this protein complex is too large, the replication process becomes “gummed up”; if too small, the disparity between leading and lagging strand replication becomes too great, exposing single-stranded DNA. Again, the fine-tuning of primosome concentration highlights the design of this cellular operation.

It is remarkable how two people can see things so differently. For scientists influenced by the evolutionary paradigm, the tendency is to dismiss evidence for design and, instead of seeing elegance, become conditioned to see flaws. Though DNA replication takes place in a haphazard manner, other features of the replication process appear to be engineered to compensate for the stochastic behavior of the DNA polymerases and, in the process, elevate the evidence for design.

And, that’s no lie.

Resources

Endnotes

  1. James E. Graham et al., “Independent and Stochastic Action of DNA Polymerases in the Replisome,” Cell 169 (June 2017): 1201–13, doi:10.1016/j.cell.2017.05.041.
  2. Bec Crew, “DNA Replication Has Been Filmed for the First Time, and It’s Not What We Expected,” ScienceAlert, June 19, 2017, https://sciencealert.com/dna-replication-has-been-filmed-for-the-first-time-and-it-s-stranger-than-we-thought.
Reprinted with permission by the author
Original article at:
https://www.reasons.org/explore/blogs/the-cells-design/read/the-cells-design/2017/08/08/dna-replication-winds-up-the-case-for-intelligent-design