April 18, 2003
"Scientists Say Human Genome Is Complete"
By NICHOLAS WADE
From The New York Times
April 14, 2003
The human genome is complete and the Human Genome Project is over, leaders of a public consortium of academic centers said today.
"We have before us the instruction set that carries each of us from the one-cell egg through adulthood to the grave," Dr. Robert Waterston, a leading genome sequencer, said at a news conference here at the National Institutes of Health.
Their announcement marked the end of a scientific venture that began in October 1990 and was expected to take 15 years.
Today's finishing date, two years ahead of schedule, was timed to coincide with the 50th anniversary of the discovery of the structure of DNA by Dr. James D. Watson and Dr. Francis Crick. Their article appeared in the April 25, 1953, issue of Nature.
Dr. Watson, who became the first director of the Human Genome Project at the institutes, was at a conference here today to celebrate the genome's completion. He had sought that goal, he said, realizing that a family member's illness would never be treatable "until we understand the human program for health and disease."
A "working draft" of the human genome sequence was announced with much fanfare three years ago in a White House ceremony. But at that stage the Human Genome Project had completed only 85 percent of the genome and its commercial rival, the Celera Corporation, using the project's data as well as its own, had attained somewhat more. The project's draft was not a thing of beauty. It consisted of thousands of short segments of DNA, whose order and orientation in the full genome were largely unknown.
Three years later, the international consortium of genome sequencing centers has now put all the fragments in order and closed most of the gaps, producing an extensive and highly accurate sequence of the 3.1 billion units of DNA of the human genome.
The data, perceived as the foundation of a new era of medicine, will be posted for free on genetic data banks. Celera, whose data are available by subscription, never intended to carry its draft genome to completion.
The working draft of three years ago contained most human genes and was useful for researchers seeking a specific gene. But up to a year ago biologists said they often had to do considerable extra sequencing work on the DNA regions they were interested in.
The completed genome announced today is far more accurate. It can be used out of the box, so to speak, without extra resequencing. The genes and other important elements of the genome are now almost all in their correct position, a vital requirement for researchers seeking to locate a gene that contributes to disease.
Scientists praised the Human Genome Project for its further three years of hard work and for producing a resource of enormous value to research. But several qualified their admiration by noting that even if the project is complete, the human genome is not. The parts of the genome still missing are of minor importance, but many biologists would like to see them sequenced before declaring the genome finished.
The human genome is packaged in 23 pairs of chromosomes, each a giant molecule of DNA. Though DNA's best-known role is to encode the information needed to build specific proteins, the working parts of the living cell, some of the DNA performs structural roles. This includes the DNA at the tips of each chromosome and at the center. The tip and center DNA, known as heterochromatic DNA, consists of monotonously repeated sequences whose exact order of units is so hard to determine that the consortium's leaders said from the outset they would not try to do so.
Within the rest of the DNA, known as euchromatic DNA, some regions are very hard to sequence for technical reasons. For example, they may contain DNA that is toxic to the bacteria used to amplify them. Foreseeing such difficult regions, the consortium said it would accept some gaps in the eventual sequence, provided their length was known.
When the working draft of the human genome was produced, consortium scientists called it the "Book of Life," with each chromosome a chapter. In the edition published today, small sections at the beginning, end and middle of each chapter are blank, along with some 400 assorted paragraphs whose text is missing, although the length of the missing passages is known.
The missing paragraphs amount to only 0.8 percent of the euchromatic DNA, which is 2.9 billion base pairs, or DNA units, in length. The total length of the genome, with heterochromatic DNA included, is 3.1 billion base pairs. Because most of the chromosomes have only just been completed — the laggards straggled in only last week — genome analysts have not yet had time to compute the exact number of human genes, put at around 30,000 in earlier estimates.
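For readers who want the figures above in absolute terms, the arithmetic can be sketched as follows. This is only a back-of-the-envelope calculation using the rounded numbers quoted in the article; the variable names are illustrative, and the average gap size assumes the "some 400" gaps mentioned earlier is taken as exactly 400.

```python
# Rounded figures quoted in the article.
EUCHROMATIC_BP = 2_900_000_000   # euchromatic DNA, in base pairs
TOTAL_BP = 3_100_000_000         # whole genome, heterochromatic DNA included
GAP_PER_MILLE = 8                # 0.8 percent, expressed as parts per thousand
APPROX_GAP_COUNT = 400           # "some 400" gaps; treated as exact here (assumption)

# Base pairs still missing from the euchromatic sequence.
gap_bp = EUCHROMATIC_BP * GAP_PER_MILLE // 1000

# Heterochromatic DNA is the difference between the two quoted lengths.
heterochromatic_bp = TOTAL_BP - EUCHROMATIC_BP

# Rough average length of one gap, under the 400-gap assumption.
average_gap_bp = gap_bp // APPROX_GAP_COUNT

print(f"Missing euchromatic DNA: {gap_bp:,} base pairs")
print(f"Heterochromatic DNA:     {heterochromatic_bp:,} base pairs")
print(f"Average gap (rough):     {average_gap_bp:,} base pairs")
```

On these rounded inputs the missing euchromatic sequence comes to roughly 23 million base pairs, and the unsequenced heterochromatic DNA to roughly 200 million.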
Dr. Francis Collins, director of the genome center at the National Institutes of Health, said the Human Genome Project had completed the task it set itself and was today dissolved. The era of large-scale DNA sequencing was now over, he said, although research projects would continue to develop technology to close remaining gaps. "If you are looking for a disease gene you can be confident that it exists in one continuous stretch of highly accurate sequence," he said of the genome data now available.
Dr. Huntington F. Willard, an expert on the X chromosome at Duke, said that the human genome sequence now available was "a momentous achievement" but that "we shouldn't declare the job 'complete' until it is." He said it was "critical that the complete human genome sequence be, well, complete, in the fullness of time."
Dr. Evan Eichler, a computational biologist at Case Western Reserve University who studies certain duplicated regions of the genome, said, "For the vast majority of users, this is in fact an operational completion." But, like Dr. Willard, he said work on the genome should continue until "every base is completely in place." The task might take 10 to 20 years, he said, and he expressed concern that the effort might not be sustained.
A prime beneficiary of the essentially completed genome is DeCode Genetics of Reykjavik, Iceland, which is screening the entire Icelandic population for disease-causing variant genes. Dr. Kari Stefansson, the president of the company, said the single base variants known as SNP's were now accurately assigned on the genome sequence 99 percent of the time, compared with 93 percent accuracy previously. The SNP's, which make one person's genome different from another's, are helpful in pinpointing errant genes provided that the position of the SNP's on the genome is known with accuracy.
Dr. Stefansson said the current version of the human genome was "absolutely wonderful to have" but that it was "silly" to claim it was completed.
Two laboratory organisms whose genomes were sequenced as pilot projects for the human genome, the C. elegans roundworm and the Drosophila fruit fly, are in a more complete state than the human genome. Every single base of the roundworm genome is known. Dr. Gerald M. Rubin of the Howard Hughes Medical Institute in Chevy Chase, Md., who oversees the fruit fly project, said that the human genome could not be called finished but that there has been "a tremendous increase in the value" of the sequence over the last two years. "The people who stayed in the trenches deserve a lot of credit, even though the glory may have been claimed by others," he said.
The principal contributors to the human genome sequence are the Sanger Institute near Cambridge in England, which has done 30 percent of the sequence, the Whitehead Institute in Cambridge, Mass., and the Genome Sequencing Center at Washington University in St. Louis. Other contributors include the Baylor College of Medicine, the Department of Energy's Joint Genome Institute, and centers in Japan, France, Germany and China.
Dr. Richard Wilson, director of the Washington University center, said the hardest chromosome had been the Y chromosome, which is small but has many highly repetitive sequences that are hard to tell apart. In Chromosome 7, the individual being sequenced possesses a gene not found in other people, Dr. Wilson said.
The sequence of each of the 24 human chromosomes was put together by a chromosome coordinator. Each coordinator's work was checked against independent data developed by Dr. David Jaffe and Dr. Eric Lander at the Whitehead Institute.
The Human Genome Project was originally projected to cost a total of $3 billion. Money spent by the National Institutes of Health and the Department of Energy since the beginning of the project has come to $2.7 billion, but that does not include spending by the Sanger Institute and other foreign collaborators.
The total spending of the Human Genome Project includes pilot projects like sequencing the roundworm and fruit fly genomes. No exact figure was given at today's press conference for sequencing the human genome specifically. But the Sanger Institute has spent 150 million pounds, or about $235 million, to sequence 30 percent of the genome, and on that basis would have required 500 million pounds, or $786 million at the current rate of exchange, to do all of it.
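The extrapolation in the paragraph above can be reproduced with a short calculation. This is a sketch of the arithmetic only, not an official costing: it assumes sequencing cost scales linearly with the share of the genome covered, and it derives the exchange rate from the article's own figures (500 million pounds quoted as $786 million) rather than from any outside source. The function and variable names are illustrative.

```python
# Figures quoted in the article, in millions.
SANGER_SPEND_GBP_M = 150.0   # Sanger Institute spending, pounds
SANGER_SHARE = 0.30          # fraction of the genome Sanger sequenced
QUOTED_FULL_GBP_M = 500.0    # article's estimate for the whole genome, pounds
QUOTED_FULL_USD_M = 786.0    # the same estimate in dollars


def extrapolate_full_cost(spend: float, share: float) -> float:
    """Scale a partial cost up to the whole genome, assuming linear scaling."""
    return spend / share


# Whole-genome cost in pounds on the linear-scaling assumption.
full_gbp_m = extrapolate_full_cost(SANGER_SPEND_GBP_M, SANGER_SHARE)

# Exchange rate implied by the article's pound and dollar figures.
implied_usd_per_gbp = QUOTED_FULL_USD_M / QUOTED_FULL_GBP_M

# Sanger's own spending converted at that implied rate.
sanger_usd_m = SANGER_SPEND_GBP_M * implied_usd_per_gbp

print(f"Whole genome: about {full_gbp_m:.0f} million pounds")
print(f"Implied rate: {implied_usd_per_gbp:.3f} dollars per pound")
print(f"Sanger spend: about ${sanger_usd_m:.1f} million")
```

At the implied rate of about 1.57 dollars to the pound, Sanger's 150 million pounds comes to just under $236 million, in line with the "about $235 million" quoted above.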
Though the Human Genome Project has been declared completed, the genome sequencing centers will not go out of business. They have switched to decoding the genomes of other species, and to exploring variations in the human genome.
Obtaining the sequence of the human genome is a first step. Biologists must now annotate it, or identify the regions of DNA that hold the genes and their control elements. Next come tasks like discovering the variations in DNA sequence that contribute to disease in different populations, defining the proteins produced by each gene, and understanding how the proteins in each cell interact in a circuitry that controls the operation of the genome.