Annals of Surgical Oncology 10:7-14 (2003)
© 2003 Society of Surgical Oncology


EDUCATIONAL REVIEW

The Future of Cancer Management: Translating the Genome, Transcriptome, and Proteome

Timothy J. Yeatman, MD, FACS

From the Departments of Surgery and Interdisciplinary Oncology, H. Lee Moffitt Cancer Center at the University of South Florida, Tampa, Florida.

Correspondence: Address correspondence and reprint requests to: Timothy J. Yeatman, MD, FACS, Departments of Surgery and Interdisciplinary Oncology, H. Lee Moffitt Cancer Center, 12902 Magnolia Dr., Tampa, FL 33612; Fax: 813-979-3893; E-mail: yeatman{at}moffitt.usf.edu

ABSTRACT

Abstract: Predicting who will develop cancer and how the cancer will behave and respond to therapy after diagnosis are some of the potential benefits of the ongoing genetic revolution that can be envisioned within the next decade. Translational applications of genomic-based research efforts may actually precede the development of effective therapeutic agents that can exploit the vast amounts of data derived from these efforts. In the future, understanding the wealth of information generated by high-throughput molecular efforts and how it can be applied to clinical problems will likely be critical to the surgeon who guides the multidisciplinary care of the cancer patient. This review will discuss the advances in our understanding of the human genome (DNA), its derived transcriptome (RNA), and its translated proteome (proteins) and will focus on the translation of this information into routine clinical practice. In particular, we will focus on the potential for clinical application of microarray-based gene-expression profiling to the diagnosis, prognosis, and therapy of malignancies.

Key Words: Genome • Transcriptome • Proteome • Microarray

We have entered a new world. There are new ideas, new terms and definitions, and many new genes. Whereas human genetics has a long history marked by events with historical significance, sequencing the genome is likely just the beginning of another renaissance. Multiple Nobel prizes have already been awarded in the discipline of genetics, including awards for the discovery of restriction enzymes, DNA-sequencing techniques, the polymerase chain reaction, and the structure of DNA itself. Only 16 years ago, there was significant debate about the feasibility of sequencing the human genome.1 Recent articles have now put the controversy to rest by using two different approaches to develop draft sequences of the human genome. The international Human Genome Project2 used the hierarchical shotgun approach, whereas Celera Genomics adopted the whole-genome shotgun3 approach. The end result of these two efforts is a first draft of the entire human genome sequence. Clinical researchers, practicing physicians, patients, and the general public now have access to the 2.9 billion nucleotide codes of the human genome that are available as a resource for scientific investigation (Web sites: http://www.ensembl.org and http://public.celera.com/cds/login.cfm). The human genome sequence provides a record of who we are and how we have evolved. It holds promise in the understanding of all inherited traits. It is the key to understanding human disease and its predispositions. It is the blueprint for our destiny.

Structure of the Genome

Surprisingly, it has been reported that only approximately 1% of the human genome is actually composed of exons that code for protein structure. Twenty-five percent of the genome represents intronic sequences or regions of DNA between exons that are spliced out; 75% of the genome is composed of intergenic DNA for which we have no significant understanding of its function or role in the process of gene transcription into RNA or translation into protein. Despite the large phenotypic differences between individuals, the similarity in the structure of human DNA from one individual to another is striking. It has been estimated that we share at least 99.9% of our nucleotide code, suggesting that any two individuals differ by only 0.1%.3 Because the genome has evolved with time, it can be considered a historical record. This concept is reinforced by the large degree of homology between species, allowing us to trace our origins backward in time throughout the evolutionary process. In this regard, the genome carries significant information as to how our ancestors evolved and were affected by their environment through natural selective processes. For example, it is believed that proto-oncogenes, such as c-Src, were incorporated into our genome long ago via retroviral infections to assume roles in cellular growth and replication. Now we know that these proto-oncogenes may be activated through mutational events to perform the abnormal functions that permit cellular transformation (i.e., oncogenesis). Throughout the genome, there are numerous nonrandom repeat elements that have been identified. These include long interspersed repetitive elements and short interspersed repetitive elements or Alu sequences that have no known function to date. They do, however, comprise up to 10% of the human genome and colocalize with the gene-rich regions. The genome also has a great amount of apparent duplication for the presumed purpose of generating related gene superfamilies.
This gene duplication, occurring at a rate of 10- to 100-fold greater than the fruit fly or worm, may represent a distinguishing structural event that separates humans from lower species.

Annotation of the Genome

With the recent announcement of the completion of the first draft of the human genome sequence, to be followed soon by the submission of a final draft, we are now faced with the challenging task of applying this vast wealth of information to the study of human disease. In this regard, sequencing the human genome is just one of a large number of tasks that will have to be completed before real progress is realized. For example, although the sequence is nearly complete, the genome is not yet completely annotated.4 Annotation involves the prediction of where a gene’s structure begins and ends and, in essence, is the source of our predictions of the number of individual genes in our genome. Annotation provides a predicted map of the genes that will ultimately require careful validation. One approach to validation involves the prediction of the existence of a gene on the basis of evidence derived from multiple databases from multiple species. Identifying numerous orthologs for genes across several species may validate the existence of a particular human gene. It is interesting that relatively few human genes (roughly 30,000–50,000) have been predicted on the basis of sequence analyses of the human genome. This is a surprise when 14,000 genes in the fruit fly,5 19,000 genes in the roundworm,6 and 26,000 genes in the mustard plant7 have been identified. Homology analyses suggest a strong conservation of genetic information throughout evolution. Identification of homologs and orthologs will ultimately assist in the discovery of the function of individual genes. Comparative genomics offers the promise of understanding the function of genes through evaluation of homologous genes in lower species. There are now software tools available to the public that are capable of performing homology searches that permit the comparison of genes across numerous species. One such site is provided by The Institute for Genomic Research (http://www.tigr.org). 
This site is also quite useful for researchers performing gene-expression profiling experiments.

Integration of the Human Genome

Major events, such as landing on the moon, spawned significant technological breakthroughs that had effects decades after the initial event. These effects were widespread, sparing few, if any, scientific disciplines. The completion of the Human Genome Project will likely herald a similar explosion of events in this age of information. Although it is impossible to predict all of the fallout from the completion of the genome, there are emerging concepts that suggest great potential for translational applications of this large body of information. One example is the development of large databases to assess the variation that exists in our DNA code. We have long known that DNA polymorphism exists but are now just beginning to catalog what are likely to be millions of alterations in genetic sequence between any two individuals. These alterations are termed single nucleotide polymorphisms (SNPs) and represent the single base pair substitution of one nucleotide for another in a DNA strand. SNPs are thought to be distinct from disease-causing mutations because they cause no discernible phenotype. They can, however, affect gene function and may predispose to disease even without altering protein-coding structure. In fact, <1% of SNPs actually occur within exons. These alterations can occur on average in 1 out of 1250 nucleotides and may affect the regions that control the expression of genes, which in turn can affect their function. The promise of SNPs in clinical medicine relates to their potential capacity to predict our predisposition to disease and our response to therapy. A new field of study, termed pharmacogenomics, has been developed to determine which patients will experience toxicity when exposed to a drug versus those who are seemingly unaffected by therapy. 
SNPs may play a key role in making these critical clinical predictions.8–10 Another example of how knowledge of the genome sequence may become useful relates to the promoter regions of genes that control their transcription. There is an emerging body of data suggesting that the CpG islands associated with promoter regions are susceptible to hypermethylation, which seems to be associated with the suppression of gene transcription.11,12 These alterations in the DNA are considered epigenetic alterations because the actual genetic code is not altered. Recent advances have enabled high-throughput assessment of hypermethylated CpG islands across thousands of genes, and these data may provide considerable insight as to the future biological behavior of a tumor.11–13
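The figures quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming the 2.9 billion nucleotide genome length, the 1-in-1250 average SNP spacing, and the 99.9% identity quoted in this review; the function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the SNP figures quoted in the text.
GENOME_LENGTH = 2_900_000_000  # nucleotides (figure quoted in the text)
SNP_SPACING = 1250             # one SNP per 1250 nucleotides on average
IDENTITY = 0.999               # fraction of nucleotide code shared by two people


def expected_snps(genome_length, spacing):
    """Expected number of SNPs if one occurs every `spacing` nucleotides."""
    return genome_length // spacing


def differing_bases(genome_length, identity):
    """Number of positions that differ given a fractional sequence identity."""
    return round(genome_length * (1 - identity))


snps = expected_snps(GENOME_LENGTH, SNP_SPACING)
diffs = differing_bases(GENOME_LENGTH, IDENTITY)
# "<1% of SNPs occur within exons" gives an upper bound on exonic SNPs.
exonic_upper_bound = snps // 100
```

Both estimates land in the low millions (2.32 million and 2.9 million), which is consistent with the statement that any two individuals are likely to differ by millions of sequence alterations.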

THE HUMAN TRANSCRIPTOME

The Challenge of Interrogating Gene Expression
Although the prediction of a gene on the basis of annotation criteria may strongly suggest the existence of a particular gene, this does not guarantee the expression of a functional messenger RNA transcript. Characterizing the population of transcribed genes has led to the creation of a new term, the transcriptome.14 This concept attempts to define the large number of transcripts that can result from both unspliced and spliced gene products (resulting from skipped exons). Interestingly, whereas recent estimates suggest upwards of 50,000 genes in the human genome, many more transcripts (and, potentially, protein products of those transcripts) may exist. Thus, the complexity of the human over other species may be derived in part from differences in RNA splicing. The transcriptome, therefore, represents the universe of RNA messages that may code for proteins when properly instructed to do so. Because the technology has recently been developed to interrogate the diversity of the transcriptome, gene-expression profiling has become a mainstay of modern molecular biologic research.

Gene-expression analysis was tedious and possible only on a gene-by-gene basis until recently, when large-scale gene-profiling technologies were developed. Before these technological advances, differential gene-expression analysis was relegated to Northern blots, reverse-transcriptase polymerase chain reaction, differential display, and subtractive hybridization. Currently, expression profiling is accomplished by two or more different platforms that include primarily complementary DNA (cDNA) spotted arrays and oligonucleotide-based arrays (Fig. 1, A and B).15 These arrays can be home-grown or may be commercially available. Oligonucleotide arrays generally use small DNA sequences from 20 to 70 mers in length to recognize and distinguish individual genes. Spotted cDNA arrays are constructed by spotting down thousands of longer portions of DNA, each representing an individual gene. The oligonucleotide arrays have the potential benefit of identifying splice products, whereas the cDNA arrays generally cannot. Both platforms have shown promise in producing large quantities of reproducible data that can subsequently be mined by a number of techniques. Typically, RNA is derived from a tumor and a control specimen or panel of cell lines and is converted to cDNA that is fluorescently labeled and hybridized to the target chip (Fig. 2). On a typical high-density array, there may be 12,000 to 60,000 genes represented, most of which are unnamed genes with no known functions. Expression profiling is a very data-rich process that produces large lists of named genes that may have direct or indirect causal links to the perturbation examined; however, many of the alterations may simply be casual rather than causal associations. Furthermore, gene-expression profiling can produce large lists of unnamed genes often designated by gene clone numbers or expressed sequence tag numbers that have no functional meaning or significance to the casual observer.



FIG. 1. Gene-expression profiling is performed on two principal platforms by using oligonucleotides or complementary DNAs (cDNAs) to interrogate RNA derived from tumor and normal samples. (A) The Affymetrix GeneChip™ (Santa Clara, CA) is shown on the left panel, with a high-power view of the chip’s composition on the right panel. The U133 GeneChip™ is composed of multiple (n = 11) pairs of oligonucleotides (short pieces of DNA) that may recognize roughly 12,000 specific genes. The top row of oligonucleotides is designed to match various portions of the gene precisely, whereas the bottom row of mismatched oligonucleotides (at a single base pair) is designed as the negative control. Gene expression is determined by an algorithm that incorporates the differential hybridization intensity of messenger RNA to the top row of oligonucleotides versus the bottom row of oligonucleotides. (B) Spotted cDNA arrays are derived by applying small amounts of cDNAs (each representing an individual gene) to a glass microscope slide. Shown in this figure is a 32,000-element high-density gene discovery array (left panel), with a magnified view of one of the subsections of the chip showing the detail of the individual cDNA spots.
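The idea of scoring a gene from its matched and mismatched probe rows can be illustrated with a short sketch. The exact algorithm used by the vendor is not described in this review, so the code below assumes the simplest possible summary, the mean of the perfect-match minus mismatch intensity differences across the 11 probe pairs; the function name and the intensity values are hypothetical.

```python
from statistics import mean


def average_difference(perfect_match, mismatch):
    """Mean (PM - MM) intensity across the probe pairs for one gene.

    Assumption: a plain average of differences stands in for the
    platform's actual expression algorithm, which the text does not specify.
    """
    if not perfect_match or len(perfect_match) != len(mismatch):
        raise ValueError("need equal-length, non-empty probe lists")
    return mean(pm - mm for pm, mm in zip(perfect_match, mismatch))


# Hypothetical intensities for the 11 probe pairs of a single gene.
pm = [1200, 1100, 1300, 900, 1500, 1000, 1250, 1150, 1050, 1400, 1350]
mm = [200, 300, 250, 150, 400, 100, 350, 300, 250, 200, 300]
signal = average_difference(pm, mm)
```

A gene whose message hybridizes specifically gives a large positive value, whereas nonspecific binding raises both rows together and leaves the difference near zero.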

 


FIG. 2. The process of microarray hybridization. The basic process of hybridization underlies all chip-based microarray technology. RNA is derived from tumors and a control tissue (often normal tissue or cell lines). RNA is then converted to complementary DNA (cDNA), labeled, and hybridized to the microarray chip. The expression of individual genes is then interpreted on the basis of the intensity by which the cDNA hybridizes to the individual genes represented on the chip by oligonucleotides or cDNAs. RT, reverse transcription; mRNA, messenger RNA.
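One common way to express the comparison of tumor with control hybridization intensity is a log2 ratio, sketched below. This is an illustrative convention rather than a method stated in this review; the function name, the intensity floor, and the example values are assumptions.

```python
import math


def log2_ratio(tumor_intensity, control_intensity, floor=1.0):
    """Log2 of tumor over control intensity for one gene.

    Assumption: intensities below `floor` are raised to `floor` so that
    background-level spots do not produce division by zero.
    """
    tumor = max(tumor_intensity, floor)
    control = max(control_intensity, floor)
    return math.log2(tumor / control)


up = log2_ratio(800, 200)    # fourfold higher in tumor
down = log2_ratio(100, 400)  # fourfold lower in tumor
```

On this scale, over- and underexpression are symmetric around zero: a fourfold increase scores +2 and a fourfold decrease scores -2.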

 
The challenge to understand the function of all human genes is a daunting task at best and likely represents the challenge of the next millennium. To date, nearly 15,000 named human genes exist, suggesting that we are well on our way to cataloging the genes and their functions. However, the real truth is that, despite an attached name, the actual function of a named gene may be far from pinned down.4 For example, we can examine any one of a large number of named genes, such as a phosphatase, and understand in part its normal function but not understand any of its molecular partners or pathways in which it is active. Moreover, although we may have an idea of the function of a gene in its wild-type form, we may have no clue as to the effect of the mutational events affecting its structure. Notable examples of the significant effects of mutation on gene function include APC, RAS, and P53, all of which take on significantly different functions when mutated.

To interrogate and translate these complex data requires a new set of analysis tools with which many clinicians are not familiar.15 The generation of large amounts of data by gene-expression profiling requires extensive computational analysis. More importantly, it requires the development of new mathematical algorithms and software to handle the data. New databases are required to hold the data adjacent to invaluable clinical information that will need to be queried and updated on a frequent, almost real-time, basis. These requirements have led to the development of at least three new scientific disciplines: computational biology, bioinformatics, and biostatistics.

There has been somewhat of a paradigm shift in the evaluation of basic science proposals. What would have been considered a massive "fishing expedition" 5 years ago is now considered scientifically valuable and has been termed discovery science. Discovery science has gained a respectable reputation, not only for the vast amounts of potentially invaluable data that it produces, but also for the hypotheses that it generates.

A Call for Interdisciplinary Science: Systems Biology
The surgeon’s role in the development of functional genomics efforts is paramount. The surgeon provides a link between the patient and the science through his or her capacity to extirpate information-rich normal and neoplastic tissues. The practicing surgeon, with a daily exposure to the battlefields of cancer, can play a significant role in focusing science on the clinically relevant problems. In this regard, the skilled clinician can provide a wealth of intuition to science that may lead it to a successful outcome. The translation of clinical acumen to science, however, cannot occur unless the clinician is well versed in the doctrines of science. This is true for a number of reasons. For example, the surgeon without a minimal understanding of genetic principles cannot effectively communicate with the basic scientist counterpart.

The role of the physician-scientist thus becomes more important if this paradigm is to be successful. Many clinical departments boast of large sums of National Institutes of Health funding, but it is not uncommon that the principal investigators within these departments are not actually the physicians themselves but rather the employees of the department. Although this may be a successful model in numerous instances, it does once again remove the surgeon another step away from the scientific discovery process.

By no means can the surgeon or clinician accomplish translational research alone or in a vacuum. Much as cancer centers specializing in cancer care have determined that multidisciplinary groups are necessary to deliver the best care to the patient, we are now becoming aware of the same requirements for the effective advancement of translational science. Just as it is ideal to have the surgeon, radiotherapist, pathologist, radiologist, and medical oncologist in the same room to discuss the real-time management of a patient’s disease, it is also ideal to bring together the necessary disciplines in science to meet the task at hand. This thought process has birthed the concept of systems biology, a new multidisciplinary approach to scientific discovery. The challenges of translating the benefits of sequencing the human genome to clinical medicine will require large concerted interdisciplinary efforts to be successful. In this regard, there is a need for multiple scientific disciplines to join together to solve problems, and there is a need for computer scientists to develop and write software to house and query the large datasets that are rapidly accruing. Mathematicians are needed to develop statistically based algorithms to analyze the data. Molecular biologists are indispensable to provide insight as to the significance of gene clusters and identifications. Pathologists are important for their capacity to identify suitable tissues and to perform microdissection to ensure that what is being examined in microarray analysis is what is desired. Chemists are necessary to assist in the drug-discovery process, which is yet another discovery science in itself. Molecular targets are now defined, and chemical agents addressing these novel targets are defined. 
The surgeon’s role goes beyond that of harvesting tissue (although this is a critical role): the surgeon can also provide important insight into the clinically relevant problems that need to be addressed by science through proper experimental design. A case in point is the development of institutes dedicated to the new field of systems biology.16,17 This interdisciplinary model goes beyond the term functional genomics in that it has directly implicated multiple scientific disciplines, working side by side, to address the presumed complexity of intermolecular relationships and networks in the cell. This is the future of scientific endeavor.

The Effect of Creative Experimental Design
The challenge of gene-expression profiling is translating the acquired data into something that may ultimately have a clinical effect. What seems to be a simple idea of comparing cell line A with cell line B, each with a definable genotype and phenotype, may turn into a large, complex list of expressed genes with no clear associations. Perhaps more confusing is the potential for clustering programs to permit the development of apparent molecular relationships that may not exist. For example, if two sets of microarray data are clustered by a hierarchical clustering algorithm, there will always be a positive and enticing result. In other words, there will always be gene sets that cluster on the basis of their direction and degree of over- or underexpression. However, just because gene A clusters with gene B does not imply that they are functionally related. How do we make sense of all of this information? The answer probably lies in how we design our experiments. Thoughtful experimental design can help weed out much confounding genetic noise that might inhibit progress toward understanding the function of genes. Gene-expression profiling can be used for many different types of applications that may be simple or complex in design. For example, whereas gene-expression profiling can be used to compare cell line A with cell line B, a more complex and perhaps informative design would be a comparison of cell line A with or without a drug versus cell line B with or without a drug.
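The warning that clustering always returns an enticing result can be demonstrated on pure noise. The sketch below is an illustration under stated assumptions, not an analysis from this review: it simulates 200 hypothetical "genes" measured in 6 samples by using random numbers only, then searches for the most strongly correlated pair, which is the kind of pair a correlation-based clustering algorithm would join first.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / (sxx * syy) ** 0.5


def strongest_pair(profiles):
    """Return (r, i, j) for the most positively correlated pair of profiles."""
    best = (-1.0, None, None)
    for i in range(len(profiles)):
        for j in range(i + 1, len(profiles)):
            r = pearson(profiles[i], profiles[j])
            if r > best[0]:
                best = (r, i, j)
    return best


rng = random.Random(0)  # fixed seed so the run is repeatable
# 200 simulated genes x 6 samples of random noise: no real biology at all.
noise = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
best_r, gene_a, gene_b = strongest_pair(noise)
```

With nearly 20,000 gene pairs and only six samples, some pairs correlate very strongly by chance alone, which is why a tight cluster is not, by itself, evidence of a functional relationship.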

We recently explored the potential for gene-expression profiling to identify new tumor markers and new tumor-progression markers for human colorectal cancer. Although a significant number of markers have been identified, few have been developed to the point that they are widely used in clinical practice. In fact, CEA is perhaps the only marker in widespread use. Because we sought to examine the expressed tumor-specific genes in a population of patients, we were interested in identifying the expressed genes common to most people in a selected group. For this reason, we rationalized that the gene expression of pooled patient samples might be similar to that derived from examination of large numbers of individual samples, yet would be significantly more efficient. RNA derived from tumor samples in groups of 5 to 10 was then pooled in equimolar amounts to produce a mixture that could be assessed by microarray. Before large numbers of human tumor tissues were examined, the pooling concept was first validated by experiments designed to compare a physical pool with the calculated mathematical pool by means of individual sample analysis. These experiments strongly suggested that pooling was a valid technique when information regarding the behavior of a population was sought. The pooling process was then applied to six sets of normal and tumor tissues derived from different clinical stages (benign mucosa, n = 10; benign adenoma, n = 10; liver metastases, n = 10; and carcinomas: Astler-Coller B1, n = 10; C2, n = 10; and D, n = 10). Interestingly, these experiments led to the identification of >300 tumor markers (distinguishing cancer from normal) and >100 tumor-progression markers (distinguishing one stage of cancer from another). Of the tumor-progression markers, the lead marker was identified to be osteopontin, a secreted glycoprotein with numerous functions that have been related to the progression and metastasis of cancer.
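The comparison of a physical pool with a calculated mathematical pool can be sketched as follows. The details of the validation are not given in this review, so the code assumes that the mathematical pool is the per-gene arithmetic mean of the individual samples (the natural counterpart of equimolar pooling) and that agreement is summarized by the largest relative deviation; all names and values are hypothetical.

```python
def mathematical_pool(samples):
    """Per-gene mean across individual samples (same gene order in each)."""
    n = len(samples)
    return [sum(values) / n for values in zip(*samples)]


def max_relative_deviation(physical, calculated):
    """Largest per-gene relative difference between the two pools."""
    return max(abs(p - c) / c for p, c in zip(physical, calculated))


# Hypothetical expression values for three genes in three individual tumors.
individual = [
    [100, 50, 200],
    [140, 70, 180],
    [120, 60, 220],
]
# Hypothetical measurement of the physically pooled RNA from the same tumors.
physical_pool = [118, 63, 205]

calculated_pool = mathematical_pool(individual)
worst = max_relative_deviation(physical_pool, calculated_pool)
```

A small worst-case deviation would support the view that one pooled hybridization captures the average behavior of the group, at the cost of losing information about variation between patients.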

The Pervasive Effects of Unraveling the Transcriptome
Twenty years ago, medical students were faced with what once seemed to be a daunting task: memorizing the elements of the Krebs cycle or other metabolic pathways. The students of the future, however, will soon be faced with the challenge of not only learning the classic biochemical pathways, but also learning many new intermolecular relationships that will emerge as novel pathways over time. For example, over the past decade we have witnessed the development of numerous signal transduction pathways. By using colon cancer as the paradigm, once the initial genetic elements were defined, a new neoplastic pathway was elucidated. The Wnt pathway, involving Apc, β-catenin, and other related molecules, is now an established molecular pathway that contributes to colon cancer development and progression.18 In this regard, a thorough understanding of molecular biology and all of its basic tenets is a must for the contemporary medical student. Moreover, the teachers of these medical students must be facile with this body of information to communicate it properly. The immediate translational benefits of the Human Genome Project are quite expansive and include the capacity to identify and understand the effects of specific mutations. One example is the capacity to screen for and act on familial forms of cancer linked to specific inherited genetic alterations, such as those linked to familial polyposis, hereditary nonpolyposis colorectal carcinoma, and hereditary breast cancer. The fallout from this vast wealth of data will not be realized for many years to come.

THE HUMAN PROTEOME

Now that the Human Genome Project is nearing completion, focus may be realigned to the development of human protein indices that will ultimately identify the structure and function of all human proteins, which may ultimately be more informative than understanding the sometimes evanescent messages associated with the transcriptome. The human proteome is the universe of human proteins and their isoforms that are constructed with the aid of messenger RNA and its splice products.19 Although this proteomics project was initiated nearly 40 years ago, it has not been pursued with the vigor of the Human Genome Project. With the development of protein indices comes the promise for the in vitro synthesis of these proteins for use in functional studies as well as for the production of antibodies that may have diagnostic, prognostic, and therapeutic applications. The challenge of identifying all human proteins is, however, enormous. Moreover, proteins may be posttranslationally modified by complex processes such as glycosylation and phosphorylation, which may be difficult to experimentally replicate. Proteins, unlike DNA, are built from 20 different amino acids rather than 4 nucleotide bases, and proteins have a final processed 3-dimensional form that is significantly more complex and cannot be predicted from the blueprints of genes.

Simply identifying all of the human proteins in the proteome is a very significant challenge, but certainly deciphering protein structure and function for these proteins will consume the time of scientists for years to come. The current technology for modern proteomics has its roots in the development of isoelectric focusing in the first dimension and sodium dodecyl sulfate electrophoresis in the second dimension, techniques first reported in 1975, almost simultaneously, by Klose,20 O’Farrell,21 and Scheele.22 The introduction of advances in mass spectrometry to the proteomics field has permitted the use of two-dimensional gel electrophoresis on a much larger scale, making this technology a viable tool for protein discovery and analysis.19

One very interesting statistic that has emerged from large-scale analyses of data derived from gene-expression profiling and synchronous two-dimensional gel analyses is the concept that there is very little correlation (r = 0.48) between the presence or absence of an expressed message and that of the cognate protein.23 These sorts of studies suggest that although development of the proteome has taken a back seat to that of the genome and transcriptome, the analysis of protein expression may ultimately be a gold standard for the interpretation of gene expression. In recent years, the practice of proteomics research has experienced a dramatic shift within the pharmaceutical and biotechnology industries, with the widespread implementation of novel applications. The areas of interest extend all the way from discovery of novel drug, vaccine, and diagnostic targets, characterization of protein-based products, toxicology, and identification of surrogate markers of activity in clinical research to the ability to provide information on the mechanisms of drug action. The power of two-dimensional gel electrophoresis and advances in mass spectrometric techniques, combined with sequence database correlation, have enabled speed and accuracy in the identification of proteins in complex mixtures.24 The science of protein discovery is likely to be the next growth area in human biology.
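To see why a correlation of 0.48 counts as weak, it can help to square it. The short sketch below assumes a simple linear relationship between message and protein abundance, in which case the squared correlation approximates the fraction of variation in one measurement accounted for by the other; the function name is illustrative.

```python
def shared_variance(r):
    """Squared correlation: fraction of variance explained under a linear model."""
    return r * r


reported_r = 0.48  # correlation quoted in the text
explained = shared_variance(reported_r)
unexplained = 1 - explained
```

Under that assumption, roughly 23% of the variation in protein abundance tracks the message level, which leaves about three quarters unaccounted for by transcript measurements alone.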

Chip-Based Cancer Management: One Tumor, One Chip
The future of functional genomics is bright and holds great promise for the discovery of new genes, new messages, and new proteins. The term discovery science has been coined to describe the great potential for gene-expression profiling technologies to generate new hypotheses based on enormous datasets not previously available. With the capacity to evaluate thousands of genes and/or proteins in a single experiment by using microarray technology, the potential for clinical translation of these data to human cancer is also enormous. For example, we25 and others26–29 have also begun to develop sophisticated molecular classifiers for a broad range of human cancers that may soon have clinical application. Currently, chip-based technologies are being used to derive gene expression patterns that predict the accurate tissue origin of a particular tumor on the basis of the simultaneous analysis of informative genes. For the first time, it is becoming possible to make the diagnosis of a particular cancer, as well as cancer subsets, without even examining the histology. This application of molecular profiling not only may eliminate the diagnostic category of the unknown primary cancer, but may also improve the diagnostic accuracy of current approaches by using classic histopathologic techniques combined with gene-by-gene immunohistochemical analyses. Moreover, it is now feasible to predict clinical outcome on the basis of gene-expression signatures,30–33 making it feasible to direct therapy to patients who will actually derive benefit.
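The notion of a molecular classifier that predicts tissue of origin can be made concrete with a toy example. The classifiers cited above are not described in this review, so the sketch below substitutes one of the simplest possible approaches, a nearest-centroid rule: each tumor type is summarized by its average expression profile, and an unknown tumor is assigned to the type whose average it most resembles. The tumor labels, the three "informative genes," and all expression values are hypothetical.

```python
import math


def centroid(profiles):
    """Per-gene mean of a list of expression profiles."""
    return [sum(values) / len(profiles) for values in zip(*profiles)]


def train(labeled_profiles):
    """Map each tumor type to the centroid of its training profiles."""
    return {label: centroid(profiles)
            for label, profiles in labeled_profiles.items()}


def predict(centroids, profile):
    """Assign the tumor type whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], profile))


# Hypothetical training data: expression of three informative genes.
training = {
    "colon": [[9, 1, 2], [8, 2, 1]],
    "breast": [[1, 9, 3], [2, 8, 2]],
}
model = train(training)
unknown_primary = [7, 2, 2]
call = predict(model, unknown_primary)
```

A real classifier would need many more genes and samples, careful gene selection, and validation on independent tumors, but the principle of matching an unknown profile to learned reference profiles is the same.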

It is easy to envision that the future of clinical cancer management will be based on the development of functional genomics and related disciplines, with specific emphasis on chip-based analyses of tumors. We are entering a chip-based era in medicine that will positively affect multiple scientific and medical disciplines. Information-rich gene-expression datasets will be applied clinically to predict diagnosis, prognosis, and possibly even therapeutic options. These predictions may result in a significant therapeutic paradigm shift by assisting in the selective administration of adjuvant chemotherapy and radiotherapy to the patient subsets who are actually at risk rather than treating the majority of patients to help a few. It may even be possible to predict which patients will actually benefit from extirpative surgical procedures, such as the Whipple procedure for pancreatic cancer (decision based on the survival benefit) or mastectomy for the patient with an axillary metastasis but no primary lesion detected in the breast (decision based on accurate diagnosis of occult breast cancer). Finally, gene-expression profiles may be used to predict the clinical response to both conventional and targeted therapeutics. It is equally likely that these technologies will provide us with more information than we can currently exploit to the patient’s advantage if selective and effective chemotherapeutic agents are not developed at a similar pace. The future of cancer management will likely change for the better with the incorporation of microarray gene chips, but significant challenges will remain that must be addressed by physicians and scientists in an interdisciplinary fashion. The surgeon’s role in this process will be instrumental.

APPENDIX: GLOSSARY OF TERMS

Allele: One of two or more alternative forms of a gene that can result in different gene products and, thus, different phenotypes. In a haploid set of chromosomes, there is only one allele at a given locus. Diploid organisms have two alleles at a given locus (e.g., a normal and a mutant allele). A single allele for each gene locus is inherited separately from each parent (e.g., at a locus for eye color, the allele might result in blue or brown eyes). An organism is homozygous for a gene if the alleles are identical and is heterozygous if they are different.
Annotation: The process of identifying, naming, and classifying genes. When novel gene sequences are discovered, they are usually identified, classified, and annotated on the basis of aggregate measures of sequence similarity.
Bioinformatics: A new discipline encompassing genetics, molecular biology, computational biology, and database construction.
CpG Island: Short region of DNA in which the frequency of the CG sequence is higher than in other regions. The "p" indicates that "C" and "G" are connected by a phosphodiester bond. CpG islands are often located around the promoters of housekeeping genes (which are essential for general cell functions) or other genes frequently expressed in a cell.
EST: Expressed sequence tag. A sequence-tagged site (STS) derived from a cDNA. An STS is a short (200 to 500 base pairs) DNA sequence that has a single occurrence in the human genome and whose location and base sequence are known. Detectable by polymerase chain reaction, STSs are useful for localizing and orienting the mapping and sequence data reported from many different laboratories and serve as landmarks on the developing physical map of the human genome.
Gene: An ordered sequence of nucleotides located in a particular position (locus) on a particular chromosome that encodes a specific functional product (the gene product, i.e., a protein or RNA molecule). It includes regions involved in regulation of expression and regions that code for a specific functional product.
Genetic polymorphism: The occurrence together in the same population of more than one allele or genetic marker at the same locus, with the least frequent allele or marker occurring more frequently than can be accounted for by mutation alone.
Genome: The complete set of genes for a particular species. All the genetic material in the chromosomes of a particular organism; its size is generally given as its total number of base pairs.
Genotype: The heritable information contained in an individual.
Homolog: Homologous genes can be separated into two classes: orthologs and paralogs. Orthologs are homologous genes that have diverged in sequence because of evolutionary separation between species; paralogs are homologous genes within a species that are the result of a gene duplication event within the lineage. The study of orthologs is of particular importance because it is assumed that these genes play similar developmental or physiological roles and, consequently, should share conserved functional and regulatory domains.
LINE: Long interspersed element; a type of large repetitive DNA segment found throughout the genome of eukaryotes.
Oligonucleotide: A short fragment of single-stranded DNA, typically 5 to 50 nucleotides.
Ortholog: Genes of similar sequence structure across eukaryotic species.
Pharmacogenomics: Pharmacogenetics and pharmacogenomics deal with the genetic basis underlying variable drug responses in individual patients.
Proteome: The complete set of expressed genes (expressed as proteins) for a particular species.
SINE: Short interspersed element. A type of small, dispersed, repetitive DNA sequence (e.g., the Alu family in the human genome) found throughout a eukaryotic genome.
SNP: Single nucleotide polymorphism; sequence polymorphism differing in a single base pair.
Transcriptome: The complete set of transcribed genes (expressed as messenger RNAs) for a particular species.

Acknowledgments

The appendix and acknowledgments are available online at www.annalssurgicaloncology.org.

The author thanks John Quackenbush, PhD, at The Institute for Genomic Research, Rockville, MD, for his helpful discussions in the preparation of this manuscript. Supported by The Director’s Challenge Grant UO-1 CA85052-01A1 and CA85429-01.

Footnotes

Molecular biology will soon meet clinical medicine head on. This review describes the potential for gene-expression profiling to assess diagnosis and prognosis and even predict therapy on the basis of the analysis of every tumor by a single microarray chip.

Received for publication May 29, 2002. Accepted for publication September 30, 2002.

REFERENCES

  1. Lewin R. Proposal to sequence the human genome stirs debate. Science 1986; 232: 1598–600.
  2. Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature 2001; 409: 860–921.
  3. Venter JC, Adams MD, Myers EW, et al. The sequence of the human genome. Science 2001; 291: 1304–51.
  4. Saha S, Sparks AB, Rago C, et al. Using the transcriptome to annotate the genome. Nat Biotechnol 2002; 20: 508–12.
  5. Adams MD, Celniker SE, Holt RA, et al. The genome sequence of Drosophila melanogaster. Science 2000; 287: 2185–95.
  6. Genome sequence of the nematode C. elegans: a platform for investigating biology. The C. elegans Sequencing Consortium. Science 1998; 282: 2012–8.
  7. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 2000; 408: 796–815.
  8. Webb T. SNPs: can genetic variants control cancer susceptibility? J Natl Cancer Inst 2002; 94: 476–8.
  9. Relling MV, Dervieux T. Pharmacogenetics and cancer therapy. Nat Rev Cancer 2001; 1: 99–108.
  10. Srinivas PR, Kramer BS, Srivastava S. Trends in biomarker research for cancer detection. Lancet Oncol 2001; 2: 698–704.
  11. Brock GJ, Huang TH, Chen CM, Johnson KJ. A novel technique for the identification of CpG islands exhibiting altered methylation patterns (ICEAMP). Nucleic Acids Res 2001; 29: E123.
  12. Yan PS, Wei SH, Huang TH. Differential methylation hybridization using CpG island arrays. Methods Mol Biol 2002; 200: 87–100.
  13. Suzuki H, Gabrielson E, Chen W, et al. A genomic screen for genes upregulated by demethylation and histone deacetylase inhibition in human colorectal cancer. Nat Genet 2002; 31: 141–9.
  14. Su AI, Cooke MP, Ching KA, et al. Large-scale analysis of the human and mouse transcriptomes. Proc Natl Acad Sci U S A 2002; 99: 4465–70.
  15. Quackenbush J. Computational analysis of microarray data. Nat Rev Genet 2001; 2: 418–27.
  16. Ideker T, Galitski T, Hood L. A new approach to decoding life: systems biology. Annu Rev Genomics Hum Genet 2001; 2: 343–72.
  17. Davidson EH, Rast JP, Oliveri P, et al. A genomic regulatory network for development. Science 2002; 295: 1669–78.
  18. Graham TA, Weaver C, Mao F, et al. Crystal structure of a beta-catenin/Tcf complex. Cell 2000; 103: 885–96.
  19. Sperling K. From proteomics to genomics. Electrophoresis 2001; 22: 2835–7.
  20. Klose J. Protein mapping by combined isoelectric focusing and electrophoresis of mouse tissues. A novel approach to testing for induced point mutations in mammals. Humangenetik 1975; 26: 231–43.
  21. O’Farrell PH. High resolution two-dimensional electrophoresis of proteins. J Biol Chem 1975; 250: 4007–21.
  22. Scheele GA. Two-dimensional gel analysis of soluble proteins. Characterization of guinea pig exocrine pancreatic proteins. J Biol Chem 1975; 250: 5375–85.
  23. Anderson L, Seilhamer J. A comparison of selected mRNA and protein abundances in human liver. Electrophoresis 1997; 18: 533–7.
  24. Lee KH. Proteomics: a technology-driven and technology-limited discovery science. Trends Biotechnol 2001; 19: 217–22.
  25. Agrawal D, Chen T, Irby R, et al. Osteopontin identified as lead marker of colon cancer progression, using pooled sample expression profiling. J Natl Cancer Inst 2002; 94: 513–21.
  26. Giordano TJ, Shedden KA, Schwartz DR, et al. Organ-specific molecular classification of primary lung, colon, and ovarian adenocarcinomas using gene expression profiles. Am J Pathol 2001; 159: 1231–8.
  27. Ramaswamy S, Tamayo P, Rifkin R, et al. Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci U S A 2001; 98: 15149–54.
  28. Alizadeh AA, Ross DT, Perou CM, van de Rijn M. Towards a novel classification of human malignancies based on gene expression patterns. J Pathol 2001; 195: 41–52.
  29. Su AI, Welsh JB, Sapinoso LM, et al. Molecular classification of human carcinomas by use of gene expression signatures. Cancer Res 2001; 61: 7388–93.
  30. Bhattacharjee A, Richards WG, Staunton J, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A 2001; 98: 13790–5.
  31. Shipp MA, Ross KN, Tamayo P, et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med 2002; 8: 68–74.
  32. Takahashi M, Rhodes DR, Furge KA, et al. Gene expression profiling of clear cell renal cell carcinoma: gene identification and prognostic classification. Proc Natl Acad Sci U S A 2001; 98: 9754–9.
  33. West M, Blanchette C, Dressman H, et al. Predicting the clinical status of human breast cancer by using gene expression profiles. Proc Natl Acad Sci U S A 2001; 98: 11462–7.


