The Samuel Roberts Noble Foundation, Inc.

Virus Evolution Workgroup: 1999 Workshop Abstract

Drift and conservatism in RNA virus evolution

Simon Wain-Hobson
Unité de Rétrovirologie Moléculaire, Institut Pasteur, 28 rue du Dr. Roux
75724 Paris cedex 15

There is no such thing as a perfect machine. Accordingly nucleic acid polymerisation is inevitably error prone. Yet by their notoriety and abundance RNA viruses are highly successful intracellular parasites. Indeed some estimates suggest that 80% of viruses have RNA genomes. It follows that replication without proof reading can be a successful strategy. There is a price to pay, however. Manfred Eigen was the first to point out that without proof reading there is a limit on the size of RNA genomes. Obviously if the mutation rate is too high then any RNA virus will collapse under mutation pressure. As it happens RNA viral genomes are up to 32 kb long while mutation rates are less than or equal to 1-2 per genome per cycle.
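
As a rough illustration of the figures just quoted, the sketch below converts a per-genome mutation rate into a per-site rate for a 32 kb genome and estimates the fraction of progeny genomes that carry no mutation at all. The Poisson assumption behind the second function, and both function names, are illustrative and do not come from the original text.

```python
import math

def per_site_rate(per_genome_rate: float, genome_length: int) -> float:
    """Average mutations per site per cycle for a genome of the given length."""
    return per_genome_rate / genome_length

def mutation_free_fraction(per_genome_rate: float) -> float:
    """Fraction of copies with zero mutations, ASSUMING mutations are
    Poisson distributed over the genome (an assumption of this sketch)."""
    return math.exp(-per_genome_rate)

for u in (1.0, 2.0):  # mutations per genome per cycle, as quoted in the text
    print(f"U = {u}: per-site rate for 32 kb = {per_site_rate(u, 32_000):.1e}, "
          f"mutation-free fraction = {mutation_free_fraction(u):.2f}")
```

Under these assumptions a substantial minority of progeny genomes remain identical to the parent even at 1-2 mutations per genome per cycle.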

Possibly RNA viruses and retroviruses simply have not invested in proof reading in which case mutations represent an inevitable genetic noise, to be tolerated or eliminated. Hence there would be no loss of fitness, fixed mutations being neutral. A corollary of this would be that the intrinsic life style of a virus is set in its genes. The alternative is to suppose that most fixed mutations are beneficial to the virus allowing it to keep ahead of the host and/or host population. By this token variation is an integral part of the viral modus vivendi. The twin requirements of a successful virus are replication and transmission. Under the rubric replication a virus could vary to increase its fitness, exploit different target cells or evade adaptive immune responses. In terms of transmission variation might allow a virus to overcome herd immunity.

These two scenarios emphasize the two sides of the molecular evolution debate; the former highlights neutrality while the latter puts a premium on positive selection. Purifying, or negative, selection is ever operative - a poor replicon invariably goes asunder. Through rounds of error and trial, positive selection is the only means of creating a novel replicon. So long as the ecological niche occupied doesn’t change the virus doesn’t need to change, purifying selection being sufficient to ensure existence. This raises an important issue: Ernst Mayr noted that "the brain of 100,000 years ago is the same brain that is now able to design computers" (Mayr, 1997). Positive fitness selection among mammals is effectively inoperative over our lifetimes. And certainly since we have known about HIV and AIDS.

How is it that vertebrates, invertebrates, plants, fungi and bacteria, all species with low genomic mutation rates, can control viruses which mutate so much faster - sometimes by a factor of 10^6 (Domingo et al., 1996; Gojobori and Yokoyama, 1985; Holland et al., 1982)? Yet they do. We come to the basic question - to what extent is genetic variation exploited by an RNA virus, if at all? And if so, what is the virus adapting to? The answer that is invariably given to the second question is the adaptive immune system (Seibert et al., 1995). Yet apart from the vertebrates none of the other groups mentioned above mount antigen specific immune responses. This chapter will argue that the majority of fixed mutations are neutral.

Diversity calculations - virological mayhem
The two rates touted by evolutionary minded virologists are the mutation rate and the mutation fixation rate. The first describes the rate at which mutations are generated; the second attempts to describe their fixation within the population sampled over a period of time. In the exceptional case where all substitutions are neutral the mutation rate (m) equals the fixation rate (f). If fixation rates are measured over the space of one year then f = n•m where n is the annual number of consecutive rounds of replication. It appears that such a situation applies to the evolution of parts of the SIV and HIV-1 genomes over the space of 1-3 years (Pelletier et al., 1995; Plikat et al., 1997). It is simple to show that several hundred rounds of sequential replication are required (Pelletier et al., 1995; Wain-Hobson, 1993). Given that the proviral load of an HIV-1 positive patient (~10^7-10^9) changes by less than a factor of 10 over five or more years, and assuming that an infected cell produces sufficient virus to generate two productively infected cells, annual production would be something akin to 2^200, or 10^60, which is impossible. Clearly the productive burst size of 2 is too large (Wain-Hobson, 1993; Wain-Hobson, 1993). This must be reduced to 1.1 to achieve a realistic proviral load (1.1^200 ~ 10^8). Note that the real value for the effective burst size must be even lower as proviral load is turning over more slowly than once a day. Yet to explain the temporal increase in proviral load the productive burst size must be greater than or equal to 2. Thus the calculation reveals massive destruction of infected cells, precisely as would be expected from immensely powerful innate and adaptive immune responses.
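
The burst-size arithmetic above can be checked with a short sketch. It assumes 200 consecutive rounds of replication per year (the exponent used in the text) and a single founder cell; the function name `annual_production` is hypothetical and used only for illustration.

```python
import math

def annual_production(burst_size: float, rounds: int = 200) -> float:
    """Infected cells after `rounds` sequential cycles, starting from one cell,
    if each infected cell gives rise to `burst_size` productively infected cells."""
    return burst_size ** rounds

for burst in (2.0, 1.1):
    total = annual_production(burst)
    # Report the order of magnitude only, as in the text.
    print(f"burst size {burst}: ~10^{math.log10(total):.0f} infected cells")
```

With these inputs a burst size of 2 gives a total of the order of 10^60, while 1.1 gives roughly 10^8, in line with the proviral loads quoted above.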

In the situation where purifying selection is in evidence some additional factor must be introduced to couple the fixation and mutation rates. As the accumulation of most substitutions proceeds in a protein specific linear manner for small degrees of divergence, the above equation can be modified to f = P•n•m, where 1>P>0 is a constant indicating the degree of negative selection. Note immediately that if P<1 then more rounds of replication are needed to produce the same percent amino acid fixation. A corollary is an even greater degree of destruction of infected cells. Consider the example of a virus which is fixing substitutions only slowly, ~10^-5 per site per year, something like the Ebola virus glycoprotein. The mutation rate for Ebola is not known but is probably around 10^-4 per site per cycle (Drake, 1993). Hence P•n ~ 10^-1. What is the value of n? Most mammalian viruses replicate within 24 hours while obviously outside of a body they do not replicate. Consequently a value of n = 50-200 is probably not unreasonable. Accordingly P ~ 2 × 10^-3 to 5 × 10^-4. This means that most mutations generated are deleterious. Of those that are fixed most are neutral as has been discussed above.
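
The estimate of P follows directly from rearranging f = P•n•m. A minimal sketch, using the approximate Ebola figures quoted above; the function name `selection_factor` is hypothetical:

```python
def selection_factor(fixation_rate: float, mutation_rate: float,
                     rounds_per_year: float) -> float:
    """Solve f = P * n * m for P, the fraction of mutations escaping
    negative selection."""
    return fixation_rate / (mutation_rate * rounds_per_year)

f = 1e-5  # substitutions fixed per site per year (approximate, from the text)
m = 1e-4  # mutations per site per cycle (the text's "probably around")

for n in (50, 200):  # assumed range of replication rounds per year
    print(f"n = {n}: P ~ {selection_factor(f, m, n):.0e}")
```

Because f/m is fixed at about 10^-1, any larger value of n pushes P lower still, which is the sense in which most mutations generated must be deleterious.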

The last two sentences describe a profoundly conservative strategy - RNA viruses are seen merely to replicate, far more than to give rise to genetically distinct, even exotic, siblings. What a stultifying picture, in contrast to the shock-horror of tabloid newspaper virology and that atmospheric, yet profoundly ambiguous term, emerging viruses.

It is interesting that in a few areas of RNA virology much has been made of escape from the adaptive immune response, particularly cytotoxic T lymphocytes, so leading to persistence (McMichael and Phillips, 1997; Nowak and McMichael, 1995). However, it is not at all obvious that this be the case (Wain-Hobson, 1996). It must not be forgotten that it is possible to vaccinate against a number of RNA viruses such as measles, polio and yellow fever. Be that as it may, many DNA viruses, intracellular bacteria and parasites persist. In these cases de novo genetic variation arising from point mutations is too slow a means to thwart an adaptive immune response. For example, after 1700 generations, under experimental conditions whereby Muller’s ratchet was operative, S. typhimurium accumulated mutations such that 1% of the 444 lineages tested had suffered an obvious loss of fitness (Andersson and Hughes, 1996). This number of generations could be achieved within as little as 45 days giving an idea as to the time necessary to generate a mutation affecting fitness. This is more than enough time to make a vigorous immune response. Some inklings of immune system escape for the herpes virus EBV (de Campos-Lima et al., 1993; de Campos-Lima et al., 1994) came to nought (Burrows et al., 1996; Khanna et al., 1997). When antigenic variation is in evidence among DNA based microbes it invariably results from the use of cassettes and multicopy genes rather than point mutations resulting from DNA replication. And of course such complex systems could have only come about by natural selection.

Finally, de novo genetic variation of an RNA virus has never been suggested or shown to be necessary for the course of an acute infection. For a virus to persist thanks to genetic variation the phenomenon of epitope escape must be strongly in evidence by the time of seroconversion, generally 5-6 weeks. Yet such data is not forthcoming, and not for want of trying. When viruses do play tricks with the immune system it is invariably by way of specific viral gene products that interfere with the mechanics of adaptive and innate immunity (Ploegh, 1998). In the clear cases where genetic variation is exploited by RNA viruses it is used to overcome barriers to transmission set up by the host population, e.g. herd immunity, and not barriers to replication within a host. The obvious example is influenza A virus antigenic variation in mammals.

Many ways to be a virus
John Maynard Smith’s argument was simply put. He noted that for organisms with a base substitution rate of <1 per genome per cycle all intermediates linking any two sequences must be viable, otherwise the lineage would go extinct. The example used was self explanatory: WORD → WORE → GORE → GONE → GENE (Maynard Smith, 1970). The same is true for viruses even though their mutation rates are 6 orders of magnitude higher; the rate for a given protein is still <1 per cycle. Even for rather stable viruses like Ebola/Marburg and HTLV-1/-2 the number of intermediates is huge. While sequence space is basically impossible to comprehend, the amount accessible to a virus remains huge. For the lineage to exist the probability of finding a viable mutant must be greater than or equal to 1/population size within the host.
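
Maynard Smith's word ladder can be checked mechanically. The sketch below only verifies that each step is a single substitution; whether each intermediate is "viable" (a real word) is not tested. The function names are illustrative.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def is_single_step_path(path: list[str]) -> bool:
    """True if every consecutive pair differs by exactly one substitution."""
    return all(hamming(a, b) == 1 for a, b in zip(path, path[1:]))

path = ["WORD", "WORE", "GORE", "GONE", "GENE"]
print(is_single_step_path(path))    # True: each step changes one letter
print(hamming(path[0], path[-1]))   # 4: the end points differ at every position
```

The end points share no letters, yet they are linked by four single substitutions, each passing through a meaningful word.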

Are viruses optimized?
The molecular biologist frequently thinks like an engineer who can redesign from scratch. Yet replicons have been constrained by a series of historical events representing variations on a founding theme. While they are fit enough to survive are they the best possible? This question is salutary for we live in a society that is more and more competitive and, thanks to global communications, knows about the most successful athletes or businessmen world wide. Yet who can remember the name of any Olympic athlete who came in fourth? Fourth best in any large population is remarkable.

How good are viruses as machines? Once again let us look at some examples from HIV-1. Reverse transcription feeds on cytoplasmic dNTPs. Yet by supplementing the culture milieu with deoxycytidine - which is scavenged and phosphorylated to the triphosphate - virus replication was substantially increased (Meyerhans et al., 1994). It is known that good expression of a foreign protein is frequently compromised by inappropriate codon usage. By redesigning codon usage of the jellyfish (Aequorea victoria) green fluorescent protein gene to correspond to that typical of mammalian genes, greatly improved expression was achieved in mammalian cells (Haas et al., 1996). The same group engineered codon usage of the HIV-1 gp120 glycoprotein gene segment to correspond to that of the abundantly expressed human Thy-1 surface antigen. Expression was greatly improved (Haas et al., 1996). The coup de grace came with the reciprocal experiment - engineering Thy-1 gene codon usage to correspond to that of gp120. Thy-1 surface expression was greatly reduced (Haas et al., 1996). It has been known since the first HIV-1 sequence that its codon usage was highly biased (Bronson and Anderson, 1994; Wain-Hobson et al., 1985). Something is clearly operating which overrides maximal envelope expression. Furthermore, gp120 codon usage is similar for all other HIV-1 genes whether they be structural or regulatory. For that matter, codon usage is comparable for most lentiviruses (Bronson and Anderson, 1994; Sonigo et al., 1985).

It was possible to show via DNA vaccination that codon engineered gp120 elicited stronger immune responses in mice than the normal counterpart (Andre et al., 1998). Might this finding suggest that the optimum is actually away from mass production? If there is a shadow of reality in this thesis, it indicates that fitness optima in vivo may not necessarily parallel the expectations of fitness based on ex vivo models. In this context note also that human T cell leukemia virus type 1, a retrovirus, infects exactly the same cell as HIV yet its codon usage is very different from that of HIV and the Thy-1 gene (Seiki et al., 1983).

If fitness optimisation were ever operative in vivo then one would predict steady increases in virulence for those viruses that do not set up herd immunity. At some point a plateau would be reached. Yet the higgledy-piggledy way by which virulent strains come and go suggests that this is not so.

An origin of viral species?
A virus must replicate sufficiently within a host to permit infection of another susceptible host. If the new host is of the same species differences between the two are minimal - a small degree of polymorphism being inevitable in outbred populations. Given that viruses with a small coding capacity interact particularly intimately with the host cell machinery, it follows that infection of a host from a related species has a greater probability of succeeding if the cellular machinery is comparable. Indeed, the closer the two species the greater the probability. In turn, if the virus gets a toehold and can generate a quasispecies then only a few mutations would probably be necessary to adapt to the new niche.

Yet species is a difficult word. What might a viral species be? Martin (1993) wrote a fascinating review on the number of extinct primate species estimated from the fossil record. Depending on the emergence time of primates of modern aspect he was able to estimate the total number to have existed as 5500-6500. The present number of 200 primate species would thus represent ~3.4-3.8%. More important from our viewpoint was his calculation of the average survival time of fossil primate species as a mere 1 million years (Martin, 1993). Given that RNA viruses are fixing mutations approximately one million times faster than mammals (Domingo et al., 1996; Gojobori and Yokoyama, 1985; Holland et al., 1982), a viral species would become extinct after approximately one year! Immediately the annual influenza A strain comes to mind. Yet rabies, polio and HTLV-1 have arguably been around for millennia. Clearly the word species when taken from primatology cannot apply to the viral world. Frogs provide a more interesting example. They have been around for several hundred million years and members of some lineages can interbreed despite 75 million years of separation. Naturally, their protein sequences have not stood still during that time (Wilson et al., 1977). Enough is conserved to allow breeding. Maybe the primate picture has undue weight in our appreciation of virology. Phenotype can be maintained despite changes in genotype - obvious to a biologist.

As usual Holland wasn’t far from the mark when he wrote "As human populations continue to grow exponentially, the number of ecological niches for human RNA virus evolution grows apace and new human virus outbreaks will likely increase apace. Most new human viruses will be unremarkable - that is they will generally resemble old ones. Inevitably, some will be quite remarkable, and quite undesirable. When discussing RNA virus evolution, to call an outbreak (such as AIDS) remarkable is merely to state that it is of lower probability than an unremarkable outbreak" (Holland et al., 1992). New viruses can and do emerge but on a scale that is probably fifteen to twenty logs less than the number of viral mutants generated up to that defining moment (Wain-Hobson, 1993). They will result from a small number of mutations and a dose of reproductive isolation.

Conclusion
The above has attempted to show that the vast majority of genetic changes fixed by RNA viruses are essentially neutral or nearly neutral in character. Positive selection exploits a small proportion of genetic variants, while functional sequence space is sufficiently dense to allow viable solutions to be found. Although evolution has connotations of change what has always counted is natural selection or adaptation. It is the only force for the genesis of a novel replicon. Once adapted to its niche there is no need to change. In such circumstances an RNA virus would no longer be adapting, even though it could be changing.

Why is the evolution of RNA viruses so conservative? Why do they mutate rapidly yet remain phenotypically stable? The lack of proof reading proscribes the genesis of large genomes, restricting their genome sizes to a range of one log. Among the smallest RNA and retroviruses are MS2 and hepatitis B virus, both ~3 kb, while the largest are the coronaviruses at less than or equal to 32 kb. Most of their proteins are structural or regulatory and take up the largest part of the coding capacity of the virus. Additional proteins broadening the range of interactions with the host cell, or rendering the replicon more autonomous, are relatively few. Large, gene-sized duplications that may contribute to diversification and novel phenotypes are rare, reducing the exploration of new horizons. Thus, RNA virus evolution is probably conservative because these viruses cannot shuffle domains and so generate new combinations.

That the information capacity of RNA viral genomes is limited by a lack of proof reading is neither here nor there for they are remarkably successful parasites. RNA viruses change far more than they adapt.

Acknowledgements
We would like to thank past and present members of the laboratory and numerous colleagues for endless discussions over the years. This laboratory is supported by grants from the Institut Pasteur and l’Agence Nationale pour la Recherche sur le SIDA.

 

References
Andersson, D. I., and Hughes, D. (1996). Muller’s ratchet decreases fitness of a DNA-based microbe. Proc. Natl. Acad. Sci. USA 93, 906-907.

Andre, S., Seed, B., Eberle, J., Schraut, W., Bultmann, A., and Haas, J. (1998). Increased immune response elicited by DNA vaccination with a synthetic gp120 sequence with optimized codon usage. J. Virol. 72, 1497-1503.

Bronson, E. C., and Anderson, J. N. (1994). Nucleotide composition as a driving force in the evolution of retroviruses. J. Mol. Evol. 38, 506-532.

Burrows, J. M., Burrows, S. R., Poulsen, L. M., Sculley, T. B., Moss, D. J., and Khanna, R. (1996). Unusually high frequency of Epstein-Barr virus genetic variants in Papua New Guinea that can escape cytotoxic T-cell recognition: implications for virus evolution. J. Virol. 70, 2490-2496.

de Campos-Lima, P. O., Gavioli, R., Zhang, Q. J., Wallace, L. E., Dolcetti, R., Rowe, M., Rickinson, A. B., and Masucci, M. G. (1993). HLA-A11 epitope loss isolates of Epstein-Barr virus from a highly A11+ population. Science 260, 98-100.

de Campos-Lima, P. O., Levitsky, V., Brooks, J., Lee, S. P., Hu, L. F., Rickinson, A. B., and Masucci, M. G. (1994). T cell responses and virus evolution: loss of HLA A11-restricted CTL epitopes in Epstein-Barr virus isolates from highly A11-positive populations by selective mutation of anchor residues. J. Exp. Med. 179, 1297-1305.

Domingo, E., Escarmis, C., Sevilla, N., Moya, A., Elena, S. F., Quer, J., Novella, I. S., and Holland, J. J. (1996). Basic concepts in RNA virus evolution. FASEB J. 10, 859-864.

Drake, J. W. (1993). Rates of spontaneous mutations among RNA viruses. Proc. Natl. Acad. Sci. USA 90, 4171-4175.

Gojobori, T., and Yokoyama, S. (1985). Rates of evolution of the retroviral oncogene of Moloney murine sarcoma virus and of its cellular homologues. Proc. Natl. Acad. Sci. USA 82, 4198-4201.

Haas, J., Park, E. C., and Seed, B. (1996). Codon usage limitation in the expression of HIV-1 envelope glycoprotein. Curr. Biol. 6, 315-324.

Holland, J., Spindler, K., Horodyski, F., Grabau, E., Nichol, S., and VandePol, S. (1982). Rapid evolution of RNA genomes. Science 215, 1577-1585.

Holland, J. J., de la Torre, J. C., and Steinhauer, D. A. (1992). RNA virus populations as quasispecies. Curr. Top. Microbiol. Immunol. 176, 1-20.

Khanna, R., Burrows, S. R., and Burrows, J. M. (1997). The role of cytotoxic T lymphocytes in the evolution of genetically stable viruses. Trends Microbiol. 5, 64-69.

Martin, R. D. (1993). Primate origins: plugging the gaps. Nature 363, 223-234.

Maynard Smith, J. (1970). Natural selection and the concept of a protein space. Nature 225, 563-564.

Mayr, E. (1997). This is Biology (Cambridge, MA: The Belknap Press of The Harvard University Press).

McMichael, A. J., and Phillips, R. E. (1997). Escape of human immunodeficiency virus from immune control. Annu. Rev. Immunol. 15, 271-296.

Meyerhans, A., Vartanian, J. P., Hultgren, C., Plikat, U., Karlsson, A., Wang, L., Eriksson, S., and Wain-Hobson, S. (1994). Restriction and enhancement of human immunodeficiency virus type 1 replication by modulation of intracellular deoxynucleoside triphosphate pools. J. Virol. 68, 535-540.

Nowak, M. A., and McMichael, A. J. (1995). How HIV defeats the immune system. Scient. Am. 273, 58-65.

Pelletier, E., Saurin, W., Cheynier, R., Letvin, N. L., and Wain-Hobson, S. (1995). The tempo and mode of SIV quasispecies development in vivo calls for massive viral replication and clearance. Virology 208, 644-652.

Plikat, U., Nieselt-Struwe, K., and Meyerhans, A. (1997). Genetic drift can dominate short-term human immunodeficiency virus type 1 nef quasispecies evolution in vivo. J. Virol. 71, 4233-4240.

Ploegh, H. L. (1998). Viral strategies of immune evasion. Science 280, 248-253.

Seiki, M., Hattori, S., Hirayama, Y., and Yoshida, M. (1983). Human adult T-cell leukemia virus: complete nucleotide sequence of the provirus genome integrated in leukemia cell DNA. Proc. Natl. Acad. Sci. USA 80, 3618-3622.

Sonigo, P., Alizon, M., Staskus, K., Klatzmann, D., Cole, S., Danos, O., Retzel, E., Tiollais, P., Haase, A., and Wain-Hobson, S. (1985). Nucleotide sequence of the visna lentivirus: relationship to the AIDS virus. Cell 42, 369-382.

Wain-Hobson, S. (1993). The fastest genome evolution ever described: HIV variation in situ. Current Opinion in Genetics & Development 3, 878-883.

Wain-Hobson, S. (1993). Viral burden in AIDS. Nature 366, 22.

Wain-Hobson, S. (1996). Running the gamut of retroviral variation. Trends Microbiol. 4, 135-141.

Wain-Hobson, S., Sonigo, P., Danos, O., Cole, S., and Alizon, M. (1985). Nucleotide sequence of the AIDS virus, LAV. Cell 40, 9-17.

Wilson, A. C., Carlson, S. S., and White, T. J. (1977). Biochemical evolution. Ann. Rev. Biochem. 46, 573-639.

Abstract - Presented at the Virus Evolution Workshop
Ardmore, OK
October 21 - 24th, 1999

 


 
