The Samuel Roberts Noble Foundation, Inc.

Virus Evolution Workgroup: 1999 Workshop Abstract


Evolutionary dynamics of P gene overlapping reading frames in the Paramyxoviridae and Rhabdoviridae

I. King Jordan (1), Ben A. Sutter IV (2) and Marcella A. McClure (1)
(1) Department of Microbiology and Center for Computational Biology,
Montana State University, Bozeman, MT, USA
(2) Department of Microbiology, College of Physicians and Surgeons,
Columbia University, New York, NY, USA

Paramyxoviridae and Rhabdoviridae are families of negative-strand RNA viruses belonging to the order Mononegavirales. Presented here is an analysis of the molecular evolutionary dynamics of the P gene among 76 representative sequences from these two families. The P gene is distinguished by the fact that, in a number of Paramyxoviridae taxa, it encodes multiple gene products from overlapping reading frames. In vesicular stomatitis virus (VSV) of the Rhabdoviridae family, the P gene also encodes multiple gene products from a single genomic template. Paramyxoviridae and Rhabdoviridae P gene-encoded products include the phosphoprotein (P), as well as the C and V proteins. The complexity of the P gene makes it an intriguing locus to study from an evolutionary perspective. The presence of overlapping reading frames provides the opportunity to examine de novo molecular evolution of unique coding sequences.

To evaluate the steps involved in the evolution of P gene overlapping reading frames, the coding capacities were mapped most parsimoniously onto the phylogeny of the Paramyxoviridae taxa. Coding capacity for a single P protein is the ancestral state of the P gene, and multiple coding capacities that include combinations of the P, C and V proteins are derived. The two most parsimonious paths of P, C and V protein coding capacity evolution along the Paramyxoviridae tree each involve seven steps. The two paths differ in how those seven steps divide into gains and losses. One path involves a gain of both C and V protein coding capacity along the same lineage and requires four gains and three losses of P gene coding capacity. The other path requires two independent gains of C protein coding capacity and consists of five gains and two losses. The latter path can be considered less likely because of its higher number of P gene coding capacity gains and its two independent gains of C ORFs.
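The idea of counting the minimum number of steps needed to explain a character on a tree can be sketched with Fitch parsimony for a single binary character (for instance, presence or absence of a C ORF). This is only an illustration: the abstract does not state which parsimony procedure or software was used, and the tree, tip names and tip states below are hypothetical, not the Paramyxoviridae phylogeny.

```python
# Minimal Fitch parsimony for one binary character (1 = ORF present, 0 = absent).
# ASSUMPTION: toy_tree and toy_states are made up for illustration only.

def fitch(node, tip_states):
    """Return (state_set, changes) for the subtree rooted at `node`.

    A node is either a tip name (str) or a 2-tuple of child nodes.
    """
    if isinstance(node, str):
        return {tip_states[node]}, 0
    left, right = node
    left_set, left_changes = fitch(left, tip_states)
    right_set, right_changes = fitch(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # Children share no state: one additional change is required.
    return left_set | right_set, left_changes + right_changes + 1


toy_tree = ((("A", "B"), ("C", "D")), ("E", "F"))
toy_states = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 0, "F": 0}

root_set, min_changes = fitch(toy_tree, toy_states)
print("possible root states:", root_set)
print("minimum number of steps:", min_changes)
```

The step count alone does not separate gains from losses. Splitting equally short paths into gains and losses, as done in the text, additionally requires fixing the ancestral state (here, P-only coding capacity) and reading the direction of each change along the tree.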

Relative levels of amino acid sequence identity for polypeptide sequences encoded by the overlapping reading frames were compared among all P-, C- and V-protein-encoding Paramyxoviridae taxa, as well as the Rhabdoviridae VSV taxa. Comparisons were made between the regions of the ancestral P protein and the derived regions of C and V encoded by the same nucleotides from overlapping reading frames. Surprisingly, proteins encoded in overlapping reading frames from the same nucleotides were found to have different levels of amino acid variation. For example, levels of P, C and V amino acid variation within and between the Paramyxovirus and Morbillivirus genera revealed that in all cases C and V amino acid sequences were more conserved than the P amino acid sequences encoded by the same nucleotides in an overlapping reading frame. For VSV, by contrast, C amino acid sequences are less conserved than the corresponding P amino acid sequences.
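How the same nucleotides can yield different levels of amino acid identity in two reading frames can be shown with a small sketch. It translates two aligned nucleotide sequences in frame 0 and in frame +1 using the standard genetic code and compares percent identity in each frame. The sequences are hypothetical toy data, not viral P gene sequences, and the function names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, offset):
    """Translate `seq` from `offset` (0, 1 or 2); incomplete trailing codons are ignored."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(offset, len(seq) - 2, 3)
    )


def percent_identity(a, b):
    """Percent of identical positions between two equal-length, ungapped strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# ASSUMPTION: toy aligned sequences differing at two nucleotide sites.
seq_a = "ATGGCTGAAACC"
seq_b = "ATGGCCGAGACC"

for offset in (0, 1):
    prot_a = translate(seq_a, offset)
    prot_b = translate(seq_b, offset)
    print(offset, prot_a, prot_b, round(percent_identity(prot_a, prot_b), 1))
```

In this toy case both nucleotide differences fall on third codon positions of frame 0 and leave its protein unchanged, while the same differences fall on second codon positions of frame +1 and change two of its three residues.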

To better understand the broad patterns of comparative amino acid identity for the regions of the P gene-encoded polypeptides described above, the nucleotide changes that underlie this variation were examined and the nature of the selective forces that have acted on P gene overlapping reading frames was evaluated. Six clades of viral taxa that included enough closely related sequences to obtain meaningful results (avoiding the problem of nucleotide saturation) were analyzed: the Sendai clade, the human parainfluenza-3 clade, the measles clade, the phocine/canine distemper clade, the VSV-IN clade and the VSV-NJ clade. To evaluate the effects of selection on these regions, the ratio of nonsynonymous (dn) to synonymous (ds) nucleotide diversity was calculated. A ratio below one (a relative excess of synonymous substitutions) is indicative of negative selection, presumably due to functional constraints on the protein sequence. The action of selection was also assessed by examining the distribution of variable nucleotide sites across the three codon positions for overlapping reading frames. These measures of nucleotide variation revealed that in every case, the evolution of one of the proteins encoded in the overlapping reading frames has been constrained by negative (purifying) selection while the protein encoded in the other frame has evolved more rapidly. For the Paramyxoviridae, the integrity of the overlapping reading frame that represents a derived state is generally maintained at the expense of the ancestral reading frame encoded by the same nucleotides. For example, all Paramyxoviridae intraclade comparisons show that the C ORF is severely constrained by negative selection while the P ORF encoded by the same nucleotides is much less constrained. An exception to this pattern can be seen for the V ORF of the human parainfluenza-3 clade. In this clade, the ancestral P ORF is constrained by negative selection while the derived V ORF encoded by the same nucleotides is not. This case likely represents a recent secondary loss of V coding capacity. In Rhabdoviridae, both VSV clades have patterns of nucleotide variation that indicate conservation of the P reading frame by negative selection and relatively rapid evolution of the C frame encoded by the same nucleotides.
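The two measures described above can be illustrated with a simplified sketch that, for each reading frame, classifies every differing nucleotide site between two aligned sequences as synonymous or nonsynonymous and tallies which codon position it occupies. This is not a dn/ds estimator: a real estimate also normalizes by the numbers of synonymous and nonsynonymous sites and handles codons with multiple differences, and the abstract does not specify which estimator was used. The sequences and function name are hypothetical.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_differences(ref, alt, offset):
    """Classify differing sites between aligned `ref` and `alt` in one frame.

    Returns (kinds, positions): `kinds` counts "syn" and "nonsyn" differences,
    `positions` counts differences at codon positions 1, 2 and 3.
    SIMPLIFICATION: each difference is scored alone against the `ref` codon.
    """
    kinds, positions = Counter(), Counter()
    # Only sites inside complete codons of this frame are considered.
    end = offset + 3 * ((len(ref) - offset) // 3)
    for site in range(offset, end):
        if ref[site] == alt[site]:
            continue
        pos = (site - offset) % 3          # 0-based position within the codon
        start = site - pos
        ref_codon = ref[start:start + 3]
        mutated = ref_codon[:pos] + alt[site] + ref_codon[pos + 1:]
        same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[mutated]
        kinds["syn" if same_aa else "nonsyn"] += 1
        positions[pos + 1] += 1
    return kinds, positions


# ASSUMPTION: toy aligned sequences, not viral data.
ref = "ATGGCTGAAACC"
alt = "ATGGCCGAGACC"

for offset in (0, 1):
    kinds, positions = classify_differences(ref, alt, offset)
    print("frame offset", offset, dict(kinds), dict(positions))
```

In the toy data the same two differences are synonymous third-position changes in frame 0 and nonsynonymous second-position changes in frame +1, the kind of asymmetry the text interprets as one frame being constrained while the other evolves more freely.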

The evolution of overlapping reading frames in the Paramyxoviridae and Rhabdoviridae P genes is likely a response to selective pressure to maximize genomic information content while maintaining a small genome. The ability to adopt such a complex genomic strategy is intimately related to RNA virus quasispecies dynamics. These dynamics are characterized by accelerated rates of evolution due to high population numbers, rapid replication of RNA viruses and the error-prone nature of RNA genome replication. The distribution of variants in the quasispecies that results from the high mutation rate includes a large reservoir of variants with novel and potentially successful phenotypes. RNA viruses therefore may have an enhanced ability to explore adaptive landscapes. This enhanced adaptability can in turn lead to the evolution of novel and complex genomic strategies, of which overlapping coding sequences are one example. Thus, elucidation of the molecular evolutionary dynamics of the Paramyxoviridae and Rhabdoviridae P gene multi-coding sequences illustrates the profound effect of quasispecies population dynamics on RNA virus biology.

 

Abstract - Presented at the Virus Evolution Workshop
Ardmore, OK
October 21 - 24th, 1999

 


 

To contact the organizers:
e-mail: mroossinck@noble.org

Dr. Marilyn Roossinck
Plant Biology Division
The Noble Foundation
P.O. Box 2180
Ardmore, OK 73402

phone: 580 224-6630