Molecular evolution of herpes simplex virus 2 complete genomes: Comparison between primary and recurrent infections

Miguel Minaya1*, Travis Jensen2, Johannes Goll2, Maria Korom1, Sree H. Datla1, Robert B. Belshe3, and Lynda A. Morrison1,3

1 Department of Molecular Microbiology and Immunology, Saint Louis University School of Medicine, St. Louis, Missouri, USA
2 The EMMES Corporation, Rockville, Maryland, USA
3 Department of Internal Medicine, Saint Louis University School of Medicine, St. Louis, Missouri, USA
* [email protected]

Importance
This research presents, for the first time, a comparison of whole herpes simplex virus type 2 (HSV-2) genome sequences of primary isolates and isolates from later recurrent disease episodes. HSV-2 is a large, double-stranded DNA virus that causes lifelong persistent infections characterized by periods of quiescence and recurrent disease. The extent to which the HSV-2 genome evolves during multiple episodes of reactivation from its latent state within an infected individual is not known. Next Generation Sequencing (NGS) techniques were used to determine whole genome sequences of four viral samples: two from primary isolates and two from recurrent isolates. Nineteen polymorphisms unique to the primary or recurrent isolate were identified, 10 in Subject A and 9 in Subject B. These observations suggest remarkable genetic conservation between primary and recurrent episodes of HSV-2 infection and imply that strong selection pressures exist to maintain fidelity of the viral genome during repeated reactivations from its latent state. The genome conservation observed has implications for the potential success of a therapeutic vaccine.

1. Organizational chart

• Subject A: sample 08 (primary) and sample 14 (5th episode)
• Subject B: sample 16 (primary) and sample 19 (6th episode)
• Expand isolates once in Vero cells
• DNA extraction
• Illumina Hi-Seq sequencing
• De novo contig assembly
• Map to HG52
• Draft viral genome >97% coverage
• Align reads against HG52
• Bowtie2
• Novoalign
• Select Novoalign as the most plausible alignment
• Sanger sequencing of regions with poor read coverage (read depth < 25%)
• Manually adjust alignments of Sanger sequences + Novoalign consensus sequences
• 2. Complete HSV-2 genomes for samples 08, 14, 16 and 19
• 3. Evolutionary relationships among low-passage strains from North America
• 4. Non-synonymous variants unique to the primary or recurrent infection in ≥10% of read depth
• 5. Non-synonymous variant producing a change in protein conformation: the case of pUL37
• 6. Comparison of non-synonymous changes to the HG52 strain
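The criterion named in the organizational chart, variants unique to the primary or recurrent infection in ≥10% of read depth, can be illustrated with a minimal Python sketch. The function names and the exact uniqueness rule (an allele passes the cutoff in one isolate but not in the other) are assumptions made for illustration, not the authors' pipeline. The allele percentages are taken from two Subject A rows of panel 4.

```python
# Sketch only: the function names and the exact uniqueness rule are assumptions,
# not the authors' pipeline. Percentages are read-depth percentages from panel 4.

CUTOFF_PERCENT = 10  # ">= 10% of read depth"


def alleles_passing(percent_by_allele, cutoff=CUTOFF_PERCENT):
    """Return the alleles supported by at least `cutoff` percent of reads."""
    return {allele for allele, pct in percent_by_allele.items() if pct >= cutoff}


def unique_alleles(primary, recurrent, cutoff=CUTOFF_PERCENT):
    """Split alleles into those passing the cutoff in only one of the two isolates."""
    p = alleles_passing(primary, cutoff)
    r = alleles_passing(recurrent, cutoff)
    return {"primary_only": p - r, "recurrent_only": r - p}


# Subject A, UL27 nucleotide 206: C (100%) >> T (52%), C (48%)
ul27 = unique_alleles({"C": 100}, {"T": 52, "C": 48})

# Subject A, UL13 nucleotide 569: C (86%), T (14%) >> C (98%), T (1%), A (1%)
ul13 = unique_alleles({"C": 86, "T": 14}, {"C": 98, "T": 1, "A": 1})

print(ul27)
print(ul13)
```

Under this rule, T at UL27 position 206 is classed as recurrent-only and T at UL13 position 569 as primary-only, while the 1% alleles fall below the cutoff and are ignored.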
3. Evolutionary relationships among low-passage strains from North America

[Figure: Bayesian Majority Rule consensus tree. Bayesian posterior probabilities (>90%), and Maximum Parsimony bootstrap (100 replicates; >75%) / Maximum Likelihood bootstrap (1,000 replicates; >75%), are represented above and below the branches, respectively. Legend: HSV-2 genomes obtained from Kolb et al. (2015); HSV-2 genomes obtained from Newman et al. (2015); HSV-2 genomes generated in this research. Strain labels: laboratory strain; South African; Scotland, high passage strain.]

4. Non-synonymous variants unique to the primary or recurrent infection in ≥10% of read depth

Subject A: primary (sample 08) >> recurrent (sample 14)

| Gene | Nucleotide / amino acid pos. | Nucleotide variant (primary >> recurrent) | Amino acid variant, primary infection | Amino acid variant, recurrent infection | Function |
|---|---|---|---|---|---|
| UL13 | 569 / 190 | C (86%), T (14%) >> C (98%), T (1%), A (1%) | A (86%), V (14%) | A (98%) | Protein kinase |
| UL13 | 685 / 229 | G (100%) >> G (90%), T (10%) | D (100%) | D (90%), Y (10%) | Protein kinase |
| UL14 | 334 / 112 | C (70%), T (30%) >> C (99%), A (1%) | R (70%), C (30%) | R (100%) | Tegument protein |
| UL27 | 206 / 69 | C (100%) >> T (52%), C (48%) | P (100%) | L (52%), P (48%) | Glycoprotein B |
| UL30 | 3398 / 1133 | C (98%), T (1%), A (1%) >> T (60%), C (40%) | P (98%) | L (60%), P (40%) | DNA polymerase |
| UL36 | 7942 / 2648 | G (100%) >> G (65%), A (35%) | A (100%) | A (65%), T (35%) | Tegument protein |
| UL37* | 1478 / 493 | A (7%), C (93%) >> A (99%), G (1%) | P (93%) | H (99%) | Tegument protein |

Subject B: primary (sample 16) >> recurrent (sample 19)

| Gene | Nucleotide / amino acid pos. | Nucleotide variant (primary >> recurrent) | Amino acid variant, primary infection | Amino acid variant, recurrent infection | Function |
|---|---|---|---|---|---|
| UL13 | 862 / 288 | C (86%), T (14%) >> C (98%), T (2%) | R (86%), STOP (14%) | R (98%) | Protein kinase |
| UL13 | 1133 / 378 | G (100%) >> G (86%), T (14%) | G (100%) | G (86%), V (14%) | Protein kinase |
| UL14 | 241 / 81 | G (99%), T (1%) >> G (79%), A (21%) | V (99%) | V (79%), M (21%) | Tegument protein |
| UL21 | 1039 / 347 | G (67%), A (33%) >> G (100%) | A (67%), T (33%) | A (100%) | Tegument protein |
| UL30 | 2852 / 951 | G (80%), A (20%) >> G (100%) | C (80%), Y (20%) | C (100%) | DNA polymerase |

Percentages give the share of mapped reads (Illumina read depth, Novoalign) supporting each polymorphism.
* Non-synonymous variant that produced structural modifications in the protein conformation when compared with the previously published crystal structure (Pitts et al., 2014; PDB: 4K70; residues: 24-570; 100% confidence).

5. Non-synonymous variant producing a change in protein conformation

Compared with previously crystallized proteins, structural modifications in the protein conformation were only observed in the tegument protein UL37 (H493P). pUL37 is essential for HSV replication, plays an important role in capsid trafficking, and interacts with the gK-UL20 protein complex to facilitate virion envelopment. Domain III is a highly conserved helical bundle, with a central helix (19) surrounded by six helices (16-22) (Pitts et al., 2014). Ninety-three percent of reads from the primary isolate (sample 08) specified proline at residue 493, whose replacement by histidine in the recurrent isolate (sample 14) disrupted the highly conserved helix 21.

[Figure: the red bold line indicates a non-synonymous variant that produced a change in protein conformation when compared with the crystal structure of the pseudorabies virus pUL37 N-terminal half (Pitts et al., 2014; PDB: 4K70; residues: 24-570; 100% confidence).]

6. Comparison of non-synonymous changes to the HG52 strain

[Figure: # non-synonymous changes, scale from - to +.]

ACKNOWLEDGEMENTS: We thank Paul Cliften for helpful advice and discussion and Juan A. Villa for helpful discussions about Phyre2 and PyMOL.
KEY PUBLICATIONS: Minaya et al. J. Virol. In Press; Kolb et al. J. Virol. 89: 6427-34, 2015; Newman et al. J. Virol. 89: 8219-32, 2015; and Pitts et al. J. Virol. 88: 5462-73, 2014.
FUNDING: This work was funded by DMID contract HHSN272200800003C to RBB. This presentation was made possible in part by Grant UL1 RR024992 from the NIH-National Center for Research Resources (NCRR), by the Pershing Trust, and institutional funds to LAM.

Conclusions

• Strong sequence homology was observed when comparing each subject's isolates from the primary and the fifth or sixth recurrent episode, suggesting strong selective pressure during reactivations of virus from its latent state within an individual host.
• Phylogenetic analyses demonstrate that the North American HSV-2 strains presented in this research cluster more closely with the HG52 laboratory strain from Scotland than with the low-passage clinical isolate SD90e from South Africa or laboratory strain 333. Thus, our sequences would make a logical choice as a reference strain for inclusion in future studies of European and North American HSV-2 isolates.
• UL3 (nuclear phosphoprotein), UL11 (tegument protein), UL39 (ribonucleotide reductase), and UL43 (envelope protein) are the HSV-2 genes with the highest number of non-synonymous changes compared to HG52. These regions may allow a variant to emerge that could, for example, evade host immune surveillance or adapt to a new host's genetic makeup.
• The H493P polymorphism is predicted to disorder a highly conserved portion of helix 21, which may have consequences for the function of pUL37.
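The nucleotide and amino acid positions reported in panel 4 are linked by codon arithmetic, which also shows which base of the codon each change affects. The sketch below assumes 1-based positions counted from the first base of each gene's coding sequence; `codon_coordinates` is a hypothetical helper name, and the position pairs are copied from panel 4.

```python
# Sketch only: assumes 1-based positions counted from the first base of each
# gene's coding sequence. `codon_coordinates` is a hypothetical helper name.


def codon_coordinates(nt_pos):
    """Map a 1-based nucleotide position to (amino acid position, base within codon)."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    aa_pos = (nt_pos - 1) // 3 + 1         # which codon the base belongs to
    base_in_codon = (nt_pos - 1) % 3 + 1   # 1, 2 or 3 within that codon
    return aa_pos, base_in_codon


# (gene, nucleotide position, amino acid position) as listed in panel 4
PANEL4_POSITIONS = [
    ("UL13", 569, 190), ("UL13", 685, 229), ("UL14", 334, 112),
    ("UL27", 206, 69), ("UL30", 3398, 1133), ("UL36", 7942, 2648),
    ("UL37", 1478, 493),                      # Subject A
    ("UL13", 862, 288), ("UL13", 1133, 378), ("UL14", 241, 81),
    ("UL21", 1039, 347), ("UL30", 2852, 951),  # Subject B
]

# Rows whose listed amino acid position disagrees with the codon arithmetic
mismatches = [
    (gene, nt, aa)
    for gene, nt, aa in PANEL4_POSITIONS
    if codon_coordinates(nt)[0] != aa
]

print(mismatches)
print(codon_coordinates(1478))
```

Under the stated assumption, every position pair in panel 4 is internally consistent, and nucleotide 1478 of UL37 is the second base of codon 493, the position at which a C/A difference separates a proline codon (CCN) from a histidine codon (CAC or CAT).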