Epistasis and Evolution BGGN223 Spring 2016 Sergey Kryazhimskiy
4 May 2016
Lecture outline
Part I. Epistasis as a tool in genetics
Part II. Evolutionary implications of epistasis
  1. Speciation (BDM model)
  2. Sex (Kondrashov’s hatchet)
  3. Fitness landscapes
     a. Protein fitness landscapes
     b. Whole-genome fitness landscapes
Part I. Epistasis as a tool in genetics
[...] different phenotypes from the wild type and from each other, and the double mutant phenotype looks like one of the phenotypes produced by a single mutation, we say that this mutation is epistatic to the other. In this example, tra-1- is epistatic to her-1-. As illustrated in Fig. 1, these results are explained by a model in which X chromosome dosage regulates her-1 activity, her-1 negatively regulates tra-1, and tra-1 is required to direct hermaphrodite development in place of the male ground state. An alternative model in which tra-1 regulates her-1 is inconsistent with the epistasis of tra-1- to her-1-.
Is a downstream mutation always epistatic to an upstream mutation? The answer is no. For example, consider a positive regulatory pathway, programmed cell death in C. elegans (Fig. 2). In this model, a signal present in cells that are fated to die turns on ced-3. In turn, ced-3 activates unknown genes that kill the cell, and a known gene, ced-1, that causes it to be engulfed by neighboring cells. In a ced-3- mutant none of these downstream genes is turned on, and the cell remains a normal, living cell. In a ced-1- mutant, ced-3 still causes [...] a ced-3- single mutant, since ce[...] without ced-[...]. Thus ced-3- is e[...]
Recap: Epistasis as a genetic tool
So, there's a problem. We s[...] used to figure out the order of [...] case the downstream gene is e[...] gene, and in another the upstr[...] the downstream gene. The p[...] more complicated if constitutiv[...]ered. How then can epistasis [...] in a regulatory pathway? The [...] rules that determine whether [...]stream gene will be epistatic. What are the assumptions b[...] determine experimentally wh[...] given problem? For a certain class of regul[...] answer these questions. These [...] hierarchies that are controlled [...] and that obey the conditions [...] determination, the signal is X [...] which can be deduced using [...]tations. In programmed cell d[...] known, but can [...] correlation with c[...] ligands, intracellu[...] DNA damage, ti[...] organism are sign[...] Null and constit[...] genes to fail to r[...] null mutant gene [...] mutant gene is [...]
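Sex determination in C. elegans
[Figure: X dosage regulates her-1, which negatively regulates tra-1. XO: her-1 ON, tra-1 OFF. XX: her-1 OFF, tra-1 ON. Genotypes are scored as male (M) or hermaphrodite (H).]
Avery, Wasserman, TIG 1992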
[...] targeted genetic interaction studies fail to uncover connections between dive[...] and result in a potentially biased view of the global topology of the genet[...]
Annu. Rev. Genom. Human Genet. 2013. 14:111-133.
Can we do this at genome scale?
• What mutations? Deletion collection
• What phenotype? Viability
[Figure: two workflows for building double mutants. SGA (Synthetic Genetic Array): MATα query mutation crossed to an array of MATa strains; array of double heterozygous diploids; sporulation on array; array of double mutants; colony size measurement. dSLAM: transformation of the query mutation into a pool of heterozygous a/α diploids, followed by pooled steps whose labels are cut off. Legend: wild-type alleles, deletion mutations.]
Baryshnikova et al, Annu Rev H Genet 2013
Epistasis at genome scale
[...] the functional relationships between genes and pathways. The SGA synthetic lethal data set was first imported into the Biomolecular Interaction Network Database (BIND) (19), then formatted with BIND tools (16) and exported to the Pajek package (20), a program originally designed for the graphical analysis of social interactions. The network shown in Fig. 3 contains the interactions observed for BNI1 and those for seven other query genes, BBC1 (MTI1), ARC40, ARP2, BIM1, NBP2, SGS1, and RAD27, as described below. The network contains 204 genes, represented as nodes on the graph, and 291 genetic interactions, represented as edges connecting the genes. To visualize subsets of functionally related genes, we color-coded the genes according to their YPD cellular roles and aligned them with one another on the basis of their roles and connectivity (16).
The function of the genes with unknown cellular roles (colored black) is predicted by the roles of surrounding genes that show a similar connectivity. If these interactions identify functionally related genes, then some of the uncharacterized genes from the bni1Δ screen should also participate in cortical actin assembly or spindle orientation. To test this, we conducted an SGA screen using a strain deleted for a previously uncharacterized gene, BBC1, which leads to a synthetic sick phenotype in combination with bni1Δ. We scored 17 potential synthetic lethal/sick interactions for bbc1Δ, most of which have YPD-classified cell polarity or cell structure (cytoskeletal) roles (Fig. 3). In particular, bbc1Δ showed interactions with several genes whose products control actin polymerization and localize to cortical actin patches (CAP1, CAP2, SAC6, and SLA1), suggesting that BBC1 may be involved in assembly of actin patches or their dependent processes. Further experiments demonstrated that Bbc1 localized predominantly to cortical actin patches and binds to Las17 (Bee1), a member of the WASp (Wiskott-Aldrich Syndrome protein) family proteins that controls the assembly of cortical actin patches through regulation of the Arp2/3 actin nucleation complex (21, 22). We next focused on ARC40 and ARP2, both of which encode subunits of the Arp2/3 complex (23), a major regulator of actin nucleation, the rate-limiting step for actin polymerization. Because ARC40 is an essential gene, we first isolated a temperature-sensitive conditional lethal allele, arc40-40, by polymerase chain reaction (PCR) mutagenesis and then conducted the screen at a tempera[...]
[Figure caption fragment: [...] contain two double-mutant spores; and parental [...] were micromanipulated onto distinct positions [...] to germinate to form a colony. bni1Δ bnr1Δ and [...]]
Tong et al, Science 2001
Epistasis for quantitative traits

Genotype   Trait value   Relative effect   Example 1   Example 2
AB         x0                              100%        100%
aB         x1            x1/x0             50%         50%
Ab         x2            x2/x0             50%         50%
ab         x12           x12/x0            100%        30%

Example 1: Interesting. Example 2: Maybe.

Expectation for trait value of allele b in background a: $x_1 \cdot \frac{x_2}{x_0}$
Measure of epistasis: $\varepsilon = \frac{x_{12}}{x_0} - \frac{x_1}{x_0} \cdot \frac{x_2}{x_0}$
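As a quick check of the two examples on this slide, the sketch below evaluates the epistasis measure, taken here as the double-mutant relative trait value minus the product of the two single-mutant relative values. The function name `epistasis` and the choice x0 = 1 are illustrative assumptions, not part of the lecture.

```python
def epistasis(x0, x1, x2, x12):
    """Epistasis as double-mutant relative value minus the product
    of the single-mutant relative values (multiplicative expectation)."""
    return x12 / x0 - (x1 / x0) * (x2 / x0)


# Trait values from the slide, with the wild type scaled to x0 = 1 (assumption).
example_1 = epistasis(x0=1.0, x1=0.5, x2=0.5, x12=1.0)
example_2 = epistasis(x0=1.0, x1=0.5, x2=0.5, x12=0.3)

print(f"Example 1: eps = {example_1:+.2f}")
print(f"Example 2: eps = {example_2:+.2f}")
```

Under these assumptions Example 1 gives a large positive value (0.75), while Example 2 sits close to the multiplicative expectation of 25%, which is why it only rates a "maybe".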
Does quantitative epistasis reveal anything interesting?
Computational model of S. cerevisiae metabolic network
• Delete each enzyme, compute growth rate
• Delete pairs of enzymes, compute growth rate
• Compute measure of epistasis
[Figure: histogram of the number of gene pairs versus epistasis, from –1 to 1]
Segre et al, Nat Genet 2005
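The three steps (single deletions, double deletions, epistasis) can be sketched on a toy model. The `growth_rate` function below is a hypothetical stand-in with made-up rules and enzyme names; it is not the metabolic model used by Segre et al.

```python
from itertools import combinations

ENZYMES = ["E1", "E2", "E3", "E4"]  # hypothetical enzyme names


def growth_rate(deleted):
    """Toy stand-in for a metabolic model (hypothetical rules)."""
    deleted = set(deleted)
    # Module 1: E1 and E2 are redundant; growth needs at least one of them.
    module_1 = 0.0 if {"E1", "E2"} <= deleted else 1.0
    # Module 2: E3 and E4 act in series; losing either one halves the output.
    module_2 = 0.5 if deleted & {"E3", "E4"} else 1.0
    return module_1 * module_2


def epistasis_map():
    """Epistasis for every enzyme pair, relative to wild-type growth."""
    w0 = growth_rate([])
    single = {e: growth_rate([e]) / w0 for e in ENZYMES}
    result = {}
    for a, b in combinations(ENZYMES, 2):
        double = growth_rate([a, b]) / w0
        result[(a, b)] = double - single[a] * single[b]
    return result


for pair, eps in epistasis_map().items():
    print(pair, f"{eps:+.2f}")
```

In this toy, the redundant pair shows strong negative (aggravating) epistasis, the serial pair shows positive (buffering) epistasis, and pairs drawn from different modules show none.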
Does quantitative epistasis reveal anything interesting?
Epistasis matrix
[Figure: epistasis matrix between functional groups, panels a and b]
This classification can be represented as a genetic network of buffering and aggravating links between genes. To understand the overall organization of the network, we started with a supervised analysis of the total number of buffering and aggravating interactions between groups of genes defined by preassigned functional annotation (ref. 23). Pairs of epistatically interacting genes were more likely to share the same annotation than would be expected by chance (21% relative to 10% expected for random pairs, P ≈ 10^–11; Fig. 2a). Much additional information on functional organization, however, can be extracted by looking for patterns in the [...]
Segre et al, Nat Genet 2005
Quantitative epistasis reveals metabolic modules
[Figure 4 of Segre et al, panels a and b: unsupervised organization of the metabolic network by the Prism algorithm. Genes are grouped into modules (enclosing boxes) connected by buffering links. Module labels include GLYC (glycolysis), PENT (pentose phosphate), STEROL (sterol biosynthesis), TCA, RESPIR, PYR, URA, COA, GLUCN, IDP, ATPs, LYSbs, PRObs, PROcat and TRPcat; unassigned genes are labeled U followed by a number.]
Segre et al, Nat Genet 2005
Can we do this experimentally?
The Genetic Landscape of a Cell
Michael Costanzo, Anastasia Baryshnikova, Jeremy Bellay, Yungil Kim, Eric D. Spear, Carolyn S. Sevier, Huiming Ding, Judice L.Y. Koh, Kiana Toufighi, Sara Mostafavi, Jeany Prinz, Robert P. St. Onge, Benjamin VanderSluis, Taras Makhnevych, Franco J. Vizeacoumar, Solmaz Alizadeh, Sondra Bahr, Renee L. Brost, Yiqun Chen, Murat Cokol, Raamesh Deshpande, Zhijian Li, Zhen-Yuan Lin, Wendy Liang, Michaela Marback, Jadine Paw, Bryan-Joseph San Luis, Ermira Shuteriqi, Amy Hin Yan Tong, Nydia van Dyk, Iain M. Wallace, Joseph A. Whitney, Matthew T. Weirauch, Guoqing Zhong, Hongwei Zhu, Walid A. Houry, Michael Brudno, Sasan Ragibizadeh, Balázs Papp, Csaba Pál, Frederick P. Roth, Guri Giaever, Corey Nislow, Olga G. Troyanskaya, Howard Bussey, Gary D. Bader, Anne-Claude Gingras, Quaid D. Morris, Philip M. Kim, Chris A. Kaiser, Chad L. Myers, Brenda J. Andrews, Charles Boone
A genome-scale genetic interaction map was constructed by examining 5.4 million gene-gene pairs for synthetic genetic interactions, generating quantitative genetic interaction profiles for ~75% of all genes in the budding yeast, Saccharomyces cerevisiae. A network based on genetic interaction profiles reveals a functional map of the cell in which genes of similar biological processes cluster together in coherent subsets, and highly correlated profiles delineate specific pathways to define gene function. The global network identifies functional cross-connections between all bioprocesses, mapping a cellular wiring diagram of pleiotropy. Genetic interaction degree correlated with a number of different gene attributes, which may be informative about genetic network hubs in other organisms. We also demonstrate that extensive and unbiased mapping of the genetic landscape provides a key for interpretation of chemical-genetic interactions and drug target identification.
The relation between an organism's genotype and its phenotype are governed by myriad genetic interactions (1). Although a complex genetic landscape has long been anticipated (2), exploration of genetic interactions on a genome-wide level has been limited. Systematic deletion analysis in the budding yeast, Saccharomyces cerevisiae, demonstrated that the majority of its ~6000 genes are individually dispensable, with only a relative[...]
Costanzo et al, Science 2010
~56,250 96-well plates, ~16 freezers
Major data analysis issues
Genome-wide epistasis in yeast
Fig. 1. A correlation-based network connecting genes with similar genetic interaction profiles. Genetic profile similarities were measured for all gene pairs by computing Pearson correlation coefficients (PCCs) from the complete genetic interaction matrix. Gene pairs whose profile similarity exceeded a PCC > 0.2 threshold were connected in the network and laid out using an edge-weighted, spring-embedded, network layout algorithm (7, 8). Genes sharing similar patterns of genetic interactions are proximal to each other; less-similar genes are positioned farther apart. Colored regions indicate sets of genes enriched for GO biological processes summarized by the indicated terms.
[Figure, panel A: global network. Region labels include Ribosome & translation; Autophagy; RNA processing; Amino acid biosynthesis & uptake; Chromatin & transcription; Mitochondria; Peroxisome; Metabolism & amino acid biosynthesis; Endosome & vacuole sorting; Cell polarity & morphogenesis; Secretion & vesicle transport; tRNA modification; Protein folding & glycosylation; Cell wall biosynthesis & integrity; Nuclear-cytoplasmic transport; Nuclear migration; Signaling; Protein degradation; ER-dependent protein degradation; Mitosis & chr. segregation; ER/Golgi; DNA replication & repair; Membrane trafficking & fusion.]
[Figure, panels B to F: subnetworks. Labels include Glutamate biosynthesis; Homoserine, chorismate & serine biosynthesis; COPI coatomer complex; Conserved Oligomeric Golgi (COG) complex; HOPS/CORVET; Retromer complex; GET pathway; Elongator complex; Urmylation pathway; Polarisome; Septin complex; Cell polarity establishment/maintenance; Gap1 sorting pathway (WT, gtr1∆, par32∆, ecm30∆, ubp15∆).]
Costanzo et al, Science 2010
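The network construction described in the Fig. 1 caption (Pearson correlation between interaction profiles, then an edge for every pair above PCC > 0.2) can be sketched in a few lines. The gene names and scores below are invented for illustration, and the layout step is omitted.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)


# Hypothetical interaction profiles: one score per partner gene.
profiles = {
    "geneA": [-0.8, -0.6, 0.1, 0.0, 0.3],
    "geneB": [-0.7, -0.5, 0.0, 0.1, 0.2],
    "geneC": [0.2, 0.1, -0.9, -0.4, 0.0],
}


def similarity_edges(profiles, threshold=0.2):
    """Connect gene pairs whose profile correlation exceeds the threshold."""
    genes = sorted(profiles)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if r > threshold:
                edges.append((g, h, r))
    return edges


edges = similarity_edges(profiles)
print(edges)
```

With these made-up profiles only geneA and geneB end up connected, since geneC's profile is negatively correlated with both.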
Part I conclusions
• Qualitative epistasis is informative about functional relationships
• Quantitative epistasis can be informative
• Genome-wide patterns of epistasis
  • in theory can reveal organizational principles of the organism
  • in practice (so far) reveal primarily physical interactions between gene products
Part II. Evolutionary implications of epistasis
Epistasis determines evolutionary trajectories and outcomes
[...] or more distinct peaks. The presence of multiple peaks indicates reciprocal sign epistasis, and may cause severe frustration of evolution (Fig. 1b). Indeed, reciprocal sign epistasis is a necessary condition for multiple peaks, although it does not guarantee it: the two optima in the diagram may be connected by a fitness-increasing path involving mutations in a third site.
[Figure: phenotype or fitness of genotypes ab, aB, Ab and AB in four cases: no epistasis, magnitude epistasis, sign epistasis, reciprocal sign epistasis]
Poelwijk et al, Nature 2007
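The four cases (no, magnitude, sign, and reciprocal sign epistasis) can be told apart from the fitness values of the four genotypes. The sketch below assumes effects are compared on an additive scale (for example, log fitness); the function name and tolerance are illustrative choices.

```python
def classify_epistasis(w_ab, w_Ab, w_aB, w_AB, tol=1e-9):
    """Classify a two-locus landscape from four fitness values.

    Assumes effects are compared on an additive scale (e.g. log fitness).
    """
    # Effect of a -> A in each background of the other locus.
    dA_in_b = w_Ab - w_ab
    dA_in_B = w_AB - w_aB
    # Effect of b -> B in each background of the other locus.
    dB_in_a = w_aB - w_ab
    dB_in_A = w_AB - w_Ab

    if abs(dA_in_B - dA_in_b) <= tol:
        return "no epistasis"
    sign_flip_A = dA_in_b * dA_in_B < 0
    sign_flip_B = dB_in_a * dB_in_A < 0
    if sign_flip_A and sign_flip_B:
        return "reciprocal sign epistasis"
    if sign_flip_A or sign_flip_B:
        return "sign epistasis"
    return "magnitude epistasis"


print(classify_epistasis(1.0, 2.0, 2.0, 3.0))
print(classify_epistasis(1.0, 2.0, 2.0, 5.0))
print(classify_epistasis(1.0, 2.0, 0.5, 3.0))
print(classify_epistasis(1.0, 0.5, 0.5, 2.0))
```

The last call describes two peaks (ab and AB) separated by less fit single mutants, the situation the excerpt links to multiple peaks.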
Evolution of reproductive barriers
Hybrids are often sterile or inviable

One locus model
Species 1    Hybrid          Species 2
aa           aA              AA
fit          unfit           fit
ancestral    intermediate    derived
How can natural selection produce AA from aa if aA is unfit?
Bateson-Dobzhansky-Muller model
Two loci model (epistasis)
Ancestor: aabb
Species 1: aabb → aAbb → AAbb
Species 2: aabb → aabB → aaBB
Hybrid AaBb inviable/sterile if A incompatible with B
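A minimal sketch of why the two-locus model avoids the one-locus problem: each lineage fixes its own derived allele through fit intermediates, and only the hybrid combines A with B. The fitness rule `is_fit` is an assumed, all-or-nothing incompatibility chosen for illustration.

```python
def is_fit(genotype):
    """Assumed rule: a genotype is unfit only if it carries both A and B."""
    return not ("A" in genotype and "B" in genotype)


# Substitution paths from the shared ancestor aabb.
lineage_1 = ["aabb", "aAbb", "AAbb"]
lineage_2 = ["aabb", "aabB", "aaBB"]
hybrid = "AaBb"

print("Lineage 1 passes only through fit genotypes:", all(map(is_fit, lineage_1)))
print("Lineage 2 passes only through fit genotypes:", all(map(is_fit, lineage_2)))
print("Hybrid is fit:", is_fit(hybrid))
```

Neither lineage ever crosses an unfit genotype, so selection does not oppose the divergence, yet the hybrid is the first genotype in which A and B meet.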
Bateson-Dobzhansky-Muller model
Monkey flower Mimulus guttatus
[Figure: genotype diagram. California population: copper tolerant (Sl Sl T T). Great Britain population: non-tolerant. Ancestor: non-tolerant (tt aa). Hybrid: inviable.]
MacNair, Christie, Heredity 1983
Paradox of sex
Costs of sex
• Time and energy to find mates
• Risk of not finding mates
• Risk of disease transmission
• Offspring could be less fit than parents
• 2-fold cost of sex: mutation causing asexual reproduction would double fitness
If sex is so costly, why do so many organisms (eukaryotes) reproduce sexually?
Deterministic mutation hypothesis
Asexual population
[Figure: fitness (0 to 1) and fraction of population (up to 100%) versus # of mutations (0 to 6); arrows labeled Mutation and Mutational pressure]
Kondrashov, Genet Res 1982
Deterministic mutation hypothesis
Sexual population
[Figure: fitness (0 to 1) and fraction of population (up to 100%) versus # of mutations (0 to 6); arrow labeled Recombination]
Kondrashov, Genet Res 1982
Deterministic mutation hypothesis
Sexually reproducing subpopulation would spread in an asexual population
[Figure: fitness (0 to 1) versus # of mutations (0 to 6); arrows labeled Mutational pressure and Recombination pressure]
Works when epistasis is negative
Kondrashov, Genet Res 1982
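To make "negative epistasis" among deleterious mutations concrete, the sketch below compares a fitness function without epistasis to one in which each additional mutation costs more than the previous one. The quadratic form of the log fitness and the parameter values are assumptions chosen for illustration; they are not taken from Kondrashov's paper.

```python
def log_fitness(k, s=0.05, eps=0.0):
    """ln w(k) for k deleterious mutations.

    Assumed form: ln w = -s*k - eps*k*(k-1)/2.
    eps = 0 gives independent effects; eps > 0 gives negative epistasis.
    """
    return -s * k - eps * k * (k - 1) / 2


def marginal_cost(k, s=0.05, eps=0.0):
    """Drop in log fitness caused by adding mutation number k+1."""
    return log_fitness(k, s, eps) - log_fitness(k + 1, s, eps)


for k in range(6):
    print(k,
          f"no epistasis: {marginal_cost(k):.3f}",
          f"negative epistasis: {marginal_cost(k, eps=0.02):.3f}")
```

Without epistasis the cost per mutation stays constant; with eps > 0 it grows with the number of mutations already present, so genomes loaded with many mutations are removed disproportionately.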
Is epistasis between deleterious mutations predominantly negative?
[Figure: histogram of the epistatic effect ε (1/h), from –0.4 to 0.4]
Figure 3 Frequency distribution of the epistatic effect ε. Its mean value, 0.024, is significantly higher than zero (t = 5.697, n = 639, two-tail P = 1.864 × 10^–8). The distribution is slightly skewed to the right (g1 = 0.282, t = 2.917, P = 0.0037) and leptokurtic (g2 = 2.803, t = 14.518, P < 0.0001).
Jasnos, Korona, Nat Rev Genet 2007
Apparently not, at least between knockout mutations in yeast
Costanzo et al, Science 2010
Epistasis can cause problems for adaptation
[Figure: fitness landscapes for two loci (reciprocal sign epistasis), five loci, and many loci (schematic)]
Wright, Proc 6th Int Congress Genet 1932
Epistasis can cause problems for adaptation
• Can populations move from one peak to another?
• Can populations efficiently find fit genotypes?
• How much historical contingency is there in evolution?
• Is evolution predictable?
[Figure: many loci (schematic)]
Wright, Proc 6th Int Congress Genet 1932
What properties do fitness landscapes in nature have?
A. Single protein
B. Whole genome
Single protein fitness landscapes
Fitness landscape of beta-lactamase
[...] pleiotropy represents the mechanistic basis of sign epistasis. Seen as an analysis of clinical cefotaxime resistance evolution, our treatment makes several simplifying assumptions about the mutational and selective processes. For example, we have disregarded horizontal gene transfer and have limited attention to only five mutations. Furthermore we have assumed that selection acts only to increase resistance to cefotaxime, whereas [...]
[...] a set of point mutations known jointly to increase organismal fitness, how does Darwinian selection regard the many mutational trajectories available? The foregoing limitations notwithstanding, the implications of our study for this broader question are clear: When selection acts on TEM wt to increase cefotaxime resistance, only a very small fraction of trajectories to TEM* are likely to be realized, owing to sign epistasis mediated by intramolecular pleiotropic [...]
Fig. 2. Mutational composition of the 10 most probable trajectories from TEM wt to TEM*. Nodes represent alleles whose identities are given by a string of five + or – symbols corresponding (left to right) to the presence or absence of mutations g4205a, A42G, E104K, M182T, and G238S, respectively. Numbers indicate cefotaxime resistance (12) in μg/ml. Edges represent mutations, as labeled. The relative probability of each beneficial mutation is represented on a log scale by color [...]
Weinreich et al, Science 2006
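With five mutations there are 5! = 120 possible orders in which they can accumulate. The sketch below counts how many of those orders increase resistance at every step. The `resistance` function is a hypothetical toy landscape with a single sign-epistatic interaction; it does not use the measured TEM values.

```python
from itertools import permutations

N_MUTATIONS = 5


def resistance(genotype):
    """Hypothetical resistance of a genotype given as a tuple of 0/1 flags.

    Mutations 1 to 4 each add +1. Mutation 0 adds +1 only when mutation 4
    is present and otherwise subtracts 1 (sign epistasis).
    """
    value = sum(genotype[1:])
    if genotype[0]:
        value += 1 if genotype[4] else -1
    return value


def is_accessible(order):
    """True if every step along the mutational order increases resistance."""
    genotype = [0] * N_MUTATIONS
    current = resistance(tuple(genotype))
    for mutation in order:
        genotype[mutation] = 1
        new = resistance(tuple(genotype))
        if new <= current:
            return False
        current = new
    return True


orders = list(permutations(range(N_MUTATIONS)))
accessible = [order for order in orders if is_accessible(order)]
print(len(accessible), "of", len(orders), "trajectories are accessible")
```

Even one sign-epistatic pair rules out every order in which mutation 0 arrives before mutation 4, which is half of all trajectories in this toy.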
Thermodynamic stability and fitness
$$P_{\mathrm{folded}} = \frac{1}{1 + e^{\Delta G / RT}}$$
∆G between –10 and –3 kcal/mol
[Figure: P_folded (0 to 1) versus ∆G (–5 to 3)]
Fitness ~ a [E]active = a [E]total Pfolded(∆G)
Wikipedia
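The two-state folding formula is easy to evaluate numerically. The sketch below assumes T = 298 K and ∆G in kcal/mol, and steps ∆G upward in increments of 2 kcal/mol to show how equal stability changes translate into unequal changes in the folded fraction. The starting value of –5 kcal/mol and the step size are illustrative choices.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def p_folded(delta_g, temperature=298.0):
    """Fraction of folded protein for folding free energy delta_g (kcal/mol)."""
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temperature)))


# Start from a stable protein and add destabilizing steps of +2 kcal/mol each.
stabilities = [-5.0, -3.0, -1.0, 1.0]
fractions = [p_folded(dg) for dg in stabilities]

for dg, p in zip(stabilities, fractions):
    print(f"dG = {dg:+.1f} kcal/mol  P_folded = {p:.4f}")
```

The first +2 kcal/mol step barely changes the folded fraction, while the same step applied to an already marginal protein removes most of it. Additive effects on ∆G therefore produce epistasis at the level of fitness.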
Effects of mutations on stability
• Most mutations affect ∆G
• Most mutations are destabilizing, ∆∆G > 0
• ⟨∆∆G⟩ ≈ +2 kcal/mol
• Effects of mutations on ∆G are additive
• Mutations in active site are almost always destabilizing
Tokuriki, Tawfik, Curr Op Struct Biol 2009
“Universal” distribution of stability effects of mutations
Stability threshold model
[Figure: P_folded (0 to 1) versus ∆G (–5 to 3), with an optimum marked ∆Gopt. Labels: Unfolding, Aggregation, Degradation; ↓ Dynamics, ↓ Activity, ↓ Regulation.]
Stability-based model of protein evolution
[Figure: stability versus time for numbered mutational steps, classed as advantageous, neutral or deleterious, with events labeled Functional replacement and Compensation. Caption begins: Figure 1 | The relationship between protein stability, [...]]
DePristo et al, Nat Rev Genet 2005
Whole-genome fitness landscapes
How do we study whole-genome fitness landscapes?
• Measure effects of many mutations (e.g., deletion collections)
• Compare genomes of sister species (e.g., human and chimp)
• Find segregating variants within populations
• Observe forward adaptation in the lab
The basic evolution experiment
[Figure: Saccharomyces cerevisiae cultures pass through repeated cycles of growth and transfer; samples are stored at –80°C along the way]
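Measure fitness by direct competition with ancestor
[Figure: an evolved strain is mixed with a labeled ancestor, and the evolved-to-ancestor ratio is measured at two time points (Ratio 1, Ratio 2)]
Fitness ∝ ln(Ratio 2) – ln(Ratio 1)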
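The competition assay reduces to a short calculation. The sketch below assumes the proportionality constant is one over the number of generations between the two measurements, which gives a per-generation selection coefficient; the function name and the cell counts are invented for illustration.

```python
import math


def relative_fitness(evolved_1, ancestor_1, evolved_2, ancestor_2, generations=1.0):
    """Fitness advantage of the evolved strain from two ratio measurements.

    Assumes the proportionality constant is 1 / generations.
    """
    ratio_1 = evolved_1 / ancestor_1
    ratio_2 = evolved_2 / ancestor_2
    return (math.log(ratio_2) - math.log(ratio_1)) / generations


# Hypothetical counts: a 1:1 mix becomes 4:1 after 10 generations.
s = relative_fitness(evolved_1=500, ancestor_1=500,
                     evolved_2=800, ancestor_2=200,
                     generations=10)
print(f"Fitness advantage per generation: {s:.4f}")
```

Using the log of the ratio means the estimate does not depend on the starting frequency of the evolved strain, only on how the ratio changes.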
Identify adaptive mutations by parallel evolution in replicate lines
[Figure: mutated genes in three replicate lines (Evolved 1, Evolved 2, Evolved 3). Gene labels: YMR1, YDR222W, OSW5, ATG8, AMD1, VID30, ZWF1, ECM21, YPR1, TRP1, FLR1, GAL11, QRI1, TDA9, SMF1, PSE1, ENP1, PUF6, STE11, PRT1, FKS1, WHI2, TOP1, EDE1, HIP1, MKK2, ROT2, ACE2, YOR389W, IRA1, SEC6, LAG2, STE4, MGA1, STE6, SET2, HFD1, EOS1.]
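The logic of using parallelism to flag candidate adaptive mutations can be sketched as a simple count of how many replicate lines carry a mutation in each gene. The gene names and assignments below are placeholders, since the figure's assignment of genes to lines is not recoverable here.

```python
from collections import Counter

# Placeholder data: genes mutated in each replicate line (hypothetical).
mutated_genes = {
    "Evolved 1": {"gene1", "gene2", "gene3"},
    "Evolved 2": {"gene2", "gene4"},
    "Evolved 3": {"gene2", "gene3", "gene5"},
}


def parallel_hits(lines, min_lines=2):
    """Return genes mutated in at least min_lines independent lines."""
    counts = Counter(gene for genes in lines.values() for gene in genes)
    return {gene: n for gene, n in counts.items() if n >= min_lines}


hits = parallel_hits(mutated_genes)
print(hits)
```

A real analysis would also compare the counts against the number of hits expected by chance given gene length and mutation rate, which this sketch does not attempt.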
LTEE = long-term evolution experiment in E. coli
Richard Lenski
12 lines started in 1988
> 60,000 generations in minimal media
Adaptive trajectories in lab are highly stereotypical
[Figure panels: Thermal adaptation; LTEE]
Good et al, Genet 2015
Tenaillon et al, Science 2012
[...] (14). This exercise indicates that we are far from detecting all possible beneficial mutations (Fig. 2B). However, the discovery of affected genes, operons, and functional units was nearly saturated, which suggested that fewer replicates may have recovered the major targets of selection.
Can we explain the shape of the typical fitness trajectory?
LTEE
Experimental approach: Do we see epistasis?
Theoretical approach: Is epistasis necessary?
[...] where ⟨s1⟩ and ⟨t1⟩ are the beneficial effect and fixation time, respectively, for the first fixed mutation. Comparing this formula with the power law, g = 1/2a. The value of g estimated for the six populations that retained the low ancestral mutation rate throughout 50,000 generations is 6.0 (95% confidence interval 5.3 to 6.9). In the LTEE, the beneficial effect of the first fixation, ⟨s1⟩, is typically ~0.1 (1, 9, 10). It follows that the dis[...]
[...] events (13). Beneficial mutations of advantage s are exponentially distributed with probability density αe^(–αs), where 1/α is the mean advantage. This distribution is for mathematical convenience; the theory of clonal interference is robust to the form of the distribution (13). We assume that deleterious mutations do not appreciably affect the dynamics; deleterious mutations occur at a higher rate than beneficial mutations, but the resulting [...]
Wiser et al, Science 2013
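The relation g = 1/2a links the diminishing-returns parameter g to the exponent of the power law. The sketch below assumes the power law has the form w(t) = (bt + 1)^a, which the excerpt does not spell out, and uses a made-up value of b purely to show the decelerating shape.

```python
def power_law_fitness(t, a, b):
    """Mean fitness after t generations, assuming w(t) = (b*t + 1)**a."""
    return (b * t + 1.0) ** a


g = 6.0               # estimate quoted in the excerpt
a = 1.0 / (2.0 * g)   # from g = 1/(2a)
b = 5e-3              # hypothetical rate constant, for illustration only

for t in (0, 10_000, 20_000, 50_000):
    print(t, round(power_law_fitness(t, a, b), 3))
```

Under this form fitness keeps rising without an upper bound, but each successive block of generations adds less than the one before.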
tion had the opposite effect. These data support butes to declining rates of adaptation over time. de study, in contrast to its prevalence in an earlier
amine the repeatability of adaptive (mean =evolution, –0.014, t10 = –3.942, P = 0.003), and subsequent studies have documented whereas it was many significantly positive among those with the evolved allele (mean = 0.015, t14 = 4.913, examples of both phenotypic and genetic paralP < 0.001) (fig. S4). lelism (24, 27–30). Nevertheless, replicate pop-Thus, the pykF mutation enhances fitness through its epistatic interactions ulations have diverged in otherwith phenotypic and mutations. However, the the other evolved genetic traits (24, 31, 32). A striking examplewith of ancestral and evolved pykF sets of genotypes alleles exhibit negative correlations between divergence is the ability to grow onboth citrate that relative epistasis and evolved in only 1 of the 12 populations (32). Asideexpected fitness values (ancestral pykF: r = –0.923; evolved pykF: r = –0.610) from that case, all of the populations a strong (Fig. show 3 and fig. S5). As a consequence, the slope tendency toward decelerating rates of fitness increase (27, 28). Fig. 4. Relation bemarginal Whole-genome sequencingtween of atheclone thatfitness effect of adding a was isolated after 20,000 generations from one particular mutation and of the populations (designated the Ara-1) fitnessidentified of the pro● ● ● genitor background 45 mutational differences from the ancestor (24).to which waspopulaadded for Many other mutations appeared in itthe each one of five focal tion, of course, but most were eliminated by ranmutations. Each panel dom drift or negative selection.includes Otherthe beneficial Pearson cor● relationthat coefficient and ● mutations also arose, including some reached ● its significance. The open ● detectable frequencies, but these were by symbols showlost the effects ● interference from superior beneficial of adding mutations each focal mutationbecause to the ancestral (24, 26) and, in at least one case, they strain. were less able to evolve than the eventual winners (33). 
Here we focus on the first five mutations that fixed in this population and whose ● ● spread coincided with the period of fastest adaptation. These mutations together produced ● ● ●
negative relation between expected fitness and epistatic deviations, although the details of this relation also clearly depend on the particular beneficial mutations involved. We also examined the relation between fitness and epistasis by arranging the 32 genotypes into 16 pairs, such that each pair differed only by the presence or absence of a particular mutation. This pairing allowed us to quantify how the marginal fitness effect of each mutation varied with the fitness of the progenitor background in
Figure (Fig. 4, scatter panels): panel titles +topA, +spoT, +glmUS, +pykF. Correlations printed in the panels, in extraction order: r = −0.256, P = 0.339; r = −0.586, P = 0.017; r = −0.502, P = 0.048; r = −0.499, P = 0.049.
Fig. 1. Mutational network connecting constructed genotypes. Each node represents one of 32 possible combinations of five mutations. Anc indicates the ancestral strain. Other labels indicate mutations affecting these genes: r, rbs; t, topA; s, spoT; g, glmUS; and p, pykF. Node colors and sizes reflect the
Khan et al, Science 2011
First 5 mutations in Ara–1 line:
• rbs operon
• topA
• spoT
• glmUS promoter
• pykF
Figure (Fig. 4 panel): title +rbs; axis label: fitness change.
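The mutational network on this slide can be reproduced in a few lines. The sketch below enumerates all 32 presence/absence combinations of the five mutations and, for each focal mutation, the 16 pairs of genotypes that differ only by that mutation. The function names are illustrative, not taken from Khan et al.

```python
from itertools import product

# The five mutations named on the slide, in the order listed there.
MUTATIONS = ["rbs", "topA", "spoT", "glmUS", "pykF"]


def all_genotypes(mutations):
    """Return every presence/absence combination as a frozenset of mutation names."""
    genotypes = []
    for flags in product([False, True], repeat=len(mutations)):
        genotypes.append(
            frozenset(m for m, present in zip(mutations, flags) if present)
        )
    return genotypes


def background_pairs(genotypes, focal):
    """Pair each genotype lacking `focal` with the same genotype plus `focal`."""
    return [(g, g | {focal}) for g in genotypes if focal not in g]


genotypes = all_genotypes(MUTATIONS)
pairs = {m: background_pairs(genotypes, m) for m in MUTATIONS}

print(len(genotypes))       # 32 genotypes, the ancestor (empty set) included
print(len(pairs["pykF"]))   # 16 backgrounds on which pykF can be added
```

Each pair gives one measurement of a mutation's marginal fitness effect on one background, which is what the Fig. 4 panels plot.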
A common observation in microbial evolution experiments is that the rate of fitness increase tends to decelerate over time (21–25). Negative epistasis, in which the combined effect of beneficial mutations is smaller than would be expected from their separate effects, could explain that tendency. However, such deceleration might instead occur simply because beneficial mutations of large effect will tend to be incorporated earlier owing to their faster spread and greater success in the face of competing beneficial mutations (26), and this explanation does not require epistasis. The capacity to sequence experimentally evolved genomes, as well as to enumerate beneficial mutations over time, adds another dimension to evolutionary dynamics that can inform efforts to understand the role of epistasis in adaptive evolution (24). Kryazhimskiy et al. (4) recently proposed that trajectories for fitness and accumulated beneficial mutations could be jointly analyzed to infer the nature of epistasis among the beneficial mutations. By analyzing these com-
“Diminishing returns epistasis” between most adaptive mutations
Figure (Fig. 4 panel): r = 0.652, P = 0.006; x-axis: relative fitness of progenitor (ticks 1.0 to 1.3).
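Diminishing returns shows up as a negative correlation between a mutation's marginal fitness effect and the fitness of the background it is added to. The sketch below shows how such a correlation could be computed. The fitness values are invented for illustration and are not data from any of the cited papers; the function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def marginal_effects(pairs):
    """Split (background fitness, fitness with mutation) pairs into
    background fitnesses and the fitness change caused by the mutation."""
    backgrounds = [w0 for w0, _ in pairs]
    effects = [w1 - w0 for w0, w1 in pairs]
    return backgrounds, effects


# HYPOTHETICAL relative fitness values, invented for illustration only.
toy_pairs = [(1.00, 1.10), (1.05, 1.14), (1.10, 1.17), (1.20, 1.25), (1.25, 1.28)]

backgrounds, effects = marginal_effects(toy_pairs)
r = pearson_r(backgrounds, effects)
print(f"r = {r:.3f}")  # negative r: the benefit shrinks on fitter backgrounds
```

A positive r would indicate the opposite pattern, a mutation that is more beneficial on fitter backgrounds.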
“Diminishing returns epistasis” observed in other systems
Figure: fitness effect of knock-out (%) versus fitness of background strain (%), for the knock-outs gat2∆, whi2∆, sfl1∆ and ho∆.
Kryazhimskiy et al, Science 2014
Theoretical approach: the distribution of fitness effects
Figure: distribution of fitness effects (DFE), number of mutations versus fitness effect of mutation, with the positions of pykF, spoT and topA marked.
If two genotypes have identical DFEs they will adapt at the same rate
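One way to see why the DFE determines the rate of adaptation is a minimal sketch under a strong-selection, weak-mutation assumption: beneficial mutations arise at rate N·U per generation and each fixes with probability of about 2s. This model is an assumption chosen for illustration, not the one used in the lecture, and all parameter values and DFEs below are hypothetical.

```python
def adaptation_rate(dfe, pop_size, mutation_rate):
    """Expected fitness gain per generation.

    `dfe` maps a beneficial fitness effect s to the fraction of beneficial
    mutations that have that effect.
    ASSUMPTION: mutations fix one at a time, each with probability ~2s.
    """
    supply = pop_size * mutation_rate  # new beneficial mutants per generation
    return supply * sum(frac * (2 * s) * s for s, frac in dfe.items())


# HYPOTHETICAL parameters and DFEs, invented for illustration only.
N, U = 1e6, 1e-8
dfe_a = {0.02: 0.70, 0.05: 0.25, 0.10: 0.05}
dfe_b = dict(dfe_a)                              # a second genotype, same DFE
dfe_shifted = {0.01: 0.70, 0.025: 0.25, 0.05: 0.05}  # every effect halved

rate_a = adaptation_rate(dfe_a, N, U)
rate_b = adaptation_rate(dfe_b, N, U)
rate_shifted = adaptation_rate(dfe_shifted, N, U)

print(rate_a == rate_b)       # identical DFEs give identical rates
print(rate_shifted < rate_a)  # a DFE shifted toward smaller effects adapts more slowly
```

In this sketch the rate depends on the genotype only through its DFE, so two genotypes can differ in rate only if their DFEs differ.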
Decline in rate of adaptation implies “macroscopic epistasis”
Figure: two DFEs (number of mutations versus fitness effect of mutation), labeled Ancestor and Evolved for 1,000 gen, with the positions of pykF, spoT and topA marked.
Measuring DFE directly is extremely difficult
Allele-swapping experiments show “microscopic epistasis”
Figure: two DFEs (number of mutations versus fitness effect of mutation), labeled Ancestor and Ancestor + topA, with the positions of pykF, spoT and topA marked.
Microscopic epistasis does not imply macroscopic epistasis
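A toy case makes the distinction concrete. In the sketch below, individual mutations change their effects between two backgrounds (microscopic epistasis), yet the distribution of effects as a whole is unchanged (no macroscopic epistasis). The fitness effects and the name `mutation_x` are hypothetical, as are the function names.

```python
def has_microscopic_epistasis(effects_a, effects_b, tol=1e-9):
    """True if any mutation shared by both backgrounds has a different effect in each."""
    shared = effects_a.keys() & effects_b.keys()
    return any(abs(effects_a[m] - effects_b[m]) > tol for m in shared)


def has_macroscopic_epistasis(effects_a, effects_b, tol=1e-9):
    """True if the two DFEs differ once mutation identities are ignored."""
    dfe_a = sorted(effects_a.values())
    dfe_b = sorted(effects_b.values())
    if len(dfe_a) != len(dfe_b):
        return True
    return any(abs(x - y) > tol for x, y in zip(dfe_a, dfe_b))


# HYPOTHETICAL fitness effects of the same mutations on two backgrounds.
on_background_1 = {"pykF": 0.10, "spoT": 0.06, "mutation_x": 0.03}
on_background_2 = {"pykF": 0.06, "spoT": 0.10, "mutation_x": 0.03}

print(has_microscopic_epistasis(on_background_1, on_background_2))  # True
print(has_macroscopic_epistasis(on_background_1, on_background_2))  # False
```

Here pykF and spoT swap effects, so the set of available effect sizes is the same on both backgrounds even though each mutation's own effect depends on the background.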
Conclusions
LTEE
• First few adaptive mutations exhibit primarily diminishing returns epistasis (microscopic)
• Adaptation decelerated due to changes in DFE (macroscopic epistasis)
• Does observed microscopic epistasis account for all changes in DFE? Unclear
Wiser et al, Science 2013; Good et al, Genetics 2015
Strange (and interesting) things happen in evolution experiments
Before generation 33,000
After generation 33,000