Note: Network random walk model of two-state protein folding: Test of the theory Alexander M. Berezhkovskii, Ronan D. Murphy, and Nicolae-Viorel Buchete Citation: J. Chem. Phys. 138, 036101 (2013); doi: 10.1063/1.4776215 View online: http://dx.doi.org/10.1063/1.4776215 View Table of Contents: http://jcp.aip.org/resource/1/JCPSA6/v138/i3 Published by the American Institute of Physics.

Additional information on J. Chem. Phys. Journal Homepage: http://jcp.aip.org/ Journal Information: http://jcp.aip.org/about/about_the_journal Top downloads: http://jcp.aip.org/features/most_downloaded Information for Authors: http://jcp.aip.org/authors

THE JOURNAL OF CHEMICAL PHYSICS 138, 036101 (2013)

Note: Network random walk model of two-state protein folding: Test of the theory Alexander M. Berezhkovskii,1 Ronan D. Murphy,2,3 and Nicolae-Viorel Buchete2,3,a) 1

Mathematical and Statistical Computing Laboratory, Division of Computational Bioscience, Center for Information Technology, National Institutes of Health, Bethesda, Maryland 20892, USA 2 School of Physics, University College Dublin, Belfield, Dublin 4, Ireland 3 Complex and Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland

(Received 3 October 2012; accepted 2 January 2013; published online 18 January 2013) [http://dx.doi.org/10.1063/1.4776215] Single-exponential kinetics of an isomerization reaction A  B implies that the equilibrium distribution of the molecule has two peaks localized in different parts of the phase space (two basins). Transitions between the two basins occur via a sparsely populated area, the so-called “bottleneck” region. Although the region does not manifest itself in the thermodynamics, dynamics of the system in this region determines the rate constants of the forward and reverse reactions. These rate constants are given by the Kramers ratio of the reactive flux at equilibrium to the equilibrium populations of A and B states of the molecule. A distinctive feature of the single-exponential kinetics is that there is some freedom in the choice of the boundaries separating the basins and the bottleneck region; both the reactive flux and the equilibrium populations are weakly sensitive to the exact location of the boundaries in some range of their positions. Two-state protein folding described by the kinetic KF −− −− → scheme U − ← − F is an example of the isomerization reKU

action. Here U and F denote unfolded and folded states of the protein, while kF and kU are the folding and unfolding eq rate constants given by kU,F = J /PF,U , where J is the reeq active flux at equilibrium and PF,U are equilibrium populations. An expression for J has been recently derived1 assuming that the underlying protein dynamics is described in terms of Markovian transitions on a network of complex connectivity, which is formed by coarse-grained microstates of the protein. To test this expression, we apply it to a toy model of protein dynamics, which has an important advantage: it allows for an approximate analytical solution for the sum of the rate constants.2 Here, we show that the expression for the reactive flux, which follows from this solution, is identical to the expression obtained using the formula derived in Ref. 1. Let pi (t) be the probability of finding the protein in microstate i, i = 1, 2, . . . , N, where N is the number of microstates. This probability satisfies dp(t) = Kp(t), (1) dt where the evolution operator K is the N × N rate matrix, in which the non-diagonal matrix element Kij is the rate cona) E-mail: [email protected].

0021-9606/2013/138(3)/036101/2/$30.00

stant for transition j → i (i = j) and the diagonal matrix element Kjj is the rate constant for escape from microstate j, Kjj = − i=j Kij . The elements of the rate matrix satisfy the eq eq eq condition of detailed balance Kij pj = Kj i pi , where pi is the equilibrium probability of finding the protein in microstate i, which satisfies Kpeq = 0. Eigenvalues −λn and their corresponding eigenvectors ϕ (n) are solutions of the eigenvalue problem Kϕ (n) = −λn ϕ (n) , n = 1, 2, . . . , N. The smallest in magnitude eigenvalue is zero, while all other eigenvalues are negative and can be ordered in increasing values of their magnitudes, λ1 = 0 < λ2 < λ3 ≤ . . . ≤ λN . When the master equation, Eq. (1), describes stochastic dynamics that underlies two-state protein folding, λ2 is the sum of the folding and unfolding rate constants, λ2 = kF + kU eq eq = J /(PU PF ). The main result of Ref. 1 is a formula giving the reactive flux in terms of the splitting (commitment) probabilities in the bottleneck region (see also Ref. 3). In deriving the formula, all microstates are divided into three groups: definitely unfolded (UU), definitely folded (FF), and intermediate (I) or bottleneck microstates. The latter separate the UU- and FF-microstates in the sense that all transitions between them occur only through the intermediate microstates. The splitting probability pfold (i) of microstate i is defined as the probability of reaching the FF-basin before reaching the UU-basin starting from microstate i, so that pfold (i) = 1 for i ∈ FF, pfold (i) = 0 for i ∈ UU, and 0 < pfold (i) < 1 for i ∈ I. As shown in Ref. 1, the reactive flux through an arbitrary dividing surface  in the bottleneck region, which is necessarily crossed by any trajectory connecting the FF- and UU-basins, is given by  eq Kij pj [pfold (i) − pfold (j )], (2) J = i∈F ∗ j ∈U ∗

where F∗ and U∗ are the sets of microstates located on the Fand U-sides of . Here, we use the formula in Eq. (2) to derive an explicit expression for the reactive flux assuming that (i) the protein eq eq is at its melting temperature, i.e., PU = PF = 1/2 and (ii) its dynamics is described by the simple model proposed in Ref. 2. This model is an extension of the model suggested by Zwanzig, Szabo, and Bagchi to study Levinthal’s paradox.4 The model assumes that the protein contains M identical

138, 036101-1

© 2013 American Institute of Physics

036101-2

Berezhkovskii, Murphy, and Buchete

J. Chem. Phys. 138, 036101 (2013)

residues, each of which can be in either a folded (f) or an unfolded (u) state. A microstate of the protein is completely characterized by indicating the state of each residue. Therefore, the total number N of the microstates is 2M . In addition, it is assumed that transitions among microstates occur as a result of independent transitions between f- and u-states of individual residues described by the kinetic scheme k −−− −− → u← − f , where the rate constant k is assumed to be the same k for transitions in both directions. The model introduces a collective behavior of the protein responsible for its folding by postulating that the first and second residue unfolding events occur with modified rate constants ε0 k and ε1 k, respectively, ε0 , ε1  1. The model also postulates that (i) the FF-basin contains the only microstate in which all residues are in their f-states, while the UU-basin contains microstates with two and more unfolded residues, and (ii) microstates containing the only unfolded residue form a transition state ensemble (TSE), in the sense that transitions from these microstates to microstates with two and zero unfolded residues occur with probability 1/2. This leads to M −1 , ε0 = M 2 + (M − 1)2 − 2

1 ε1 = . M −1

(3)

where δ mn is the Kronecker delta. For this model of protein dynamics, an approximate solution for λ2 , λ2 = M(M − 1)k/[2M + (M − 1)2 − 2], was found assuming that the protein was at its melting temperature.2 Comparison with λ2 obtained by numerical diagonalization of the rate matrix shows very good agreement between the analytical and numerical results for M ≥ 20.2 Since at the melting temperature J = λ2 /4, we have M(M − 1)k . 4[2M + (M − 1)2 − 2]

(5)

Below, we obtain this result from Eq. (2). This is done in two steps. First, by taking advantage of the fact that microstates with the same number of folded/unfolded residues are indistinguishable, we reformulate the problem in terms of effective microstates, which are unifications of the indistinguishable initial microstates. Then, we find J for the reformulated problem. Let Pm (t) be the probability of finding the protein in one of the microstates with m unfolded residues and M − m folded residues, m = 0, 1, 2, . . . , M. It can be shown that probabilities Pm (t) satisfy the master equation k−1 dP(t)/dt = Keff P(t), where Keff is the (M + 1) × (M + 1) three-diagonal rate maeff trix, with non-diagonal matrix elements given by Kj −1 j = j , eff

eff

eff

eff

eff

l∈F ∗ m∈U ∗

This expression is an equivalent of Eq. (2) formulated in terms of the initial microstates. In our model, the FF-basin contains the only effective microstate, in which all residues are in their folded states. The UU-basin contains effective microstates with at least two unfolded residues. The TSE is formed by the only effective microstate, which is a unification of all initial microstates with the only unfolded residue. There are two possibilities to draw the dividing surface  in such a system, between the TSE and (i) the FF-basin or (ii) the UU-basin. In the first case, we have eff

The equilibrium probability of finding the protein in a mieq crostate with m unfolded residues Pm is given by   [ε0 + (1 − ε0 )δm0 ][ε1 + (1 − ε1 )δm1 ] M Pmeq = , (4) m 2[1 − (1 − ε1 )δm0 ]

J =

eff

are Kj j = −(Kj −1 j + Kj +1 j ). Assuming that the dividing surface  separates microstates with the same number of folded/unfolded residues, we can write an expression for the reactive flux in terms of effective microstates using the equieq librium distribution Pm and the corresponding splitting probabilities Pfold (m),  eff J =k Klm Pmeq [Pfold (l) − Pfold (m)]. (6)

j = 1, 2, . . . , M, K1 0 = ε0 M, K2 1 = 1, Kj +1 j = M − j , j = 2, 3, . . . , M − 1, and the diagonal matrix elements

eq

J = kK01 P1 [Pfold (0) − Pfold (1)]. eff

eq

eff

(7)

eq

Here, K01 P1 = K10 P0 = Mε0 /2, Pfold (0) = 1, and Pfold (1) = 1/2, so that the flux is J = kMε0 /4. Substituting ε0 here, given in Eq. (3), we recover the expression for the flux in Eq. (5). In the second case, Eq. (6) leads to eff

eq

J = kK12 P2 [Pfold (1) − Pfold (2)], eff

eq

eff

(8)

eq

where K12 P2 = K21 P1 = Mε0 /2, Pfold (1) = 1/2, and Pfold (2) = 0. Substituting these into Eq. (8) and using ε0 in Eq. (3), we again recover the result in Eq. (5). To summarize, we have demonstrated that the expression for the reactive flux given in Eq. (5) for the toy model of the protein dynamics can be obtained from the Kramers-type formula for the reactive flux, Eq. (2), derived under the assumption that the protein dynamics is described by a Markov random walk on a network of complex connectivity. We are grateful to Gerhard Hummer and Attila Szabo for helpful discussions. This study was supported by the Intramural Research Program of the National Institutes of Health, Center for Information Technology, and by the Irish Research Council. 1 A.

Berezhkovskii, G. Hummer, and A. Szabo, J. Chem. Phys. 130, 205102 (2009). 2 A. M. Berezhkovskii, F. Tofoleanu, and N.-V. Buchete, J. Chem. Theory Comput. 7, 2370 (2011). 3 Ch. Schutte, F. Noe, E. Meerbach, P. Metzner, and C. Hartmann, in Proceedings of the 6th International Congress on Industrial and Applied Mathematics, Zurich, Switzerland, 15–20 July, 2007, edited by I. Jeltsch and G. Wanner (European Mathematical Society Publishing House, 2009), pp. 297–335; P. Metzner, Ch. Schutte, and E. Vanden-Eijnden, Multiscale Model. Simul. 7, 1192 (2009). 4 R. Zwanzig, A. Szabo, and B. Bagchi, Proc. Natl. Acad. Sci. U.S.A. 89, 20 (1992); R. Zwanzig, ibid. 92, 9801 (1995); 94, 148 (1997); N. V. Buchete and J. E. Straub, J. Phys. Chem. B 105, 6684 (2001).

Network random walk model of two-state protein folding

Jan 18, 2013 - View online: http://dx.doi.org/10.1063/1.4776215. View Table of ... Information Technology, National Institutes of Health, Bethesda, Maryland 20892, USA. 2School of ... the master equation, Eq. (1), describes stochastic dynam-.

290KB Sizes 0 Downloads 196 Views

Recommend Documents

protein folding pdf
There was a problem loading more pages. protein folding pdf. protein folding pdf. Open. Extract. Open with. Sign In. Main menu. Displaying protein folding pdf.

A random resistor network model of voltage trimming
Jul 14, 2004 - by considering random resistor network models with different properties and ... for polysilicon resistors [1] or 20–30% for thick-film resistors. [2].

Protein folding by motion planning
Nov 9, 2005 - Online at stacks.iop.org/PhysBio/2/S148 .... from the roadmap using standard graph search techniques ... of our iterative perturbation sampling strategy shown imposed on a visualization of the potential energy landscape. 0. 5.

Protein Folding Paper - Ghosh.pdf
Download. Connect more apps... Try one of the apps below to open or edit this item. Protein Folding Paper - Ghosh.pdf. Protein Folding Paper - Ghosh.pdf. Open.

ARE THERE PATHWAYS FOR PROTEIN FOLDING ?
A second approach involved the use of computer- ... display system, the molecule thus generated can be ... Finally, the computer system has been used in at-.

The Protein Folding Problem
In 1994, John Moult invented CASP (Critical. Assessment of Techniques ..... (Left) The density of states (DOS) cartoonized as an energy landscape for the three-helix bundle protein. F13W∗: DOS (x-axis) ... The peak free energy (here, where the DOS

Protein Folding and Ligand-Enzyme Binding from Bias ...
In Section II, the convergence criteria, and the methods for analyzing data to ..... [19] a formalism was introduced which allows to map the history-dependent.

random walk in science pdf
There was a problem previewing this document. Retrying... Download. Connect more apps... Try one of the apps below to open or edit this item. random walk in ...

Protein Folding and Ligand-Enzyme Binding from Bias ...
Keywords: Enhanced sampling, Free energy calculations, Protein folding, Ligand-enzyme binding, ..... history dependent potential according to Eq. 1, but then sets. 0. 50. 100. 150 ...... LysM domain using a coarse-grained model. J. Phys.

Entropy of random walk range on uniformly transient ... - Project Euclid
Let G be a uniformly transient graph with bounded degrees. Then lim inf ... In addition, let |∂eB(x, r)| be the number of vertices at distance r + 1 from x ∈ V with respect to the usual .... The cardinality of any set A is denoted by |A|, the lar

Weak convergence of a random walk in a random ...
resolvent corresponding to n., then R~tk. = nEt~n . If we consider R. as a map from. La(T~) to L~(T~), then Lemma 4 states that the map is bounded by cln 2.

A Random-Walk Based Scoring Algorithm with ...
is adapted according to the customer's behavior. Scalabil- ... tion is crucial in order to offer a service that is appreciated and used by .... displays the distribution of the ratings in a training data set and in the ..... Future research topics in

PdF A Random Walk Down Wall Street
... essay writing and custom writing services provided by professional academic writers The Intelligent Investor Rev Ed Collins Business Essentials Kindle edition ...

A Factor-Graph-Based Random Walk, and its ...
that the code C is used for data transmission over a binary- input memoryless ..... directed edge d can feed into. .... Moreover, we fix some vector ξ ∈ RT+1. 李0 .

LANGUAGE MODEL ADAPTATION USING RANDOM ...
Broadcast News LM to MIT computer science lecture data. There is a ... If wi is the word we want to predict, then the general question takes the following form:.

A Random-Walk Based Scoring Algorithm applied to ...
Such a trend is confirmed by the average of the ..... Then micro DOA is something like a weighted averaging of individual DOA values. .... ference on Security and Privacy. (May 2002) ... Internet Technology 5(1) (February 2005) 92–128. 35.

RaWMS - Random Walk based Lightweight Membership Service for ...
Additional Key Words and Phrases: Ad Hoc Networks, Membership service, ... view of every node is independent of the locations of the peers in the network. 1.

ItemRank: A Random-Walk Based Scoring Algorithm for ...
compared ItemRank with other state-of-the-art ranking tech- niques (in particular ... on Bayesian networks [Breese et al., 1998], Support Vec- tor Machines [Grcar et .... where ωq is the out-degree of node q, α is a decay factor3. ..... In IEEE Con

Near-Optimal Random Walk Sampling in Distributed ...
in a continuous online fashion. We present the first round ... at runtime, i.e., online), such that each walk of length l can ... Random walks play a central role in computer science, ..... S are chosen randomly proportional to the node degrees, then