Modular  reweighting  


Modular reweighting users manual
V1.0

Written by: Dr. Daniel J. Sindhikara
(currently at Institute for Molecular Science, Okazaki, Japan)
[email protected]

Please cite as: Sindhikara, Daniel J. “Modular reweighting software for statistical mechanical analysis of biased equilibrium data”, Computer physics communications, xx xxx


Contents:

1. Why use modular reweighting?
2. Reweighting algorithm
3. Current module capabilities
4. Potential improvements
5. Modular reweighting workflow
6. Examples
   a. Replica Exchange Molecular Dynamics in AMBER
   b. Umbrella sampling (REUS) in AMBER
7. References


1. Why use modular reweighting?

a. You want to reweight data.
If you run any equilibrium enhanced sampling algorithm such as replica exchange(1), umbrella sampling(2)(3), the multicanonical method(4), etc., you can use reweighting to optimize expectation values and extract thermodynamic quantities.

b. You want to use a modular program.
Most people either have to write an entire reweighting program themselves or have to work within the confines of the algorithms for which software is already available. Modular reweighting allows you to design your own algorithm and use the same reweighting engine. You can even use an old algorithm and a new reweighting engine.

c. You don't enjoy dependency hell.
One of my biggest problems with software packages is their heavy reliance on other software packages. You try to install one, and you realize you have to install something else. But to get that to work you have to install something else again. The programs in modular reweighting rely only on packages that are commonly preinstalled on most Linux or Macintosh systems. Thus, no dependency hell.

d. You use REMD or umbrella sampling in AMBER.
Scripts are already available for these two sampling algorithms (in AMBER), so you don't need to do much to get started reweighting.

e. You don't use REMD or umbrella sampling in AMBER.
Using a different software package? Made your own secret algorithm that no one knows how to reweight yet? Write a small script (such as the perl scripts in this package) to convert the data to reduced weights (log probabilities), then feed it into WRE and go. An online database of new preprocess scripts will be made available.

f. You don't want it to crash.
One problem with available reweighting packages is numerical instability due to large probability ranges. The WHAM engine in modular reweighting treats probabilities in log space (using a log trick) to keep it stable.


2. Reweighting algorithm

MR currently uses WHAM as its reweighting algorithm; please see reference (5) and/or the accompanying manuscript for details.

Note: The current implementation does not account for g_n. That is, it takes g_n as 1, assuming zero autocorrelation time. This can be a valid assumption if the input data points are taken at intervals longer than the autocorrelation time of q (checking this is left to the user). Future implementations may calculate and include g_n explicitly.
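Since checking the g_n = 1 assumption is left to the user, a rough check can be sketched as follows. This is not part of the package. It assumes the common convention g = 1 + 2*tau, where tau is the integrated autocorrelation time in units of the sampling interval, and it uses a simple truncation rule (stop summing at the first non-positive autocorrelation value) that is an illustrative choice. The function name is hypothetical.

```python
def statistical_inefficiency(series):
    """Estimate g = 1 + 2*tau for a time series of q values.

    Assumption: tau is the sum of the normalized autocorrelation function,
    truncated at the first lag where it becomes non-positive.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    if var == 0.0:
        return 1.0  # a constant series carries no correlation information
    tau = 0.0
    for lag in range(1, n):
        # normalized autocorrelation at this lag
        c = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / ((n - lag) * var)
        if c <= 0.0:
            break
        tau += c
    return 1.0 + 2.0 * tau


# A slowly varying series is strongly correlated, so g is well above 1.
slow = [0.0] * 50 + [1.0] * 50
print(statistical_inefficiency(slow))
```

If the estimate is much larger than 1, keeping only every g-th data point before running the reweighting is one simple way to bring the input closer to the g_n = 1 assumption.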

3. Current  module  capabilities    

1D Generalized WHAM engine: "WRE.cpp"
Input: 1D coordinate and biased ensemble series "reduced weight"
Output: log probability with respect to the coordinate.
Requires: a C++ compiler and the standard headers math.h, string.h, stdio.h, and stdlib.h. Compile with your favorite C++ compiler, e.g. with g++:

  g++ WRE.cpp -o WRE

Warning: some systems will get an undefined reference depending on your version of
Input format:

  where  the  reduced  energy  file  is  formatted  as:  

           ...  

  in  pseudocode:  

  print "%d %f %f %f %f %f\n", myindex, myq, -myq/(temp[0]*KB), -myq/(temp[1]*KB), -myq/(temp[2]*KB), -myq/(temp[3]*KB);

  it  may  look  something  like:  

  0 -17.500000 34.265995 29.354536 23.450485 19.269936
  0 -25.210000 49.362613 42.287305 33.782099 27.759719
  0 -23.760000 46.523431 39.855072 31.839059 26.163067
  ...
  3 -15.970000 31.270168 26.788111 21.400243 17.585193
  3 -18.400000 36.028246 30.864198 24.656510 20.260961
  3 -20.320000 39.787715 34.084809 27.229363 22.375148
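For readers who prefer Python to the pseudocode, the same line format can be sketched as below. This is an illustration only, not one of the package's scripts. The function name is hypothetical, the value of KB is the one used in the awk example later in this manual, and the two temperatures are just the ones mentioned in the REMD example.

```python
KB = 0.0019872  # Boltzmann constant in kcal/(mol K), as used in the awk example


def reduced_weight_line(index, q, temps, kb=KB):
    """Format one line: ensemble index, q, then -q/(kB*T) for each temperature."""
    weights = [-q / (t * kb) for t in temps]
    fields = ["%d" % index, "%f" % q] + ["%f" % w for w in weights]
    return " ".join(fields)


# Example: a frame from ensemble 0 with energy -17.5, evaluated at two temperatures.
print(reduced_weight_line(0, -17.5, [300.0, 457.0]))
```

Each data point gets one reduced weight per ensemble, so a simulation with four temperatures produces six columns per line, as in the sample above.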

  Module features:

Log space treatment of weights allows for numerical stability at high energy (weight) ranges (as suggested by Okamoto(6) and Berg(7)).
Convergence correction for oscillating convergers.
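To make the log space treatment concrete, here is a minimal sketch of a histogram-based WHAM iteration carried out entirely with log quantities. It is not the WRE source code: it assumes g_n = 1, takes the reduced weights per bin rather than per data point, and leaves out any convergence correction for oscillating cases. All function and argument names are hypothetical.

```python
import math


def log_sum_exp(values):
    """log(sum(exp(v))) computed without overflow by factoring out the maximum."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def wham_log_space(counts, n_samples, log_bias, tol=1e-5, max_iter=10000):
    """Iterate the WHAM equations in log space.

    counts[b]      total histogram count in bin b, summed over all ensembles
    n_samples[i]   number of data points taken from ensemble i
    log_bias[i][b] reduced weight (log bias) of bin b in ensemble i
    Returns (ln_p, f): normalized unbiased log probabilities per populated
    bin, and the log normalization constant f_i of each ensemble.
    """
    n_ens = len(n_samples)
    bins = [b for b in range(len(counts)) if counts[b] > 0]
    f = [0.0] * n_ens
    ln_p = {}
    for _ in range(max_iter):
        # ln P(b) = ln H(b) - ln sum_i N_i exp(f_i + w_i(b))
        for b in bins:
            denom = log_sum_exp([math.log(n_samples[i]) + f[i] + log_bias[i][b]
                                 for i in range(n_ens)])
            ln_p[b] = math.log(counts[b]) - denom
        norm = log_sum_exp([ln_p[b] for b in bins])
        for b in bins:
            ln_p[b] -= norm
        # f_i = -ln sum_b P(b) exp(w_i(b))
        new_f = [-log_sum_exp([ln_p[b] + log_bias[i][b] for b in bins])
                 for i in range(n_ens)]
        change = max(abs(x - y) for x, y in zip(new_f, f))
        f = new_f
        if change < tol:
            break
    return ln_p, f


# Two ensembles over two bins: the first unbiased, the second favoring bin 1 threefold.
ln_p, f = wham_log_space([3, 5], [4, 4], [[0.0, 0.0], [0.0, math.log(3.0)]], tol=1e-10)
print([math.exp(ln_p[b]) for b in sorted(ln_p)])
```

Because only differences of logarithms are ever exponentiated, reduced weights that differ by hundreds of units do not overflow or underflow.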


Language:  c++/c  

   

Log probability post-processing: "analyzeweight.py / analyzeweight.py2"
Input: log probability with respect to coordinate "q" (from WRE, directly or modified)
Additional input: 1, 2, or 3D coordinate time series (at least one column must be the same coordinate as the log probability coordinate: q(t) a(t), or q(t) a(t) b(t))
Output:
  1D distribution p(a)
  2D distribution p(q,a)
  2D distribution p(a,b)
Language: Python 3.x (analyzeweight.py), Python 2.6.x (analyzeweight.py2)

Premade reduced weight scripts:

Note: These are custom-designed for the type of biasing in the particular simulation. Users are encouraged to write their own scripts whenever necessary.

convertREMDtoreducedenergies.pl
Input: T-REMD rem.log
Output: "reduced weight" file for WRE

convertNCSU-UStoreducedenergies.pl
Input: metafile naming umbrella time series and parameters, optional periodicity
Output: "reduced weight" file for WRE


4. Potential improvements

a. Explicit error analysis
Something like bootstrapping would be nice to add here.

b. Explicit analysis and treatment of the autocorrelation function for q
This should be simple enough to add. Possibly a simple autocorrelation pre-filter would work too.

c. More "reduced energy scripts"
I really hope the script library gets big for the sake of beginners, but advanced simulators should have no problem writing their own scripts as they need.

d. MBAR
At some point I'd like to implement MBAR from Shirts and Chodera.(8) It eliminates histogram error and has analytical error analysis. But the error doesn't seem to be that much lower than with WHAM. The bottleneck is surely sampling.

e. More examples
I plan on adding these steadily. Hopefully some users can make some too.

If you'd like to contribute any scripts, please let me know. The only request is that you use a low-dependency program/script and, if possible, include documentation/examples.
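As an illustration of what item a could look like, here is a minimal bootstrap sketch. It is not part of the package. It assumes the data points are statistically independent (so correlated time series would first need subsampling), and the function names are hypothetical. In a real error analysis the estimator would be the full reweighting pipeline rather than a plain mean.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_error(samples, estimator, n_boot=200, seed=0):
    """Standard deviation of an estimator over resampled copies of the data."""
    rng = random.Random(seed)
    n = len(samples)
    estimates = []
    for _ in range(n_boot):
        # draw n points with replacement and re-evaluate the estimator
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    center = mean(estimates)
    variance = sum((e - center) ** 2 for e in estimates) / (n_boot - 1)
    return variance ** 0.5


data = [0.0, 1.0] * 20
print(bootstrap_error(data, mean))
```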


5. Modular  reweighting  workflow    

 

 


6. Examples

Note: The preprocess and postprocess scripts are interpreted and thus need perl or python to execute. The WHAM engine is written in C/C++ and can be compiled easily, for example: g++ WRE.cpp -o WRE. Examples a and b are converged simulations. Unfortunately some data files (such as trajectory files) are too large to include, so they are omitted (though the inputs are included).

A complete (unconverged) version of example a is included in the directory examples/REMD-short-complete. By "complete" I mean that it includes the entire simulation, reweighting, analysis, and actual executable bash scripts. It is a shorter, smaller version of example a (for the sake of file size). This example may be useful for testing the software and seeing an extremely detailed workflow. One should refer to the examples below, however, to get an idea of the bigger picture.

The complete example is broken up into directories based on steps, using numbering to denote order. Steps 1 and 2 are run in AMBER 10+ with sander.MPI and ptraj, respectively. Steps 3 and 4 use modular reweighting programs to analyze the AMBER data (step 3 uses a preprocess script and the "WRE" WHAM reduced engine, and step 4 uses the postprocess script analyzeweight.py). Step 5 uses gnuplot to show the difference between the unreweighted and reweighted data. The software in this package will only let you perform steps 3 and 4. Please read example a to understand the theory, and see the "runme.sh" bash scripts to see how the examples were performed.

 

a. REMD (energy-based reweighting)

examples/REMD-long

In replica exchange, multiple discrete ensembles use energy-based (Boltzmann) biasing. Thus energy is our "q". We use an alanine dipeptide REMD simulation, using the files in the input directory.

Next we need to prepare a reduced energy file for the WHAM reduced (energy) engine (WRE). We can do this simply by applying the perl script convertREMDtoreducedenergies.pl to the rem.log file. This will likely be the most time-consuming part of the process. When that is done, we run WRE with a command such as:

WRE redinputfile 0.5 0.00001  

(bin size of 0.5, tolerance of 0.00001). The resulting file "lnPQ" contains the unbiased log probability of Q (here, U); in the case of energies, this is commonly called the energetic density of states. In order to reweight to a canonical temperature, we can simply use an awk one-liner. To make the 300K lnPQ:

awk '{printf "%f %f\n",$1,$2-$1/(300*0.0019872)}' lnPQ > lnPQ300K

[Figure: expected 300K, 457K, and unnormalized distributions]
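The awk one-liner subtracts q/(k_B T) from the log probability in the second column. For users who would rather do this step in Python, a minimal sketch follows. It assumes, as the awk command implies, that each lnPQ line holds q in the first column and the log probability in the second; the function names are hypothetical and not part of the package.

```python
KB = 0.0019872  # kcal/(mol K), the value used in the awk one-liner


def parse_lnpq(lines):
    """Read (q, ln P) pairs from whitespace-separated text lines."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        rows.append((float(fields[0]), float(fields[1])))
    return rows


def reweight_lnpq(rows, temperature, kb=KB):
    """Apply the canonical weight: ln P_T(q) = ln P(q) - q / (kB * T)."""
    return [(q, lnp - q / (temperature * kb)) for q, lnp in rows]


# Made-up input lines, for illustration only.
example = ["-20.0 1.5", "-19.5 2.0"]
for q, lnp in reweight_lnpq(parse_lnpq(example), 300.0):
    print("%f %f" % (q, lnp))
```

The result is unnormalized, just like the awk output; only differences between log probabilities are meaningful until a normalization is applied.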

   

 

But this information is quite useless by itself. We want to get probability distributions for some quantity, A. To do this, we need to be able to associate the Q values with A. Make a two-column time series file with Q and A at each time step, q(t) a(t). See phivsE.dat. Be VERY careful that the time series are matched frame for frame. If they are off, the data will be garbage.

We will now use this correlation between the phi angle and E, A(E), to get the canonical distributions. For the 300K distribution, execute:

analyzeweight.py lnPQ300K

Pick the option for p(A), and input phivsE.dat.

[Figure: reweighted distributions for 300K and 457K compared to the temperature histograms]

In a similar manner, you can get 2D distributions.

[Figure: 2D distributions]
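The idea behind turning a log probability in q into a distribution in A can be sketched as follows. This is one plausible way to do it, not necessarily what analyzeweight.py does internally: every sampled q bin contributes its histogram of A, scaled by the unbiased probability of that q bin. The function name, the dictionary-based input, and the simple floor binning are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def reweighted_histogram(lnpq, series, q_width, a_width):
    """Estimate p(a) = sum_q P(q) p(a|q) from a matched (q, a) time series.

    lnpq    dict mapping q bin index -> unbiased ln P(q)
    series  list of (q, a) pairs, one per frame
    Bin indices are floor(value / width) (an assumed binning convention).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for q, a in series:
        counts[math.floor(q / q_width)][math.floor(a / a_width)] += 1
    # shift by the largest log probability so that exp() stays well behaved
    shift = max(lnpq[qb] for qb in counts if qb in lnpq)
    p_a = defaultdict(float)
    for qb, a_counts in counts.items():
        if qb not in lnpq:
            continue  # frames whose q bin has no log probability are ignored
        n_q = sum(a_counts.values())
        weight = math.exp(lnpq[qb] - shift)
        for ab, c in a_counts.items():
            p_a[ab] += weight * c / n_q
    total = sum(p_a.values())
    return {ab: p / total for ab, p in sorted(p_a.items())}


# Made-up data: two q bins, the second three times more probable than the first.
lnpq = {0: 0.0, 1: math.log(3.0)}
frames = [(0.1, 0.2), (0.2, 0.2), (1.5, 1.2)]
print(reweighted_histogram(lnpq, frames, 1.0, 1.0))
```

This also shows why mismatched time series give garbage: the weight of each frame is looked up through its own q value.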

 


b. REUS (Replica exchange umbrella sampling)

examples/US

Here, I show an example of REUS. If you are unfamiliar with REUS, the replica exchange version of umbrella sampling, see the original reference (9), and know that the umbrella trajectories from REUS can be treated identically to those of umbrella sampling. Please see the input files in examples/US/inputs. This simulation is a 16-umbrella REUS simulation in carbon-alpha radius-of-gyration space.

The raw output can be converted to reduced energies using the script:

convertNCSU-UStoreducedenergies.pl metafile

The format of the metafile is seen in the example file and is also displayed when the script is run. Simply, it requires the filenames of the time series "umbrella" files along with the umbrella centers and spring constants. This output is fed into WRE using the command:

WRE redinputfile 0.5 0.00001
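For users writing their own umbrella preprocess script, the reduced weights of one data point can be sketched as below. This is not the code of convertNCSU-UStoreducedenergies.pl. It assumes a harmonic restraint of the form prefactor * k * (q - center)^2, where the prefactor (1 or 0.5) depends on how your simulation package defines the spring constant, and it ignores periodic coordinates. The function name is hypothetical.

```python
KB = 0.0019872  # kcal/(mol K)


def umbrella_reduced_weights(q, centers, spring_constants, temperature,
                             kb=KB, prefactor=1.0):
    """Return -U_i(q)/(kB*T) for each umbrella i, assuming harmonic restraints."""
    beta = 1.0 / (kb * temperature)
    return [-beta * prefactor * k * (q - c) ** 2
            for c, k in zip(centers, spring_constants)]


# A point at q = 1.0 evaluated in two umbrellas centered at 1.0 and 2.0.
print(umbrella_reduced_weights(1.0, [1.0, 2.0], [10.0, 10.0], 300.0))
```

As in the REMD case, each data point is evaluated in every umbrella, not only the one it was sampled from.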

The output "lnPQ" file can be used to see the free energy or probability distribution along the "umbrella" coordinate.

[Figure: lnPQ distribution compared to the raw histogram of the umbrella coordinate from the simulation]

Note how drastically these plots differ. Most often, umbrella sampling is used to get the free energy along the coordinate. If this is the case, one can calculate it directly from the lnPQ file above by multiplying the second column of the lnPQ file by -k_B T. If the probability distribution of some other coordinate(s) is needed, then, following a procedure similar to the REMD example above, a time series of that coordinate together with the umbrella coordinate is required: rogvsphi5.dat

Running

analyzeweight.py lnPQ

one can get the unbiased distribution (in this case, of the dihedral angle phi of the 5th residue). The difference between this and the raw histogram is much more subtle.

[Figure: unbiased phi distribution compared to the raw histogram]
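The free energy conversion mentioned above (multiplying the log probability by -k_B T) can be sketched in a few lines. This is an illustration, not part of the package. It assumes energies in kcal/mol, and shifting the minimum to zero is a plotting convenience added here. The function name is hypothetical.

```python
KB = 0.0019872  # kcal/(mol K)


def free_energy_profile(rows, temperature, kb=KB):
    """Convert (q, ln P) pairs to (q, F) with F = -kB*T*ln P, minimum set to 0."""
    kt = kb * temperature
    profile = [(q, -kt * lnp) for q, lnp in rows]
    f_min = min(f for _, f in profile)
    return [(q, f - f_min) for q, f in profile]


# Made-up log probabilities, for illustration only.
for q, f in free_energy_profile([(1.0, 0.0), (2.0, -1.0)], 300.0):
    print("%f %f" % (q, f))
```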


7. References

(1) Sugita, Y.; Okamoto, Y. Chemical Physics Letters. 1999, 314, 141-151.
(2) Torrie, G. M.; Valleau, J. P. Journal of Computational Physics. 1977, 23, 187-199.
(3) Torrie, G. M.; Valleau, J. P. Journal of Chemical Physics. 1977, 66, 1402-1408.
(4) Berg, B.; Neuhaus, T. Physical Review Letters. 1992, 68, 9-12.
(5) Ferrenberg, A.; Swendsen, R. Physical Review Letters. 1989, 63, 1195-1198.
(6) Okamoto, Y. Journal of Molecular Graphics & Modelling. 2004, 22, 425-39.
(7) Berg, B. Computer Physics Communications. 2003, 153, 397-406.
(8) Shirts, M. R.; Chodera, J. D. Journal of Chemical Physics. 2008, 129, 124105-124110.
(9) Sugita, Y.; Kitao, A.; Okamoto, Y. Journal of Chemical Physics. 2000, 113, 6042-6051.

 
