K. Sampling of energy states distribution

By definition, the quantities compared in a mean squared error (between a theoretical prediction and an experimental measurement) must be of the same kind. Therefore, we must make sure that the experimental FDC ( $ f_i^{\textrm{exp}}$) that enters Eq. 5.2 is the same quantity as the theoretical FDC ( $ f_i^{\textrm{the}}$). In our calculations, the theoretical FDC is an equilibrium quantity. How do we know that the experimental (i.e., measured) FDC is also an estimate of the equilibrium FDC? This appendix describes how unzipping experiments at low pulling rate sample the entire distribution of energy states of the DNA molecule (or at least its most significant part), which is a prerequisite for calculating ensemble averages, partition functions and thermodynamic quantities such as the equilibrium FDC.
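As a minimal illustration of the comparison discussed above, the following sketch computes a mean squared error between two FDCs sampled at the same distances. Eq. 5.2 is not reproduced in this appendix, so the unweighted form used here is an assumption, and the function name and the force values are hypothetical.

```python
def mean_squared_error(f_exp, f_the):
    """Unweighted mean squared error between two force lists (same units, same distances).

    Assumption: Eq. 5.2 may include weights or normalization not shown here.
    """
    if len(f_exp) != len(f_the):
        raise ValueError("both FDCs must be sampled at the same distances")
    if not f_exp:
        raise ValueError("empty FDC")
    return sum((e - t) ** 2 for e, t in zip(f_exp, f_the)) / len(f_exp)


# Hypothetical forces in pN, for illustration only (not measured data).
f_exp = [15.2, 15.8, 16.1]
f_the = [15.0, 16.0, 16.0]
print(mean_squared_error(f_exp, f_the))
```

The comparison is only meaningful if both lists describe the same physical quantity, here the equilibrium force at each distance, which is the point addressed in the rest of this appendix.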

Let us discuss the states of DNA that have higher energy than the minimum. In unzipping experiments, the position of the unzipping fork (see Fig. K.1a) exhibits thermally induced fluctuations that allow the system to explore states of higher free energy. Such fluctuations represent the first kind of excitation in the system and are discussed in the next paragraphs. However, there is a second kind of excitation: breathing fluctuations. Breathing is the spontaneous opening and closing of base pairs in the dsDNA, far away from the unzipping fork (see Fig. K.1b). During this process, the DNA explores states of higher free energy while the unzipping fork stays at the same position. Consequently, we cannot detect breathing in our unzipping experiments: breathing fluctuations are not coupled to the reaction coordinate that we measure (i.e., the molecular extension) and should therefore have a small effect on the measured FDC. Breathing fluctuations are expected to be relevant only at high enough temperatures, where they should be taken into account; at 25$ ^{\circ}$C their contribution is expected to be minimal. The fact that our model reproduces the experimental FDC very well supports this conclusion.

Figure K.1: (a) Opening fork. (b) Breathing.

Now let us focus on the fluctuations of the unzipping fork. The unzipping of DNA is performed so slowly that the system has enough time to reach equilibrium at every fixed distance along the pulling protocol. Figure K.2a shows a fragment of the FDC in a region where 3 states of the 2.2 kbp molecule coexist, each with a different number of open base pairs ($ n_1=1193$, $ n_2=1248$ and $ n_3=1300$). Figure K.2b shows the hopping in force due to the transitions between these 3 states. The slow pulling rate guarantees that the hopping transitions are measured during unzipping (i.e., many hopping events take place while the molecule is slowly unzipped). Filtering the raw FDC data at 1 Hz (black curve in Fig. K.2a) produces a reasonably good estimate of the equilibrium FDC. It is also important to note that the unzipping and rezipping curves are reversible (see Fig. 3.11c). This supports the idea that the unzipping process is quasistatic and correctly samples the energy states.

Figure K.2: Coexistence of states. (a) Left panel shows the measured FDC for the 2.2 kbp sequence. Right panel shows the fragment of the FDC (framed in the left panel) where 3 states coexist. Red curve shows the raw data and black curve shows the data filtered at 1 Hz. (b) Red curve shows the force vs. time of the previous fragment where the transitions between these 3 states can be observed. The blue lines indicate the average forces corresponding to each of these 3 states.

In general, the hopping frequency between coexisting states is $ \sim10-50$ Hz and the region of coexistence extends over about 40 nm of distance (see Fig. K.2b). At a pulling rate of 10 nm/s, we can measure around $ 10-40$ transitions, which in most cases is sufficient to obtain a good estimate of the FDC after averaging the raw data down to a bandwidth of 1 Hz.
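The averaging argument above can be illustrated with a toy simulation. The sketch below generates a three-level hopping signal and averages it in 1 s windows as a crude stand-in for the 1 Hz filter. It is not the analysis used in this work: the force levels, the 1 kHz sampling rate, the equal population of the three states and the box-car average are all assumptions made for illustration. Only the 10 Hz hopping frequency, the 40 nm coexistence region and the 10 nm/s pulling rate are taken from the text.

```python
import random


def simulate_hopping(levels_pN, hop_rate_hz, duration_s, sample_rate_hz, seed=0):
    """Toy force trace: at each sample the system leaves its state with
    probability hop_rate_hz / sample_rate_hz and jumps to another state
    chosen uniformly (assumption: all states are equally populated)."""
    rng = random.Random(seed)
    p_hop = hop_rate_hz / sample_rate_hz
    state = 0
    trace = []
    for _ in range(int(duration_s * sample_rate_hz)):
        if rng.random() < p_hop:
            others = [s for s in range(len(levels_pN)) if s != state]
            state = rng.choice(others)
        trace.append(levels_pN[state])
    return trace


def boxcar_average(trace, window):
    """Average consecutive, non-overlapping blocks of `window` samples."""
    return [sum(trace[i:i + window]) / window
            for i in range(0, len(trace) - window + 1, window)]


levels_pN = [15.0, 15.5, 16.0]   # hypothetical force levels of the 3 states
duration_s = 40.0 / 10.0         # 40 nm region crossed at 10 nm/s -> 4 s
trace = simulate_hopping(levels_pN, hop_rate_hz=10.0,
                         duration_s=duration_s, sample_rate_hz=1000.0)
filtered = boxcar_average(trace, window=1000)  # 1 s blocks ~ 1 Hz bandwidth
print(filtered)                  # one averaged force per second of pulling
print(sum(levels_pN) / len(levels_pN))  # equal-population mean, for comparison
```

In this toy model the averaged forces scatter around the equal-population mean, and the scatter shrinks as more hopping events fall within each averaging window.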

Figure K.3a shows the free energy landscape of one molecule at different fixed distances, which is given by $ G(x_\mathrm{tot},n)$ in Eq. 3.22. A detailed view of the free energy landscape (see Fig. K.3b) shows that it is a rough function whose coarse-grained shape is parabolic. This means that for each value of $ x_\mathrm {tot}$ there is always a state of global minimum free energy surrounded by other states of higher free energy (see Fig. K.3b). Although there are many states in the phase space, in the experiments we only observe those states whose free energy differs by less than $ \sim5$ $ k_BT$ from that of the minimum free energy state. So the hopping transitions described in Fig. K.2b take place between states that have similar free energies. Outside this range of free energies, the higher-energy states are rarely observed and their contribution to the equilibrium FDC is negligible.
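To give a sense of scale for the $ \sim5$ $ k_BT$ threshold, the following sketch computes equilibrium populations, assuming that at fixed $ x_\mathrm {tot}$ the probability of a state with $ n$ open base pairs is proportional to $ \exp(-G(x_\mathrm{tot},n)/k_BT)$. The function names and the free energy values are hypothetical and are not taken from Eq. 3.22.

```python
import math


def relative_weight(delta_g_kT):
    """Boltzmann weight of a state relative to the minimum,
    for a free energy difference given in units of k_B T."""
    return math.exp(-delta_g_kT)


def boltzmann_populations(free_energies_kT):
    """Normalized equilibrium populations for free energies in units of k_B T."""
    g_min = min(free_energies_kT)  # shift by the minimum for numerical stability
    weights = [math.exp(-(g - g_min)) for g in free_energies_kT]
    z = sum(weights)               # partition function restricted to these states
    return [w / z for w in weights]


# A state 5 k_B T above the minimum is populated about 0.7% as often as the minimum.
print(relative_weight(5.0))        # approx 0.0067

# Hypothetical landscape: three nearly degenerate minima plus two higher states.
populations = boltzmann_populations([0.0, 0.5, 1.0, 5.0, 10.0])
print(populations)
```

Under this assumption, the states beyond the threshold carry well under 1% of the total population in the example, which is consistent with their negligible contribution to the equilibrium FDC.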

Figure K.3: Free energy landscape for the 2.2 kbp sequence at fixed distance. (a) The parabolic-like shape of the free energy landscape around the minima can be identified in a coarse-grained view of what is actually a rough landscape (see zoomed part of the landscape). Black, orange, green, blue, yellow, magenta and red curves show the free energy landscape at $ x_\mathrm {tot}=0$, 350, 500, 750, 1000, 1250 and 1455 nm, respectively. (b) Zoomed region of the free energy landscape at the distance at which the 3 states of Fig. K.2 coexist. The blue arrows indicate the minima that correspond to these states. The highlighted gray area shows an energy range of 5 $ k_BT$.

In summary, the temperature of the experiment (low enough that breathing is negligible), the slow pulling rate and the shape of the free energy landscape ensure that we explore the higher-energy states (within a range of $ \sim5$ $ k_BT$ with respect to the global minimum) during the unzipping process. Therefore, the experimental FDC filtered at 1 Hz is a good estimate of the equilibrium FDC.

JM Huguet 2014-02-12