## 4.1.1 Bayesian analysis of FDCs

The classification of the experimental data into intermediate states is based on the elastic response of the system (the molecular construct plus the bead in the optical trap), which is given by the mesoscopic model described in Section 3.4. Each experimental data point (distance, force) is assigned to the most probable intermediate state that is compatible with the elastic properties of the system, so that every point belongs to one state.

For convenience, we reproduce here the expression (Eq. 3.23) that determines the elastic response of the system when the DNA molecule is in the intermediate state where $n$ base pairs are open,

$$x_{\rm tot}(f, n) = x_b(f) + 2\,x_h(f) + 2\,x_s(f, n) \qquad (4.1)$$

where $x_{\rm tot}$ is the total distance of the system, $x_b$ is the position of the bead, $x_h$ is the extension of one handle and $x_s$ is the extension of one strand of ssDNA. All these terms depend on the force $f$ applied at the ends of the system. Note that this equation is an FDC defined for the intermediate state $n$.

Now, we will use Eq. 4.1 differently from Sec. 3.4. Here we consider the partial elastic response of a system with a fixed number of open base pairs, so that the energetic contribution of the base pairs is irrelevant. Therefore, Eq. 4.1 can be understood as a family of curves that pass through the origin of coordinates and are characterized by the parameter $n$. In unzipping experiments, the number of open base pairs varies while the remaining properties of the system (trap stiffness, elasticity of the handles) stay unchanged. So, as the molecule is unzipped, we observe fragments of these curves (slopes) connected by force rips. In other words, an unzipping FDC is a piecewise function built from segments of Eq. 4.1.
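The family of curves described above can be illustrated with a minimal Python sketch. Everything model-specific here is an assumption made for illustration only: the bead is treated as a Hookean spring, each handle as a worm-like chain (Marko-Siggia interpolation, inverted by bisection), each ssDNA strand as a freely jointed chain, and two handles plus two strands are taken to contribute to the total distance. The function names and all numerical values in `PARAMS` are hypothetical and are not taken from the text.

```python
import math

KT = 4.11  # thermal energy in pN*nm at about 25 C

# Illustrative parameter values only (assumptions, not values from the text).
PARAMS = {
    "k_trap": 0.07,              # trap stiffness, pN/nm
    "handle_contour": 10.0,      # contour length of one handle, nm
    "handle_persistence": 50.0,  # persistence length of dsDNA, nm
    "kuhn": 1.5,                 # Kuhn length of ssDNA, nm
    "base_length": 0.59,         # contour length per ssDNA base, nm
}

def bead_position(f, k_trap):
    """Bead displacement in the trap, modeled as a linear spring."""
    return f / k_trap

def wlc_force(z, persistence):
    """Marko-Siggia interpolation: force (pN) at relative extension z in [0, 1)."""
    return KT / persistence * (0.25 / (1.0 - z) ** 2 - 0.25 + z)

def handle_extension(f, contour, persistence):
    """Extension of one handle: invert the monotonic WLC force by bisection."""
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, persistence) < f:
            lo = mid
        else:
            hi = mid
    return contour * 0.5 * (lo + hi)

def ssdna_extension(f, n, kuhn, base_length):
    """Extension of one ssDNA strand of n bases (freely jointed chain)."""
    if f <= 0.0 or n == 0:
        return 0.0
    u = f * kuhn / KT
    return n * base_length * (1.0 / math.tanh(u) - 1.0 / u)

def x_tot(f, n, p):
    """Total distance at force f with n open base pairs (sketch of Eq. 4.1)."""
    return (bead_position(f, p["k_trap"])
            + 2.0 * handle_extension(f, p["handle_contour"], p["handle_persistence"])
            + 2.0 * ssdna_extension(f, n, p["kuhn"], p["base_length"]))

# One curve per value of n: same force, larger n gives a larger total distance.
for n in (0, 500, 1000):
    print(n, round(x_tot(15.0, n, PARAMS), 2))
```

Each value of `n` gives one curve through the origin, and in this sketch curves for different `n` differ only through the ssDNA term, which is why `n` can serve as the label of the intermediate state.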

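Since $n$ is what changes the elastic properties of the system, it can be used to classify the experimental points into states. For each experimental data point along the FDC, $(x_{\rm exp}, f_{\rm exp})$, the intermediate state $n^*$ whose curve passes closest to that point at fixed force is determined by

$$\left| x_{\rm exp} - x_{\rm tot}(f_{\rm exp}, n^*) \right| = \min_{n} \left| x_{\rm exp} - x_{\rm tot}(f_{\rm exp}, n) \right| \qquad (4.2)$$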

where $x_{\rm tot}(f, n)$ is given by Eq. 4.1. The function min ensures that $n^*$ is the state whose FDC passes closest to the experimental point. In this way, each experimental data point $(x_{\rm exp}, f_{\rm exp})$ is associated with a unique value of $n$ (see Fig. 4.2). Note that the experimental points have force fluctuations but Eq. 4.2 does not. As a result, some points are misclassified, because force fluctuations are mistaken for different values of $n$. Nevertheless, the usefulness of the Bayesian approach becomes apparent once all the experimental points have been classified.

Using Eqs. 4.1 and 4.2 to classify the experimental points requires some elastic parameters to be determined from the experimental data. Most of them are obtained by fitting the elastic response of the fully extended molecule (see yellow curve in Fig. 4.1a) to Eq. 4.1, because the number of open base pairs is known in this part of the FDC. The stiffness of the trap is determined by measuring force versus displacement with a bead held fixed at the tip of the micropipette (see Fig. 2.15d). The parameters of the handles are taken from the established elastic properties of long dsDNA molecules; they involve the number of base pairs of the handles and the interphosphate distance. Finally, the elastic parameters of the ssDNA are obtained by nonlinear least-squares fitting, which also yields the Kuhn length of the ssDNA (see Table 5.1).

Of all the elements of the experimental setup, the dsDNA handles are the ones that modify the FDCs least, owing to their large rigidity. Since distances are measured in the instrument only in relative terms, it is also necessary to fit a global shift in Eq. 4.1 that fixes the zero of distance.
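The classification rule of Eq. 4.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the elastic model `x_tot_toy` is a simplified stand-in for Eq. 4.1 (bead as a linear spring, two freely jointed ssDNA strands, handles neglected because of their large rigidity), the parameter values are hypothetical, and the minimum is found by brute force over all candidate states rather than by any method described in the text.

```python
import math

KT = 4.11  # thermal energy in pN*nm at about 25 C

def x_tot_toy(f, n, k_trap=0.07, kuhn=1.5, base_length=0.59):
    """Toy stand-in for Eq. 4.1 (hypothetical parameters, handles neglected).

    Valid for f > 0: bead as a linear spring plus two freely jointed
    ssDNA strands of n bases each.
    """
    u = f * kuhn / KT
    strand = n * base_length * (1.0 / math.tanh(u) - 1.0 / u)
    return f / k_trap + 2.0 * strand

def classify_point(x_exp, f_exp, model, n_max):
    """Return the n in [0, n_max] whose curve passes closest to x_exp at f_exp."""
    return min(range(n_max + 1), key=lambda n: abs(x_exp - model(f_exp, n)))

def classify_fdc(points, model, n_max):
    """Classify every (distance, force) point of an FDC into a state n."""
    return [classify_point(x, f, model, n_max) for x, f in points]

# Synthetic data: points generated from known states, with small distance offsets.
true_states = [100, 100, 250]
forces = [15.0, 15.2, 14.8]
offsets = [0.0, 0.2, -0.3]  # nm, smaller than half the spacing between curves
points = [(x_tot_toy(f, n) + dx, f)
          for n, f, dx in zip(true_states, forces, offsets)]

classified = classify_fdc(points, x_tot_toy, n_max=400)
print(classified)
```

In this toy model, neighboring curves at a given force are separated by roughly 1 nm, so an offset larger than about half that spacing would move a point to an adjacent value of `n`. This is the same mechanism by which fluctuations lead to misclassified points in real data.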

JM Huguet 2014-02-12