4.1.2 Statistical inference of intermediate states

The association of each experimental point with its most probable intermediate state is repeated for all the points along the unzipping FDC (see Fig. 4.3a). The result is a list of numbers $ \{n^*\}$ that indicate all the metastable states through which the molecule passes during the process of unzipping.

A histogram built from all the values of $ n^*$ shows a series of sharp peaks that can be identified with the intermediate states $ I_n$ (see Fig. 4.3b). The histogram contains information about the stability of the intermediate states: the higher the peak, the more stable the state. It has already been shown that the more stable states are associated with a higher GC content in the sequence [102].
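The construction of the histogram can be sketched in a few lines of Python. This is only a minimal illustration: the function names `histogram_of_states` and `local_peaks`, the bin width and the short list of $ n^*$ values are hypothetical and are not taken from the experimental data.

```python
from collections import Counter

def histogram_of_states(n_star, bin_width=1):
    """Count how many data points fall in each bin of open base pairs."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter((int(n) // bin_width) * bin_width for n in n_star)
    return dict(sorted(counts.items()))

def local_peaks(hist, bin_width=1):
    """Return the bins whose count is strictly larger than both neighbours.

    A missing neighbouring bin is treated as having zero counts.
    """
    peaks = []
    for b, count in hist.items():
        left = hist.get(b - bin_width, 0)
        right = hist.get(b + bin_width, 0)
        if count > left and count > right:
            peaks.append(b)
    return peaks

# Hypothetical list of n* values (one per experimental data point).
n_star = [10, 10, 11, 10, 10, 25, 25, 24, 25, 25, 25, 26, 40, 40]
hist = histogram_of_states(n_star)
peaks = local_peaks(hist)
print(hist)
print(peaks)
```

In this toy list the tallest bin is the one visited by the largest number of data points, which is the sense in which the peak height reflects the stability of a state.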

The histogram can be fit to a sum of Gaussians, each one characterized by its mean, variance and statistical weight (see Fig. 4.3c). The mean of each Gaussian indicates the number of open base pairs of that intermediate state. Other factors, such as the amount of released ssDNA and the stability of the state, determine the variance and the weight of the Gaussian. In general, a single Gaussian is sufficient to fit one peak. However, some peaks require the contribution of two or more Gaussians to be fit correctly. The method can therefore distinguish intermediate states that have a similar number of open base pairs (separated by about 5-10 base pairs at the beginning of the unzipping).
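The fitting procedure itself is not specified here. As one possible illustration, the sketch below estimates the mean, variance and weight of each Gaussian with the expectation-maximization (EM) algorithm applied directly to the $ n^*$ values. Note that this is a swapped-in technique: it is related to, but not the same as, a least-squares fit of the binned histogram. The function names, the initial guesses for the means and the synthetic data are all assumptions made for the example.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Normal probability density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gaussian_mixture(samples, init_means, n_iter=200, min_var=1e-3):
    """EM estimate of a 1-D Gaussian mixture.

    Returns a list of (weight, mean, variance) tuples sorted by mean.
    """
    k = len(init_means)
    n = len(samples)
    means = list(init_means)
    overall_mean = sum(samples) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in samples) / n, min_var)
    variances = [overall_var] * k   # start every component with the global variance
    weights = [1.0 / k] * k         # and with equal statistical weight
    for _ in range(n_iter):
        # E-step: probability that each sample belongs to each component.
        resp = []
        for x in samples:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            if total == 0.0:        # numerical underflow: share the sample equally
                p, total = [1.0] * k, float(k)
            resp.append([pi / total for pi in p])
        # M-step: update weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue            # empty component, keep its old parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, samples)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, samples)) / nj
            variances[j] = max(var_j, min_var)
    return sorted(zip(weights, means, variances), key=lambda c: c[1])

# Synthetic data (not experimental): two nearby states, 10 bp apart.
rng = random.Random(0)
samples = [rng.gauss(100.0, 1.5) for _ in range(300)]
samples += [rng.gauss(110.0, 2.0) for _ in range(200)]

# Initial means would typically be read off the histogram peaks.
fitted = fit_gaussian_mixture(samples, init_means=[98.0, 112.0])
for weight, mean, var in fitted:
    print(f"weight={weight:.2f}  mean={mean:.1f} bp  variance={var:.2f}")
```

A peak that needs two or more Gaussians corresponds, in this sketch, to passing two or more initial means that lie close together.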

Note that as the unzipping progresses, the peaks of the histogram become smoother and it is more difficult to differentiate intermediate states. This is due to the release of ssDNA as the DNA molecule is unzipped. Indeed, changes at the unzipping fork must be transmitted to the optical trap in order to be detected: the opening of a CUR decreases the tension along the molecular construct, and the optical trap detects a drop in force. A stiff tether transmits the force to the optical trap better than a compliant one, because a compliant connection can absorb the elastic energy released in the unzipping while hardly changing its extension. Therefore, as ssDNA is released during unzipping, the amplitude of the sawtooth pattern decreases, the force signal is blurred and the histogram of intermediate states shows smoother peaks.
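The loss of resolution can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the optical trap and the released ssDNA behave as Hookean springs in series, that each released nucleotide contributes one identical spring, and that the total distance is held fixed while the fork releases a length `delta_x`. Real ssDNA is a nonlinear polymer, so the numbers are only qualitative, and all parameter values and function names are hypothetical.

```python
def effective_stiffness(k_trap, k_segment, n_segments):
    """Stiffness of the trap in series with n identical Hookean segments.

    Compliances (inverse stiffnesses) add for springs in series.
    """
    return 1.0 / (1.0 / k_trap + n_segments / k_segment)

def force_drop(delta_x, k_trap, k_segment, n_segments):
    """Force drop when the fork releases a length delta_x at fixed total distance."""
    return effective_stiffness(k_trap, k_segment, n_segments) * delta_x

# Illustrative values only (pN/nm and nm), not measured parameters.
K_TRAP = 0.08      # trap stiffness
K_SEGMENT = 100.0  # stiffness assigned to one released nucleotide
DELTA_X = 20.0     # length released by the opening of one CUR

drops = [force_drop(DELTA_X, K_TRAP, K_SEGMENT, n) for n in (100, 1000, 5000)]
for n, drop in zip((100, 1000, 5000), drops):
    print(f"{n:5d} released nucleotides -> force drop of {drop:.2f} pN")
```

Within this model the same opening event produces a smaller force drop as more ssDNA is released, which is consistent with the decreasing amplitude of the sawtooth pattern.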

Figure 4.3: Histogram of intermediate states. (a) Classification of points. The blue trace shows the experimental FDC. The red trace shows the number of open base pairs $ n^*$ corresponding to each experimental data point (the $ y$-axis of this curve is shown in panel b). Although the curve is noisy, the plateaus indicate metastable states. (b) Histogram of the values of $ n^*$ shown in panel a. (c) Detailed view of the histogram (orange curve) overlaid with the fit to a sum of Gaussians (cyan curve). (d) Detection of a CUR of size 87 bp from the distance between two consecutive Gaussians centered at 1741 and 1828 open base pairs.
\includegraphics[width=\textwidth]{figs/chapter4/histograms2.eps}
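The arithmetic of panel d can be written as a short helper that takes the fitted Gaussian means and returns the distances between consecutive ones. The function name `cur_sizes` is hypothetical; the two means are the values quoted in the caption.

```python
def cur_sizes(gaussian_means):
    """Distances (in bp) between consecutive intermediate states."""
    ordered = sorted(gaussian_means)
    return [b - a for a, b in zip(ordered, ordered[1:])]

sizes = cur_sizes([1741, 1828])
print(sizes)  # [87]
```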

JM Huguet 2014-02-12