4.3 Comparison of CUR size distributions and discussion

At this point, we compare the CUR size distributions obtained from the unzipping experiments with those obtained from the toy model. The goal is to see whether the toy model can reproduce the statistical properties of the unzipping mechanism, to identify the causes of any differences, and to use the model to determine the best experimental conditions for extracting information from DNA unzipping.

How well can the toy model predict the experimental results? The experimentally obtained CUR size distributions for both molecular constructs are shown in red in Fig. 4.14. The fits of these distributions to Eq. 4.10 are shown in green in the same figure. Taking $ k=60$ pN$ \cdot \mu $m$ ^{-1}$ (the stiffness of the trap, which we can measure independently) and $ d=0.59$ nm (the interphosphate distance for ssDNA), the parameters that best fit the experimental histogram for the 2.2 kbp sequence, together with the corresponding toy model parameters, are

$\displaystyle \left.
 \begin{array}{rcl}
 A&=&0.058 \\ 
 B&=&0.42 \\ 
 C&=&2.95...
... \\ 
 \sigma & = & 2.2\mathrm{~kcal}\cdot\textrm{mol}^{-1} 
 \end{array}\right.$ (4.12)

The resulting CUR size distribution $ P(n)$ is shown in green in Fig. 4.14a. For the 6.8 kbp sequence we find

$\displaystyle \left.
 \begin{array}{rcl}
 A&=&0.050 \\ 
 B&=&0.43 \\ 
 C&=&3.0 ...
... \\ 
 \sigma & = & 3.3\mathrm{~kcal}\cdot\textrm{mol}^{-1} 
 \end{array}\right.$ (4.13)

and the fit is shown in green in Fig. 4.14b. The values of $ \mu $ and $ \sigma $ are not far from the actual mean and standard deviation of the energies of the nearest neighbor model for DNA,

\begin{displaymath}\begin{array}{rcl}
 \mu & = & -1.60\mathrm{~kcal}\cdot\textrm{mol}^{-1} \\ 
 \sigma & = & 0.44\mathrm{~kcal}\cdot\textrm{mol}^{-1}~.
 \end{array}\end{displaymath} (4.14)

Because the toy model does not include the elastic effects of the ssDNA, we should not expect a close match between the fitted and the experimental values.

Figure 4.14: (a) Distribution of CUR sizes for the 2.2 kbp sequence. The red curve shows the experimentally measured distribution. The green curve shows the distribution predicted by the toy model, and the shaded area shows the standard deviation over different sequence realizations of the same length. The blue curve shows the distribution predicted by the mesoscopic model for DNA. (b) Distribution of CUR sizes for the 6.8 kbp sequence. Same color code as in panel a. (Inset of b) Threshold size $ n^{\rm thr}$ as a function of the number of open bps $ n$. The dashed line is a linear fit, $ n^{\rm thr}=9.1+0.01n$.
\includegraphics[width=\textwidth]{figs/chapter4/comparison.eps}

However, there are two clear differences between the experimental and the predicted CUR size distributions. First, the experimental size distributions are not smooth but have a rough shape. We already know that this is a finite size effect, described in Sec. 4.2.5. The distribution is smoother for the 6.8 kbp sequence because the sequence is longer, so there are more statistics and the resulting CUR size distribution is better averaged. Second, the toy model predicts a large fraction of CURs smaller than 10 bp that are not observed experimentally. There are two possible explanations for this: 1) the toy model predicts small CURs that do not exist experimentally, or 2) the method used to detect metastable states cannot discriminate CURs smaller than 10 bp.

To better understand this, we can compute the CUR size distributions (depicted in blue in Fig. 4.14) with the mesoscopic model described in Sec. 3.4.1. Again, we find that the model predicts many more small CURs than we observe experimentally. Assuming that the model is correct, we conclude that the small CURs do occur but that the method of analysis has a limiting resolution of about 10 bp. In other words, for every large CUR detected experimentally, the model predicts two (or more) small CURs. This limitation arises because the Bayesian analysis (Sec. 4.1) cannot distinguish between force fluctuations and transitions between metastable states separated by less than 10 bp. In principle, it should be possible to perform the pulling experiments at lower pulling rates and collect many more statistics. This would give a better signal-to-noise ratio and allow the smaller metastable states to be discriminated. However, these experiments are much more difficult to carry out because the DNA molecule spends more time stretched and breaks much more frequently before a whole pulling cycle can be completed.

Apart from these considerations, another issue affects the discrimination of nearby metastable states. A quick look at Fig. 4.7 shows that the histograms become smoother (i.e., the peaks are less sharp) as the molecule is progressively unzipped. The increased compliance of the molecular setup as ssDNA is released markedly decreases the resolution in discriminating intermediates (see the last paragraph of Sec. 4.1.2 for a detailed explanation of this effect). In particular, for the 6.8 kbp construct we find that along the first 1500 bp of the hairpin only 30% of the total number of CURs smaller than 10 bp are detected, whereas beyond that limit no CUR smaller than that size is detected. If the threshold size $ n^{\rm thr}$ is defined as the size of the CUR above which $ 50\%$ of the predicted CURs are experimentally detected, we find that $ n^{\rm thr}$ increases linearly with the number of open bps, establishing a limit of around 10 bp for the smallest CUR size that we can detect (Fig. 4.14b, inset).
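As a rough illustration of how the detection threshold grows along the hairpin, the linear fit quoted in the caption of Fig. 4.14, $ n^{\rm thr}=9.1+0.01n$, can be evaluated at a few positions. The function name and the chosen positions are illustrative assumptions and are not part of the original analysis.

```python
def threshold_size(n_open, intercept=9.1, slope=0.01):
    """Linear fit from the caption of Fig. 4.14: n_thr = 9.1 + 0.01 n (in bp)."""
    return intercept + slope * n_open

# Positions along the hairpin chosen only for illustration;
# 6800 approximates the full length of the 6.8 kbp construct.
positions = [0, 1500, 6800]
thresholds = [threshold_size(n) for n in positions]

for n, n_thr in zip(positions, thresholds):
    print(f"open bp = {n:5d}  ->  threshold CUR size ~ {n_thr:.1f} bp")
```

Under this fit the threshold starts near 9 bp at the beginning of the hairpin and rises steadily as more ssDNA is released.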

Now let us focus on the other side of the distribution (large CUR sizes). The three CUR size distributions in Fig. 4.14 are long-tailed, which indicates that large CURs occur with finite probability. Unfortunately, large CURs hinder access to their internal DNA sequence, limiting the possibility of unzipping one base-pair at a time, which would make it possible to sequence the DNA. Under what experimental conditions is it possible to break up large CURs into individual bps? Only by applying a local force at the opening fork (thereby avoiding the large compliance of the molecular setup) and by increasing the stiffness of the probe might it be possible to shrink CUR size distributions down to a single base-pair [117]. Figs. 4.15a and 4.15b show how the CUR size distributions shrink and the largest CUR size decreases as the stiffness increases. The stiffness should be around 50-100 pN$ \cdot $nm$ ^{-1}$ for all CUR sizes to collapse into a single bp. Remarkably enough, this number is close to the stiffness expected for an individual DNA nucleotide stretched at the unzipping force. Any probe more rigid than that will not do better.

As in the problem of atomic friction, we can define a parameter $ \eta$ (the ratio between the rigidities of substrate and cantilever) that controls the transition from stick-slip to continuous motion [127]. For DNA unzipping we have

$\displaystyle \eta=\frac{\vert\mu\vert}{k d^2}$ (4.15)

where $ \mu $ is the average free energy of formation of a single bp, $ k$ is the probe stiffness and $ d$ is the interphosphate distance. The value $ \eta=1$ marks the boundary below which ($ \eta<1$) all CURs have a size of one bp. In our experiments we have $ \eta \simeq 500$, and to reach the boundary $ \eta=1$ we would need $ k\sim$ 100 pN$ \cdot $nm$ ^{-1}$, consistent with what is shown in Figs. 4.15a and 4.15b. Interestingly enough, this boundary stiffness is very similar to the expected stiffness of one ssDNA base. The stiffness of one nucleotide is obtained from the derivative of the force vs. extension curve (Eq. 3.20) for the Freely Jointed Chain (FJC) model (see Appendix F), and is given by:

$\displaystyle k_s(f)=\left[ n\cdot d\left(-\frac{b}{k_BT}\textrm{cosech}^2\left(\frac{bf}{k_BT}\right)+\frac{k_BT}{bf^2}\right)\right]^{-1}$ (4.16)

where $ k_s(f)$ is the stiffness at force $ f$, $ n$ is the number of open base-pairs, $ d$ is the interphosphate distance, $ k_B$ is the Boltzmann constant, $ T$ is the temperature and $ b$ is the Kuhn length. Using $ b=1.2$ nm, $ d=0.59$ nm and $ k_BT=4.11$ pN$ \cdot $nm, and applying Eq. 4.16 to a single nucleotide ($ n=1$), we get a stiffness of $ k_s=113$ pN$ \cdot $nm$ ^{-1}$ at $ f=15$ pN and $ k_s=127$ pN$ \cdot $nm$ ^{-1}$ at $ f=16$ pN. It is remarkable that the elastic properties of ssDNA lie just at the boundary that allows one-bp discrimination. This suggests that the ssDNA has the right elastic properties for the DNA machinery to read the base-pairs one at a time, without having to expose hundreds of base-pairs to the solvent, thereby reducing the risk of damage.
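The numbers quoted for Eqs. 4.15 and 4.16 can be checked with a short numerical sketch. The function names are hypothetical, and the conversion factor between kcal$ \cdot $mol$ ^{-1}$ and pN$ \cdot $nm (4184 J$ \cdot $mol$ ^{-1}$ divided by Avogadro's number) is an assumption introduced here, since the text does not state which value was used.

```python
import math

# Assumed conversion: 1 kcal/mol = 4184 J/mol / 6.022e23 ~ 6.9477 pN*nm per molecule.
KCAL_PER_MOL_IN_PN_NM = 6.9477

def eta(mu_kcal_mol, k_pn_nm, d_nm):
    """Eq. 4.15: eta = |mu| / (k d^2), with mu converted to pN*nm."""
    return abs(mu_kcal_mol) * KCAL_PER_MOL_IN_PN_NM / (k_pn_nm * d_nm ** 2)

def fjc_stiffness(f_pn, n=1, d_nm=0.59, b_nm=1.2, kbt_pn_nm=4.11):
    """Eq. 4.16: stiffness (pN/nm) as the inverse of dx/df for the FJC model."""
    x = b_nm * f_pn / kbt_pn_nm
    # cosech^2(x) is written as 1 / sinh(x)^2.
    dx_df = n * d_nm * (-(b_nm / kbt_pn_nm) / math.sinh(x) ** 2
                        + kbt_pn_nm / (b_nm * f_pn ** 2))
    return 1.0 / dx_df

# Trap stiffness of 60 pN/um expressed as 0.060 pN/nm; mu = -1.6 kcal/mol.
eta_exp = eta(-1.6, 0.060, 0.59)
k_15 = fjc_stiffness(15.0)
k_16 = fjc_stiffness(16.0)

print(f"eta (experiment)     ~ {eta_exp:.0f}")
print(f"k_s at f = 15 pN     ~ {k_15:.1f} pN/nm")
print(f"k_s at f = 16 pN     ~ {k_16:.1f} pN/nm")
```

With these inputs the sketch gives $ \eta$ of order 500 and single-nucleotide stiffnesses close to the values quoted above; small differences in the last digit can come from rounding or from the exact value of $ k_BT$ used.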

Figure 4.15: (a) CUR size distributions in log-log scale for some values of $ k$ using the toy model. Data plotted with points show the CUR size distribution for the 6.8 kbp sequence. Data plotted with lines show the average CUR size distribution over $ 10^4$ realizations ($ k$=60 pN$ \cdot \mu $m$ ^{-1}$, $ d=0.59$ nm, $ \mu $=-1.6 kcal$ \cdot $mol$ ^{-1}$, $ \sigma $=0.5 kcal$ \cdot $mol$ ^{-1}$). (b) Fitting the average CUR size distributions in panel a to Eq. 4.10 gives the cutoff size $ n_c$, which decreases like $ n_c\simeq k^{-2/3}$. The blue curve shows $ n_c$ vs. $ k$. The red curve shows the maximum CUR size ($ n_{max}$) predicted by the toy model for the 6.8 kbp sequence. For $ k>100$ pN$ \cdot $nm$ ^{-1}$, both curves level off to CUR sizes of 1 bp.
\includegraphics[width=\textwidth]{figs/chapter4/limitingCURs.eps}

JM Huguet 2014-02-12