= Fitting a 1-D PES using Python =

Data set: [[attachment:Ar2_MP2.txt]]

{{{
# Ar2 MP2 interaction energies
# Units: R : Angstrom
#        E : cm^{-1}
#
# -------- Using mid-bonds --------
#  R      aDZ-noCP  aTZ-noCP  aQZ-noCP  aDZ-CP    aTZ-CP    aQZ-CP
 3.000    849.203   683.451   617.179   693.597   646.098   620.624
 3.250    205.203   106.749    81.093   114.204    94.645    82.514
 3.500    -14.944   -75.276   -82.840   -66.813   -76.199   -81.462
 3.650    -63.415  -105.688  -109.144   -97.067  -104.068  -107.120
 3.750    -78.828  -110.712  -112.906  -102.235  -108.261  -110.366
 3.850    -85.975  -108.984  -110.509  -100.761  -106.027  -107.494
 4.000    -87.113   -99.871  -100.789   -92.066   -96.351   -97.263
 4.250    -76.570   -79.596   -79.416   -72.337   -75.132   -75.668
 4.500    -61.796   -60.937   -59.607   -54.213   -55.757   -56.110
 4.750    -48.339   -45.960   -44.074   -39.975   -40.728   -40.908
 5.000    -37.453   -34.430   -32.644   -29.365   -29.747   -29.781
 5.500    -22.077   -19.352   -18.364   -16.188   -16.262   -16.180
 6.000    -12.663   -11.106   -10.662    -9.304    -9.278    -9.233
}}}

[[attachment:fit_curve_v1.py]]

{{{
#!/usr/bin/env python3

import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize as opt

unit_bohr2angstrom = 0.529177249
unit_hartree2cm = 219474.631371

def func_LJ(x, a, c):
    # The Lennard-Jones model
    return a/x**12 - c/x**6

def func_BM(x, a, b, c):
    # The Born-Mayer model
    #return a*np.exp(-b*x) - c/x**6
    return np.exp(-b*(x - a)) - c/x**6

def print_corr(pcov):
    # Convert the covariance matrix returned by curve_fit into the
    # correlation matrix: Corr(i,j) = Cov(i,j)/sqrt(Cov(i,i)*Cov(j,j)).
    # The division must use the *original* diagonal, so take a copy of
    # the standard deviations first rather than normalizing in place.
    d = np.sqrt(np.diag(pcov))
    print("Correlation matrix: \n")
    print(pcov/np.outer(d, d))
    print("\n\n")

data = np.transpose(np.loadtxt('Ar2_MP2.txt'))
rvals = data[0]/unit_bohr2angstrom  # convert from Angstrom to Bohr
evals = data[1]/unit_hartree2cm     # convert from cm-1 to Hartree

# Standard deviations all set to 1.0 (uniform weights)
sigma = np.full(rvals.size, 1.0)
print("sigma array: \n")
print(sigma)

# LJ fit
# ======
# Starting values for parameters
x0 = np.array([1.0, 1.0])
# Do the fitting:
#   popt : optimized parameters
#   pcov : covariance matrix
popt, pcov = opt.curve_fit(func_LJ, rvals, evals, p0=x0, sigma=sigma)
print("\n\n")
print(" a = %12.4f   c = %12.4f \n" % (popt[0], popt[1]))
print_corr(pcov)
# Plot
plt.plot(rvals, evals, 'b-', label='data', marker='o')
plt.plot(rvals, func_LJ(rvals, *popt), 'g--', label='fit')
plt.show()

# BM fit
# ======
# Starting values for parameters
x0 = np.array([1.0, 1.0, 1.0])
# Do the fitting:
popt, pcov = opt.curve_fit(func_BM, rvals, evals, p0=x0, sigma=sigma)
print("\n\n")
print(" a = %12.4f   b = %12.4f   c = %12.4f \n" % (popt[0], popt[1], popt[2]))
print_corr(pcov)
# Plot
plt.plot(rvals, evals, 'b-', label='data', marker='o')
plt.plot(rvals, func_BM(rvals, *popt), 'g--', label='fit')
plt.show()

# Now try a weighting of exp(-(e-e0)/kT).  Note that in curve_fit the
# sigma array holds standard deviations, so points with *smaller*
# sigma carry *larger* weight in the fit.
e0 = 80.0/unit_hartree2cm   # both in the same units as evals
kT = 400.0/unit_hartree2cm
#sigma = np.exp(-np.abs(evals - e0)/kT)
sigma = np.exp(-np.abs(evals/e0))
print("sigma array: \n")
print(sigma)

# LJ fit (weighted)
# =================
# Starting values for parameters
x0 = np.array([1.0, 1.0])
# and re-fit
popt, pcov = opt.curve_fit(func_LJ, rvals, evals, p0=x0, sigma=sigma)
print("\n\n")
print(" a = %12.4f   c = %12.4f \n" % (popt[0], popt[1]))
print_corr(pcov)
# Plot
plt.plot(rvals, evals, 'b-', label='data', marker='o')
plt.plot(rvals, func_LJ(rvals, *popt), 'g--', label='fit')
plt.show()
}}}
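Once a fit has converged, the parameters can be turned into physically meaningful quantities. For the Lennard-Jones form $V(r) = a/r^{12} - c/r^6$, setting $dV/dr = 0$ gives the minimum at $r_{\rm min} = (2a/c)^{1/6}$, with well depth $\epsilon = c^2/(4a)$. Here is a minimal sketch: the helper **lj_minimum** is hypothetical (not part of fit_curve_v1.py), and it assumes the parameters are in the atomic units used by the script above.

{{{
#!/usr/bin/env python3
# Sketch: location and depth of the minimum of a fitted LJ curve.
# lj_minimum is a hypothetical helper, not part of fit_curve_v1.py;
# (a, c) are assumed to be in atomic units, as in the script above.

unit_bohr2angstrom = 0.529177249
unit_hartree2cm = 219474.631371

def lj_minimum(a, c):
    # For V(r) = a/r**12 - c/r**6:
    #   dV/dr = 0   =>  r_min = (2a/c)**(1/6)
    #   well depth  =  -V(r_min) = c**2/(4a)
    r_min = (2.0*a/c)**(1.0/6.0)
    eps = c**2/(4.0*a)
    return r_min, eps

# Usage after the LJ fit above, with popt = [a, c]:
# r_min, eps = lj_minimum(*popt)
# print("r_min = %8.3f Angstrom" % (r_min*unit_bohr2angstrom))
# print("depth = %8.2f cm-1" % (eps*unit_hartree2cm))
}}}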
== Covariance ==

The fitting function **curve_fit** returns not only the optimized parameters, but also the //covariance matrix//, which tells us how strongly the parameters are correlated. To find out more about this, [[http://www.statisticshowto.com/covariance/|have a look at this site]].

Here are a few examples of the //correlation matrix//, which is defined in terms of the covariance matrix as follows:
$$ {\rm Corr}(x,y) = {\rm Cov}(x,y)/(\sigma_x \sigma_y) $$
where the standard deviations are themselves part of the covariance matrix:
$$ \sigma_x = \sqrt{{\rm Cov}(x,x)}. $$

Here is the correlation matrix for a fit to the Lennard-Jones functional form:
$$ V_{\rm LJ}(r) = \frac{A}{r^{12}} - \frac{C}{r^6} $$
{{{
Correlation matrix:

[[ 1.         0.9299438]
 [ 0.9299438  1.       ]]
}}}
The diagonal terms are $1$ by definition; it is the off-diagonal terms that matter. For uncorrelated parameters these would be zero. Here the off-diagonal terms are close to $+1$, so we conclude that the two parameters in the LJ form are strongly correlated and will tend to change in the same way.

Conversely, here is the correlation matrix for the Born-Mayer form, defined as
$$ V_{\rm BM}(r) = \exp{(-\alpha(r-r_0))} - \frac{C}{r^6}, $$
with three free parameters: $\alpha$, $r_0$ and $C$.
{{{
Correlation matrix:

[[ 1.          0.9904921  -0.87938874]
 [ 0.9904921   1.         -0.93439446]
 [-0.87938874 -0.93439446  1.        ]]
}}}
Here we see a mixed picture: $\alpha$ is strongly correlated with $r_0$ (correlation $+0.99$), while the (negative) correlations of both with $C$ are somewhat weaker.
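As an aside, the same diagonal elements give a rough one-sigma uncertainty for each fitted parameter via $\sigma_i = \sqrt{{\rm Cov}(i,i)}$. A minimal sketch, assuming **popt** and **pcov** are the arrays returned by **curve_fit** (the helper **report_fit** is hypothetical):

{{{
import numpy as np

def report_fit(names, popt, pcov):
    # Print each fitted parameter with its one-sigma uncertainty,
    # sigma_i = sqrt(Cov(i,i)), taken from the covariance matrix.
    perr = np.sqrt(np.diag(pcov))
    for name, val, err in zip(names, popt, perr):
        print("%8s = %12.4g +/- %10.4g" % (name, val, err))

# e.g. after the Born-Mayer fit, whose parameters are (r0, alpha, C):
# report_fit(("r0", "alpha", "C"), popt, pcov)
}}}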