| Size: 30860 Comment:  | Size: 30851 Comment:  | 
| Deletions are marked like this. | Additions are marked like this. | 
| Line 17: | Line 17: | 
| - [[ajm:camcasp:potentials|Potential Development using CamCASP]] - [[ajm:camcasp:charge-transfer|Charge-Transfer]] - [[ajm:camcasp:polarization|Polarization & Polarization Models]] - [[ajm:camcasp:potentials:rory|Rory's notes on the water potential]] - [[ajm:orient:energy-calculations|Orient: Energy Calculations]] | 1. [[ajm/camcasp/potentials|Potential Development using CamCASP]] 1. [[ajm/camcasp/charge-transfer|Charge-Transfer]] 1. [[ajm/camcasp/polarization|Polarization & Polarization Models]] 1. [[ajm/camcasp/potentials/rory|Rory's notes on the water potential]] 1. [[ajm/orient/energy-calculations|Orient: Energy Calculations]] | 
| Line 26: | Line 26: | 
| - Optimize the molecule geometries. You will have to start from a fixed geometry before trying to include flexibility (current research). - Determine the vertical ionization energies of the molecules, and, if you are using NWChem or DALTON2016, also determine the AC-SHIFT for these systems. - Molecular Properties: The potential is created out of a short-range and long-range part. The short-range part is what we will do here. The long-range part is based on molecular properties calculated on the isolated molecules. These properties are: - Distributed multipoles - Distributed polarizabilities - Distributed dispersion coefficients - Interaction energy calculations for a set of pseudo-random dimer configurations. This is a time-consuming step. - First-order energy calculations on a dense set of pseudo-random dimer configurations. If the above set of configurations is already fairly dense, this step may be skipped. - Density overlap calculations on the dense set of dimer configurations used for the first-order energy calculations. - | 1. Optimize the molecule geometries. You will have to start from a fixed geometry before trying to include flexibility (current research). 1. Determine the vertical ionization energies of the molecules, and, if you are using NWChem or DALTON2016, also determine the AC-SHIFT for these systems. 1. Molecular Properties: The potential is created out of a short-range and long-range part. The short-range part is what we will do here. The long-range part is based on molecular properties calculated on the isolated molecules. These properties are: 1. Distributed multipoles 1. Distributed polarizabilities 1. Distributed dispersion coefficients 1. Interaction energy calculations for a set of pseudo-random dimer configurations. This is a time-consuming step. 1. First-order energy calculations on a dense set of pseudo-random dimer configurations. If the above set of configurations is already fairly dense, this step may be skipped. 1. Density overlap calculations on the dense set of dimer configurations used for the first-order energy calculations. 1. | 
| Line 106: | Line 106: | 
| I have already written a tutorial on [[ajm:camcasp:pols-cn|Molecular Properties]] so use that to perform this step. | I have already written a tutorial on [[ajm/camcasppols-cn|Molecular Properties]] so use that to perform this step. | 
| Line 134: | Line 134: | 
| * Distributed density overlap calculations for all dimer configurations. Water example: [[ajm:camcasp:examples:fits:water2-ovr-partial|ovr-H2O2-daTZ-tzvpp-GELSS-2048.dat]] * First-order energies: All three energies needed to calculate $E^{(1)}_{\rm sr}$. Water example: [[ajm:camcasp:examples:fits:water2-Eint-aTZ-partial|Eint-H2O2-aTZ-2048-camcasp.dat]] | * Distributed density overlap calculations for all dimer configurations. Water example: [[ajm/camcaspexamples:fits:water2-ovr-partial|ovr-H2O2-daTZ-tzvpp-GELSS-2048.dat]] * First-order energies: All three energies needed to calculate $E^{(1)}_{\rm sr}$. Water example: [[ajm/camcaspexamples:fits:water2-Eint-aTZ-partial|Eint-H2O2-aTZ-2048-camcasp.dat]] | 
| Line 137: | Line 137: | 
| * CamCASP input file for the overlap model. Water example: [[ajm:camcasp:examples:fits:water2-ovrmodel.cks|H2O2-ovrmodel.cks]] | * CamCASP input file for the overlap model. Water example: [[ajm/camcaspexamples:fits:water2-ovrmodel.cks|H2O2-ovrmodel.cks]] | 
| Line 225: | Line 225: | 
| - The basis set we had used to calculate the distributed overlaps was small. This was deliberate so as to avoid linear-dependencies. - The other reason is that the overlap model was meant to work for $E^{(1)}_{\rm exch}$ only. There is no reason to expect it to work for the penetration energy too. - Besides, we have not even damped the electrostatic multipole model. | 1. The basis set we had used to calculate the distributed overlaps was small. This was deliberate so as to avoid linear-dependencies. 1. The other reason is that the overlap model was meant to work for $E^{(1)}_{\rm exch}$ only. There is no reason to expect it to work for the penetration energy too. 1. Besides, we have not even damped the electrostatic multipole model. | 
| Line 323: | Line 323: | 
| - The fit to first-order energies using a dense coverage of data allowed us to fit reliable parameters, in particular, the atomic shape-functions should be reasonably well-defined. - We may not have enough data at second- and higher-order to allow all these parameters to relax. So we now need to be careful and use anchors in the relaxation process. Appropriate anchor weights will need to be found by experimentation, but as a rule, the shape-functions should probably be help tightly to the values obtained from the first-order fit. | 1. The fit to first-order energies using a dense coverage of data allowed us to fit reliable parameters, in particular, the atomic shape-functions should be reasonably well-defined. 1. We may not have enough data at second- and higher-order to allow all these parameters to relax. So we now need to be careful and use anchors in the relaxation process. Appropriate anchor weights will need to be found by experimentation, but as a rule, the shape-functions should probably be help tightly to the values obtained from the first-order fit. | 
| Line 330: | Line 330: | 
| Here $E^{(1)}_{\rm sr}$ is the first-order short-range energy defined above, ${\rm CT}^{(2-\infty)}$ is the infinite-order charge-transfer energy defined on the [[ajm:camcasp:charge-transfer|Charge-Transfer]] and [[ajm:camcasp:polarization|Polarization]] pages, and $\Delta_{\rm disp}$ is a residual error in the dispersion model that is defined as: | Here $E^{(1)}_{\rm sr}$ is the first-order short-range energy defined above, ${\rm CT}^{(2-\infty)}$ is the infinite-order charge-transfer energy defined on the [[ajm/camcaspcharge-transfer|Charge-Transfer]] and [[ajm/camcasppolarization|Polarization]] pages, and $\Delta_{\rm disp}$ is a residual error in the dispersion model that is defined as: | 
| Line 357: | Line 357: | 
| To do this, we first need to calculate ${\rm Pol}^{(2-\infty)}$ for all dimer configurations. Instructions for this can be found on the [[ajm:orient:energy-calculations|Orient: Energy Calculations]] and [[ajm:camcasp:polarization|Polarization & Polarization Models]] pages. Next, we insert these energies into the [[ajm:camcasp:examples:fits:water2-eint-atz-partial|Energy-file]]. In our water example, this file contains a column headed //Eind-MP//. The infinite-order polarization energies need to go under this column. (In older versions of this file, the column may be titled //E2ind-Asymp//.) Once the ${\rm Pol}^{(2-\infty)}$ energies are included in the energy file, we use **CamCASP** to create the **Orient** input files using the following modification of [[ajm:camcasp:examples:fits:water2-ovrmodel.cks|water dimer : CamCASP Overlap Model input file]] | To do this, we first need to calculate ${\rm Pol}^{(2-\infty)}$ for all dimer configurations. Instructions for this can be found on the [[ajm:orient:energy-calculations|Orient: Energy Calculations]] and [[ajm/camcasppolarization|Polarization & Polarization Models]] pages. Next, we insert these energies into the [[ajm/camcaspexamples:fits:water2-eint-atz-partial|Energy-file]]. In our water example, this file contains a column headed //Eind-MP//. The infinite-order polarization energies need to go under this column. (In older versions of this file, the column may be titled //E2ind-Asymp//.) Once the ${\rm Pol}^{(2-\infty)}$ energies are included in the energy file, we use **CamCASP** to create the **Orient** input files using the following modification of [[ajm/camcaspexamples:fits:water2-ovrmodel.cks|water dimer : CamCASP Overlap Model input file]] | 
| Line 403: | Line 403: | 
| - Check to see that the energies included are indeed the energies you require. Best to check a couple of cases by hand. - Include the electrostatic model (the DMA moments). - **Do not include a polarization model.** You have already subtracted out this energy. - Include your dispersion model in the **Pairs** section. Also don't forget the dispersion damping parameters. - Include the short-range potential obtained from the first-order fit. It is possible to merge the short-range and dispersion potentials into one file using the **Cluster** program. See the **CamCASP** manual for details. - Check the variables in the **Adjust** block. These should be the ones you want to relax. Comment out variables you wish to keep fixed. You could also use very tight anchors to fix them. - Include an **anchors** file. You can create one from the potential file using the **anchors.pl** script. This script has a help page. - Set Boltzmann weights. This line may need to be un-commented. - Something like //Weights Boltzmann $N_w \times |E_0|$ Above $E_0$// should do. - Run **Orient** and see what you get. - You will need to play with the anchor values and $N_w$ to get a decent fit. | 1. Check to see that the energies included are indeed the energies you require. Best to check a couple of cases by hand. 1. Include the electrostatic model (the DMA moments). 1. **Do not include a polarization model.** You have already subtracted out this energy. 1. Include your dispersion model in the **Pairs** section. Also don't forget the dispersion damping parameters. 1. Include the short-range potential obtained from the first-order fit. It is possible to merge the short-range and dispersion potentials into one file using the **Cluster** program. See the **CamCASP** manual for details. 1. Check the variables in the **Adjust** block. These should be the ones you want to relax. Comment out variables you wish to keep fixed. You could also use very tight anchors to fix them. 1. Include an **anchors** file. You can create one from the potential file using the **anchors.pl** script. This script has a help page. 1. Set Boltzmann weights. This line may need to be un-commented. 1. Something like //Weights Boltzmann $N_w \times |E_0|$ Above $E_0$// should do. 1. Run **Orient** and see what you get. 1. You will need to play with the anchor values and $N_w$ to get a decent fit. | 
Contents
Navigation:
Potential Development using CamCASP
Warning
This tutorial is being updated to reflect the new methodology we have published in our recent paper in JCTC (2016) on the pyridine dimer.
See also:
Or, potential development made easier!!!
There are a few things you need to do before getting here:
- Optimize the molecule geometries. You will have to start from a fixed geometry before trying to include flexibility (current research).
- Determine the vertical ionization energies of the molecules, and, if you are using NWChem or DALTON2016, also determine the AC-SHIFT for these systems.
- Molecular Properties: The potential is created out of a short-range and long-range part. The short-range part is what we will do here. The long-range part is based on molecular properties calculated on the isolated molecules. These properties are: - Distributed multipoles
- Distributed polarizabilities
- Distributed dispersion coefficients
 
- Interaction energy calculations for a set of pseudo-random dimer configurations. This is a time-consuming step.
- First-order energy calculations on a dense set of pseudo-random dimer configurations. If the above set of configurations is already fairly dense, this step may be skipped. - Density overlap calculations on the dense set of dimer configurations used for the first-order energy calculations.
 
The potential we are after has the form: $$ V{ab}(r_{ab}) = K \exp{[-\alpha{ab}(r_{ab}-\rho{ab}(\Omega{ab}))]}
- + \sum_{n=6}{12} f_{n}(\beta r_{ab}) \frac{C{ab}_{n}}{r_{ab}^n} + \{ \mathcal{Q}{a}_{lm}, \alpha{ab}_{lm,l'm'} \} 
$$ The first term is the short-ranged term that needs to be fitted. This is what we will be concerned with here. The second is the dispersion model. In principle, all parameters in this model are theoretically derived. The last term is an abbreviated way of including the terms that depend on multipole moments (the electrostatic term) and those that depend on multipole-moments and polarizabilities (the polarization term). The electrostatic term is obtained directly from the distributed multipole moments and no fitting is required. The polarization model requires a damping function (different from the damping used for the dispersion). This needs to be fitted.
Notice that almost all terms are site-site-dependent. The exceptions are the damping parameters $\beta$ in the dispersion/polarization models. Though, in principle, these too should be dependent on the sites involved.
Initial geometries: Geometry optimization
A few pointers:
- If you wish to use optimized molecular geometries, then perform the optimization using - PBE0. This is usually good. If your system has a small HOMO-LUMO gap then you will need to consider alternatives such as LC-PBE/LC-PBE0.
- either the cc-pVDZ or cc-pVTZ basis sets. Do not use augmentation unless you think it is essential.
 
- If you have an experimental geometry then use it.
- If you have a lot of hydrogens, consider lengthening the X-H bond lengths by 10% to take quantum effects into account. The exact elongation needs to be checked from papers by Szalewicz and Jankowski. Basically what this means is that rather than use the equilibrium geometry, $R_e$, you will be using $R_0$. Do this step only if you have understood the literature! 
Ionization energy and AC-SHIFT
Eventually we will use LC-PBE/LC-PBE0 for SAPT(DFT) calculations, but till then, we will continue to use an asymptotically corrected hybrid functional like PBE0.
Note
Click here for an introduction to the asymptotic correction, but bear in mind that the Fermi-Amaldi correction that I talk about on that page is not the one we now recommend. We now think that the GRLB correction is best. More on this some other time.
For the asymptotic correction you need to know the shift needed to correct the XC potential. This shift is defined as $$ \Delta = {\rm IP} + \epsilon_{\rm HOMO}. $$ So you need two quantities to define the shift:
- The vertical ionization energy, and
- the energy of the HOMO eigenvalue.
- NOTE: all quantities must be in atomic units!
Ionization potential
First see if there is an experimental IP available from the NIST Chemistry WebbookExternal Link. Convert it to atomic units.
If you need to compute the IP use the $\Delta$DFT approach: $$
- {\rm IP} = E[N-1] - E[N],
$$ that is, calculate the energies of the $N$ and $N-1$ electron systems and determine the IP by subtraction. Use PBE0 or LC-PBE0 to determine these energies. Basis sets could be Sadlej-pVTZ, cc-pVDZ or cc-pVTZ. These basis sets may be augmented, but augmentation is typically needed for anions.
NWChem and DALTON2016 also need the AC-SHIFT. This is just the $\Delta$ in the equation defined above. DALTON2006 uses only the IPs and determines the shift on-the-fly. This is better but is more time-consuming. To compute the shift you need to calculate the HOMO eigenvalue //without// an asymptotic correction.
Important
For notes on using NWChem see my tutorials at:
In particular, see the following:
Of course, the one on Using Comanche only applies if you have an account there.
It may be easiest to use the Psi4 code for this calculation. Psi4 has nice tutorials on all sorts of calculations that you can find on their wedsite.
Molecular Properties
Now we need to calculate the molecular properties for the interacting molecules in their reference geometries. These properties include:
- ISA multipole moments
- ISA-Pol non-local polarizabilities
- Localized ISA-Pol polarizabilities.
- Dispersion model for the interacting molecules.
The latter is formally not in this group, but for homo-dimers is obtained as part of the localization step. For hetero-dimers you need to do another step to calculate the dispersion model.
I have already written a tutorial on Molecular Properties so use that to perform this step.
Fitting the short-range parameters
$$ V{ab}_{\rm sr}(r_{ab}) = K \exp{[-\alpha{ab}(r_{ab}-\rho{ab}(\Omega{ab}))]} $$ It is not easy to fit the sum of exponentials - in fact it is ill-conditioned. The results are typically nonsense and most non-linear fitting packages will fail. Instead of attempting (and failing) at a direct fit of these parameters, we will use the Overlap Model as an intermediate step.
First, what exactly do we mean by the short-range energy? $$
- E_{\rm sr} = E{(1)}_{\rm sr} + E{(2-\infty)}_{\rm sr} 
$$ The first-order short-range energy is defined as $$
- E{(1)}_{\rm sr} = E{(1)}_{\rm exch} + ( E^{(1)}_{\rm elst} - E_{\rm elst}({\rm DMA}) ), 
$$ that is, it is the sum of the first-order exchange(-repulsion) energy and the first-order penetration energy which is the difference of the first-order electrostatic energy and electrostatic energy calculated using the distributed multipoles.
We will fit the two parts of $E_{\rm sr}$ in stages.
Overlap Model: Fitting $E^{(1)}_{\rm sr}$
Details of this model is described the Atom-Atom review paper. So we will fill this up later. $$
- E{(1)}_{\rm sr} = \sum_{ab} K{ab} \int \rho{a}(r) \rho{b}(r) dr 
$$ our job is to find the parameters $K^{ab}$.
Data files needed:
- Distributed density overlap calculations for all dimer configurations. Water example: ovr-H2O2-daTZ-tzvpp-GELSS-2048.dat 
- First-order energies: All three energies needed to calculate $E^{(1)}_{\rm sr}$. Water example: Eint-H2O2-aTZ-2048-camcasp.dat 
- Local axis file that will be used by ORIENT. Water example: H2O.axes 
- CamCASP input file for the overlap model. Water example: H2O2-ovrmodel.cks 
(perform the Sanity-Check below if needed)
The general procedure for fitting the overlap model parameters $K^{ab}$ is as follows:
- Define the energy correctly in the CamCASP input file.
- Choose the minimum and maximum energies to be included in the fit.
- Choose the type of weight and the parameter $E_0$. 
A typical input block (excluding the molecule definitions etc) looks like:
- Begin Overlap-model - Units kJ/mol  ! All energies in these units Weight Type 4 and E0 = 10 ! Fit the first-order energies: Energy-Options Check-Asymp  Abs-Max-Energy = 500.0 Energy = E1exch + E1pen  ! Definition of energy Emin = 0.0001 Emax = 500.0 Constraint = 10e-6  ! Don't change Condition = 10e-7   ! don't change Energy-file Eint-H2O2-aTZ-2048-camcasp.dat Overlap-file ovr-H2O2-daTZ-tzvpp-GELSS-2048.dat Orient files No ! See below Symmetrize ! See below (Debug) 
 
- Units kJ/mol  ! All energies in these units Weight Type 4 and E0 = 10 ! Fit the first-order energies: Energy-Options Check-Asymp  Abs-Max-Energy = 500.0 Energy = E1exch + E1pen  ! Definition of energy Emin = 0.0001 Emax = 500.0 Constraint = 10e-6  ! Don't change Condition = 10e-7   ! don't change 
The goal is to obtain a good fit by varying the parameters Emin, Emax, the type of weight (in the above example the 4 refers to the Gauss-Log weight described in the CamCASP User's Guide) and the energy E0. A good fit would be one with a weighted r.m.s. error of around 10-15 %.
If your list of dimers contains dimers with very short contacts you may face problems with the electrostatic energies calculated with the multipoles (essentially, the $1/R$ expansion diverges and causes pathalogical energies). To eliminate such terms use the Abs-max-Energy command with a suitable value for the absolute manimum energy. As a rule, choose a value not far from what you selected for Emax.
A word about the other commands:
- Orient files No: says do not create any of the files for the ORIENT code. We will set this to Yes once we are happy with the quality of the overlap model. 
- Symmetrize: Use this when you wish to symmetrize between atom types of the two molecules. For example, the hydrogen atoms in the two water molecules are equivalent as are the two oxygen atoms. So symmetrize between these. A reminder: TYPEs of atoms are set in the MOLECULE block in the CamCASP file. 
To calculate an overlap model run CamCASP:
- $ camcasp < H2O2-ovrmodel.cks 
The output is split into three sections:
- First the code calculates a total overlap model with just one parameter $K$. 
- Then the code calculates a distributed overlap model. These parameters may not be well-defined and you may find that some of them turn out to be negative (unlikely in the water example).
- Finally, the code re-calculates the distributed overlap model, this time using the value $K$ obtained in the first step as a pinning/anchor for the parameters $K^{ab}$. This makes the model well-defined in a Bayesian sense (i.e., we used a prior). 
It is the third section we use.
The code prints out the reference energies together with fitted energies. Plot these on a scatter plot. Once you are happy with the quality of the fit we can proceed to the next step.
Sanity-check
To make sure it all works, it is best to first fit the first-order exchange only. To do this use the file H2O2-ovrmodel-e1exch.cks. The reason for this check is that it is quite easy to make mistakes calculating the electrostatic energies with the multipole models. Also, the density-overlap model is know to work best for the exchange-repulsion only. So if this fit fails, there is probably an error in the setup.
0: Fit to exponentials
Once the ovrlap model is satisfactory we need to use the results of the model to fit each of the exponential terms in the short-range potential in turn. The way this works is as follows.
The overlap model allows us to write $$
- E{(1)}_{\rm sr} = \sum_{ab} K{ab} \int \rho{a}(r) \rho{b}(r) dr = \sum_{ab} E^{(1)}_{\rm sr}(ab), 
$$ that is, we have effectively split the short-range energy into contributions arising from all pairs of sites (more correctly, all pairs of site-types). What we do now is to fit each of these contributions to a single exponential: $$
- E{(1)}_{\rm sr}(ab) = G \exp[-\alpha{ab}(r_{ab} - \rho{ab}(\Omega{ab}))] 
$$ Here $G$ is an energy constant (an exponential has no units!) that sets the unit of energy. These fits are what we will do next. Before getting there, bear in mind that the overlap model is approximate!!! So we will need to correct for the errors later on.
For these fits we need to have ORIENT input files with the partitioned energies included. This can be done for us by CamCASP by repeating the overlap model calculation (the one we were happy with), but this time using Orient file YES. This tells CamCASP to write out the ORIENT input files.
Run the water example and get the following files:
- H2OA_H2OB_total.ornt H_H.ornt O_H.ornt O_O.ornt
The first includes the total (first order) energy and the rest include the energies for the sites listed in the file names. We will now use the last three.
Local axis systems: We now have to decide on a set of local axes which ORIENT will use to define the axis system for each pair of sites. This is done in the file H2O.axes which contains the lines:
- Axes - O z between H1 and H2 x from H1 to H2 H1 z from O to H1 x from H1 to H2 H2 z from O to H2 x from H2 to H1
 
You should edit the ORIENT command files to include this local axis file. But a simple trick works. The default name of the local axis file is local_axes. So simply make a link as follows:
- $ ln -s H2O.axes local_axes
This will allow ORIENT to find the local axis files.
We can now run ORIENT. This is usually done like so:
- $ orient < H_H.ornt > H_H.ornt.out 
but we have a script that will do this for you and extract the result in an appropriate form:
will search for all files of the form *.ornt and run them through ORIENT and extract the potential and places it in a file called potential.raw. Convert this potential into a form ORIENT can use using the command:
- $ raw2ORIENT.pl potential.raw > H2O2-model0-0-1.pot 
<note>The potential may contain releated site-pairs: H..O is the same as O..H. You may want to edit it to keep only unique pairs. There is a way to get the CLUSTER program to do this. See the CamCASP User's Guide for details.}}}
Relaxing the 0-fit
As mentioned above, the overlap model is approximate. This is the case for two reasons:
- The basis set we had used to calculate the distributed overlaps was small. This was deliberate so as to avoid linear-dependencies.
- The other reason is that the overlap model was meant to work for $E^{(1)}_{\rm exch}$ only. There is no reason to expect it to work for the penetration energy too. 
- Besides, we have not even damped the electrostatic multipole model.
So we now relax the parameters we have obtained from the previous step. We do this using anchors that pin the parameter values to guesses from an earlier step. <note> This is a technique we will use quite often and probably needs to be described on a separate page!}}}
Files needed:
- ORIENT input file containing total energies (as defined in the CamCASP overlap calculation, $E^{(1)}_{\rm sr}$ in our case). 
- The potential from the previous step: H2O2-model0-0-1.pot 
- An anchors file generated from this potential (done below).
Creating the anchors file
This is done using the anchors.pl script:
- $ anchors.pl  Usage: anchors.pl [-f] <potential file> [-alpha <alpha constraint>] - [-iso <rho00 constraint>] [-aniso <anisotropy constraint>] [-C6 <constraint for C6> | -noC6 ] 
 
The potential consists of the $\alpha$ parameter (this parameter is always isotropic in our potentials so there is only one such), the $\rho(\Omega)$ parameters that can be anisotropic and, optionally, a $C_6$ parameter. More on the latter later. The anchors.pl script allows us to define a set of anchor values for each catagory of parameters:
- -alpha refers to the parameter $\alpha$ 
- -iso refers to the isotropic part of $\rho(\Omega)$ 
- -aniso refers to the anisotropic part of $\rho(\Omega)$ 
- -C6 or -noC6 refers to the $C_6$ parameters. 
For example, we can define anchors for our potential as follows:
- $ anchors.pl -noC6 H2O2-model0-0-1.pot 
This uses the default values for the anchors. The -noC6 is needed as the potential doesn't contain any $C_6$ coefficients. The output is:
- ! Anchors for H2O2-model0-0-1.pot ! C_alpha = 0.050000 ! C_iso = 0.010000 ! C_aniso = 0.001000 - aHH 1.765547 0.050000
 - rHH_0      3.671346      0.010000  rHH_1     -0.316753      0.001000  - aOH 1.922795 0.050000
 - aOO 2.154523 0.050000
 
 
We can set specific values for the various types of parameters like so:
- $ anchors.pl $ anchors.pl -noC6 -alpha 1.0 -iso 0.1 -aniso 10.0 H2O2-model0-0-1.pot ! Anchors for H2O2-model0-0-1.pot ! C_alpha = 1.000000 ! C_iso = 0.100000 ! C_aniso = 10.000000 - aHH 1.765547 1.000000
 - rHH_0      3.671346      0.100000  rHH_1     -0.316753     10.000000  - aOH 1.922795 1.000000
 - aOO 2.154523 1.000000
 
 
With these choices the anisotropic parameters in $\rho(\Omega)$ are effectively pinned to the values they had in the potential, i.e., they won't vary much. The $\alpha$ parameter will be allowed to vary a little, and the isotropic part of $\rho$ will have the potential to vary the most. Imagine the anchors to be the spring-constants pining the parameters to their starting values; the larger the anchor (spring constant) the less they will stray.
These anchors are needed when we have insufficient data to determine all parameters or when relaxing all parameters is not possible in a heavily ill-conditioned problem. This is the case here.
Your job is to figure out what is a good set of anchors to fit the data. This is not a well-defined problem, at least, not as yet. As a rule, use the tightest anchors possible, relaxing the anisotropy the least. Anchors should be in the range $10.0$ (for the tightest pinning) to $0.001$ for the weakest. Use the latter on the $\alpha$ and isotropc part of $\rho$.
Fitting all parameters using ORIENT
We need to modify the ORIENT command file created for us by CamCASP. The one we want contains the total energies: H2OA_H2OB_total.ornt for the water example. We are going to edit this so copy it to, say, H2OA_H2OB_total_Boltzmann.ornt. The Boltzmann refers to the Boltzmann weighting we will use.
Steps to edit this file:
- Edit the local_axes file names as necessary. 
- The FIT block consists of geometries, the energy and weights. We will not need the weights so delete the last column and the Weight header in this block. Use vim to do this as it's quite easy to manipulate columns. 
- Change the Pairs block to look like this: Pairs - ! Select Tang--Toennies damping functions: ! Dispersion damping Factor <BETA-D> ! Induction damping Factor <BETA-I> #include H2O2-model0-0-1.pot 
 
- ! Select Tang--Toennies damping functions: 
- Notice that we have used the #include command to include the potential in the ORIENT file. 
- Next we come to the Adjust block. This contains the parameters we want ORIENT to adjust, i.e., fit to the data. 
- This block contains a Variables section that should contains a list of variables. This list should be like the potential file H2O2-model0-0-1.pot. Compare these to make sure the list has no extra parameters. There can be fewer variables than in the potential file (you may want to fit some parameters only), but there can't be more!!! 
- Also in this block are the Anchors. We have saved the anchors to a file H2O2-model0-0-1-a.anchors (named using the potential name and a modifier -a) and included it into the ORIENT file using the #include command. 
- Finally we come to the weights: Weights Boltzmann 20.0 Above 0.0. This tells ORIENT to use Boltzmann weights. These are of the form: 
$$
- w(e) = \exp(-e/e_0)
$$
- Here $e$ is the energy and $e_0$ is a reference energy which you will need to vary. Values of $e_0$ will typically be related to the well-depth of the potential. Since the water dimer is bound by $20$ kJ/mol (approximately), a sensible value for $e_0$ could be a few times this, say from $40$ to $100$ kJ/mol. You will need to vary $e_0$ till you get a sensible fit. 
- As a rule of thumb: get a reasonably good set of anchors (for a fixed $e_0$) and then vary $e_0$. Iterate a few times till the fit looks good enough. 
- Finally, the Boltzmann line contains Above 0.0. This is used when we have negative energies. See the ORIENT manual for details. 
Imposing shape-function constraints
Fitting all short-range energies
At this stage we should have a reasonable set of parameters that have been obtained using $E^{(1)}_{\rm sr}$ calculated on a dense (pseudo-random) set of dimers. We now need to relax these parameters to include all short-range energies. Since calculating higher-order energies is computationally expensive, we will typically not have a very dense set of configurations for $E^{(1-\infty)}_{\rm sr}$. This makes parameter determination difficult or even impossible, which is why we perform the fit in stages:
- The fit to first-order energies using a dense coverage of data allowed us to fit reliable parameters, in particular, the atomic shape-functions should be reasonably well-defined.
- We may not have enough data at second- and higher-order to allow all these parameters to relax. So we now need to be careful and use anchors in the relaxation process. Appropriate anchor weights will need to be found by experimentation, but as a rule, the shape-functions should probably be help tightly to the values obtained from the first-order fit.
Definition of $E^{(1-\infty)}_{\rm sr}$
$$ E{(1-\infty)}_{\rm sr} = E{(1)}_{\rm sr} + {\rm CT}^{(2-\infty)} + \Delta_{\rm disp} $$ Here $E^{(1)}_{\rm sr}$ is the first-order short-range energy defined above, ${\rm CT}^{(2-\infty)}$ is the infinite-order charge-transfer energy defined on the Charge-Transfer and Polarization pages, and $\Delta_{\rm disp}$ is a residual error in the dispersion model that is defined as: $$
- \Delta_{\rm disp} = E{(2)}_{\rm disp} - E{(2)}_{\rm disp}(C_{n}). 
$$ We expect this to be mainly a short-range residual though there isn't a complelling reason to believe that it will always be the case. Still, we wil try to absorb this typically small residual in the short-range fit.
Notice that $$
- E{(1-\infty)}_{\rm int} = E{(1-\infty)}_{\rm sr} + E{(1)}_{\rm elst}({\rm DMA}) + E{(2)}_{\rm disp}(C_{n}) + {\rm Pol}^{(2-\infty)}, 
$$ that is, if we add the multipole-expanded electrostatic, polarization and dispersion energies to $E^{(1-\infty)}_{\rm sr}$, we get the total infinite-order interaction energy.
Data files required for fitting $E^{(1-\infty)}_{\rm sr}$
In principle, all we really need to do is define our electrostatic model (as for the first-order short-range fit), dispersion model and polarization model. As with the first-order fit, we may expect that **Orient** should then subtract the polarization, dispersion and electrostatic energies calculated using the multipole models and the remainder, $E^{(1-\infty)}_{\rm sr}$, would then be exactly the energy we'd want to fit to. Alas, this is not the case. For reasons I do not yet understand, while **Orient** will subtract off the multipole-expanded electrostatic and dispersion energies, it will not do the same for the multipole-expanded polarization energy. So we need to do this by hand and give **Orient**: $$
- E{(1-\infty)}_{\rm int-nopol} = E{(1-\infty)}_{\rm int} - {\rm Pol}^{(2-\infty)} 
$$
Note
We could also play it safe and subtract all multipole-expanded energies from the total interaction energy. If this were done the **Orient** command file would not contain any multipole model. While this would make it simpler, it can also be a little cumbersome as we'd need to modify the energy file each time we changed the multipole models. If we took this approach, **CamCASP** will need to subtract all multipole-expanded energies from the total interaction energy. That is, in the example below we subtract not only Eind(MP), but also Eelst(MP) and Edisp(MP). And the **Orient** input file should not include any of the multipole models.
Of course, we do not do this by hand, but use **CamCASP** to calculate this for us. But to make this possible, we first need to give **CamCASP** the infinite-order polarization energy ${\rm Pol}^{(2-\infty)}$.
To do this, we first need to calculate ${\rm Pol}^{(2-\infty)}$ for all dimer configurations. Instructions for this can be found on the Orient: Energy Calculations and Polarization & Polarization Models pages.
Next, we insert these energies into the Energy-file. In our water example, this file contains a column headed //Eind-MP//. The infinite-order polarization energies need to go under this column. (In older versions of this file, the column may be titled //E2ind-Asymp//.)
Once the ${\rm Pol}^{(2-\infty)}$ energies are included in the energy file, we use **CamCASP** to create the **Orient** input files using the following modification of water dimer : CamCASP Overlap Model input file
Begin Overlap-model Units kJ/mol ! == Only-Total-Orient No-Weights ! == Weight None ! We are fitting to total energy including Delta: ! BUT we must exclude the induction energy as ORIENT requires it: Energy = E1exch + E1elst + E2exch + E2ind - E2ind(as) + E2disp + Delta Emin = -100.0 Emax = 150.0 ! Energy-file Eint-H2O2-aTZ-2048-camcasp.dat Overlap-file ovr-H2O2-daTZ-tzvpp-GELSS-2048.dat ! Symmetrize (Debug) End
Notice that the only multipole-expanded energies included in the energy definition are those from the induction/polarization energy. By contrast, when fitting the overlap model to first-order short-range energies we had subtracted out the multipole-expanded electrostatic energy. Both methods are permissable.
Also, we no longer need to calculate the overlap model. All we need **CamCASP** to do is create the correct **Orient** input file for us. In a way, we are using a fairly complicated code (**CamCASP** is over 130,000 lines long) to perform a rather simple parsing task! But that's OK as this maintains a consistency in our procedure. We also benefit from the error checking done by **CamCASP**.
We do not want any weights in the **Orient** file as we will use the Boltzmann weighting scheme.
Finally, our energy range has been expanded to include all negative energies too. Adjust it as you see fit.
Note
As a rule of thumb, if $E_0$ is the most negative interaction energy in the data set, use a range of $E_0$ to $5 |E_0|$.
Important
Errors are easily possible in this procedure. Besides the usual errors associated with the wrong axis definitions or wrong types/sites, here we have the additionaly possibility that the multipole-expanded polarization energies in the energy file may not be consistent with the polarization model defined in the Orient file! It is best to label all files and directories with the polarization model used. So when (and you will!) change the polarization model, fewer errors result.
Once **CamCASP** has created the **Orient** input file do the following (for help on creating the **Orient** input file see Orient: Energy Calculations):
- Check to see that the energies included are indeed the energies you require. Best to check a couple of cases by hand.
- Include the electrostatic model (the DMA moments).
- **Do not include a polarization model.** You have already subtracted out this energy.
- Include your dispersion model in the **Pairs** section. Also don't forget the dispersion damping parameters.
- Include the short-range potential obtained from the first-order fit. It is possible to merge the short-range and dispersion potentials into one file using the **Cluster** program. See the **CamCASP** manual for details.
- Check the variables in the **Adjust** block. These should be the ones you want to relax. Comment out variables you wish to keep fixed. You could also use very tight anchors to fix them.
- Include an **anchors** file. You can create one from the potential file using the **anchors.pl** script. This script has a help page.
- Set Boltzmann weights. This line may need to be un-commented.
- Something like //Weights Boltzmann $N_w \times |E_0|$ Above $E_0$// should do. 
- Run **Orient** and see what you get.
- You will need to play with the anchor values and $N_w$ to get a decent fit. 
