17904
Comment:

← Revision 11 as of 20210414 11:39:39 ⇥
17847

Deletions are marked like this.  Additions are marked like this. 
Line 1:  Line 1: 
#acl +All:read apw185:read,write,delete Known:read All: 
Contents
Iterated stockholder atoms (ISA)
Warning
These notes on the ISA were begun a long time ago (preCamCASP 5.7!) and while some of this material is still correct, some of it is outdated. Use this page with caution!
Theory
$$ \rho^{a}(r) = \rho(r) \times \frac{w^{a}(r)}{\sum_b w^{b}(r)}$$
and
$$ w^{a}(r) = \langle \rho^{a}(r) \rangle_{{\rm sph. avg}} $$
I will describe the theory of the CamCASP implementation later.
Convergence problems with the ISA
Some systems have proved difficult to converge with the current ISA algorithm in CamCASP 6.0.x (this is, as yet, a prerelease version) and prior versions of the code. It is not yet possible to say for sure when we may expect a convergence issue, but there are often clues that can help decide how to enable convergence.
Possible causes for failure to converge include:
Iteration limit exceeded: If too few iterations have been allowed, it is sometimes possible for a slowly convergent case to fail. If you believe that the ISA is converging, albeit slowly, then consider increasing the allowed number of iterations set by the MAXITERATIONS command.
NANs appear in the tables of atomic charges: There are cases for which the ISA initially appears to converge, but after a point NANs appear in the W0norm and Charge tables in the CamCASP output. This can happen when the tailweighting mechanism kicks in if $\epsilon_{\rm w}$ is too large. This parameter is set by the WEPS command and must be chosen to be smaller than twice the smallest exponent in the sbasis set used for the ISA expansions.
 Garbage in the ISA solution: If you see completely nonsensical charges or a complete break in symmetry, then look at the densityfitting that precedes the ISA. Check to see that the total density does integrate to the expected number of electrons. It is sometimes possible for the molecular orbitals to be inconsistent with the molecular geometry. If this happens, then the ISA (or any part of CamCASP) will result in garbage.
 Total charge is not converged in the ISA iterations.
Total charge is normally conserved to better than $0.01e$ with the ISAA algorithm. For a good ISA solution it will be conserved to better than $0.001e$. In principle, as the ISA basis set gets better at representing the ISAAIM densities, the total charge will be better conserved.
If errors in the total charge are of the order $0.1e$ or more, then you have a serious problem. One system where we see this is H$_2$S and we discuss this example below together with a procedure for solving the problem. In short, this seems to happen when the ISAAIMs are more complex than can be described by the ISA basis sets, or when the molecular density cannot be accurately represented by the ISA basis sets.
Errors in the total charge of the order $0.1e$ can also occur for strongly delocalised systems when the default NEIGHBOURHOOD definition does not result in a large enough neighbourhood. If you suspect this may be the case, then use the EDIT...END block (usually just after the MOLECULE block in the CamCASP command file) and alter the neighbourhood definition to result in a larger neighbourhood.
Case study in ISA convergence : H$_2$S
This seemingly simple molecule posed a problem for the ISA algorithm in CamCASP. This may be related to the unusual geometry of H$_2$S with the rather small bond angle of $92^{\circ}$. In H$_2$O this angle is slightly more than $104^{\circ}$. Anyway the problem is that when you use the default basis sets for an ISAA calculation in CamCASP 6.0.x the ISA cannot converge and charge violations are very large at around $0.8e$. The basis set used for this was:
 Main: augccpVTZ, Spherical GTOs
 Aux: augccpvTZ, Cartesian GTOs
 AtomAux: augccpVQZ, Spherical GTOs with ISAset2 for the sfunctions.
Initially I tried increasing the size of the AtomAux basis to see if additional flexibility would result in convergence, but it didn't. Then, on a hunch, I reduced the size of both the Aux and AtomAux basis sets to include only s and pfunctions and this worked very well. So why were the higher angular functions causing a problem?
The answer seems to be because the Aux basis used Cartesian GTOs, and if, as clearly was the case here, the additional flexibility of a Cartesian GTO set was needed to describe something crucial in the system that no AtomAux basis with Spherical GTOs could ever describe, no matter how large and diffuse. So I changed the Aux basis to use Spherical GTOs and this resulted in a rapidly convergent ISAA solution. But this also led to a loss in accuracy. However this was not a big problem as there is a simple way to recover from this loss. Here's the full procedure:
 Step (1): Get a good enough ISAA solution using a limited flexibility Aux basis:
 Aux: Spherical GTOs with ISAset2 for sfunctions (very important!)
 AtomAux: Spherical GTOs also with ISAset2. The basis used could be larger than the Aux basis.
 Step (2): Restart the ISA from the solution obtained in step (1); this time use the proper Aux basis:
 Aux: Cartesian GTOs. No restriction on the sfunction set. Any can be used.
 AtomAux: Spherical GTOs. sfunctions need to be the same as used in step (1), i.e., ISAset2.
Important Solution (A) for convergence failure and large charge violations:
Step (1) can be achieved with METHOD isaA. Edit the Cluster command file to make sure that basis sets are appropriate.
Step (2) can be achieved with METHOD isaArestart.
CamCASP 5.7 notes follow
Most of these notes still hold for later versions of the ISA implementation in CamCASP.
CamCASP commands
Begin Stockholder Molecule <NAME  default is first molecule> ISATYPE [DENSITY (default)  {TRANSITIONDENSITY  OV}] DF [=] [DOO  DOOC  DRHO (default)  DRHOC] WINIT [ ALLGTOS  ONEGTO [ALPHA0 [=] <alpha0>]  DF (default)  DRHOC ] ISAAlgorithm [A1  A2  B  DF+ISA [ZETA [=] <zeta_ISA>]] ISAAlphaMin [ 0.0 (default)] SOLVER [LU (default) [[with] ITERATIONS <iterations (default=0)>]] or SOLVER [GELSS [[with] CONDITION <condition number (default = 1e15)>]] CONVERGENCE CONVERGENCETYPE [ W  RHO  {CHARGE  Q} ] WDAMPING [=] <EtaDamp> EPSNORM [=] <epsnorm (default = 1e9)> EPSQ [=] <epsQ (default = 1e3)> MAXITERATIONS [=] <max iterations (default = 120)> WMIXFRACTION [=] <value (default = 0.0)> +++ [SKIPITERATIONS [=] <number (default = 0)>] WEPS [=] <value (default = 0.2)> [COUPLE (default)  DECOUPLE] +++ [SBLOCKONLY (default)  ALLBLOCKS] +++ [SWITCHONEPSNORM [=] <value  default is 1.0e5> TAILITERATIONS [=] <number of tail iterations (default = 0)> END WTAILS [NOFIX] ( default is to fix the tails ) R1MULTIPLIER [=] <value (default = 1.5)> R2MULTIPLIER [=] <value (default = 3.5)> FUNC [=] [+1  +2] (default = 1) [TEST] FITTYPE [=] [1  2  3 (default)] END Symmetry [ON  OFF (default)] End
A brief explanation of some of these commands:
DF: Sets the DF solution used.
WINIT: Set the starting point for the shape functions $w^a$. The solution we converge to is uneffected by the choice of starting point, but if there are convergence issues (for large basis sets), it is best to start from the DrhoC solution obtained with the Gammaconstraint (see example below).
ISAALGORITHM: Should be DF+ISA. The parameter ZETA controls the weight given to the ISA compared with the DF. Modify this as needed. Usually, with a flexible $s$basis (not the standard DF basis sets!) a value in the range $\zeta \in [0.1,0.9]$ works well. The larger $\zeta$ is the closer to the ISA solution we are, but charge violations get larger ($\mathcal{O}(10^{3})$).
SOLVER: Use GELSS with Condition = 1e10 or so.
CONVERGENCE..END: Use either the W or Q types. The former will check the norm of the shape functions and the latter will check the site charges. Thresholds are given by EPSNORM and EPSQ.
WTAILS..END: Tests and fixes the shapefunction tails  which can cause us headaches! I've noticed this module needs to be made more robust, but it generally works well. The FIT_TYPE = 3 imposes charge conservation. This is very important.
WEPS [=] $\epsilon_w$: This sets an additional tail weighting in the ISA functional. Applies to the ISA types DFISA and B. We must have $\epsilon_w < 2 \min{\alpha_k, k \in {\rm Aux basis}}$. The default is to use $\epsilon_w = 0.2$, apply it to the Sblock only and couple it with the tail fix after the tails have converged to 1.0e5 (no units).
A sensible set of commands is:
Begin Stockholder Molecule <NAME> DF = Drho WINIT = DrhoC ISAAlgorithm DF+ISA Zeta = 0.9 Solver GELSS Condition = 1e10 Convergence ConvergenceType W EPSNorm = 1.0e09 EPSQ = 1.0e4 MaxIter = 100 WEps = 0.2 ( appropriate for ISA/set1 and ISA/set2 ) END WTAILS Func = 1 R1Multiplier = 2.0 R2Multiplier = 3.0 FitType = 3 WTests END Symmetrize End
This requires that you have done a DF (with TYPE = DRHO) without constraints (to get Drho) and a DF (also TYPE = DRHO) with the Gamma constraint to get DrhoC to initiate the shape functions. It should be possible to initiate the shape functions with Drho too.
Important
Caution: at present the code assumes you have not used the lambda constraint in the DF.
Auxiliary basis sets
These must use spherical GTOs. The Cartesian GTOs (the CamCASP default) will cause the entire procedure to fail as the spherical average step assumes that spherically symmetric functions are all $s$ functions.
The default auxiliary basis sets are optimized for energies. They do not describe the density tails well. Instead we use a userdefined set of basis functions that have a more flexible set of $s$ functions.
With this more flexible set, there are sometimes small instabilities in the description of the hydrogen atoms. These can often be removed using //Limit MinSExp = 0.2// in the auxiliary basis description of the molecule. This limits the hydrogen basis sets to $s$ functions with exponent greater than 0.2. In extreme cases I have had to increase this to 0.4.
 To keep numerical noise low, always use GELSS to solve the linear equations.
Here is the current set of $s$ functions that I use:
1 s 256.0 1.0 1 s 128.0 1.0 1 s 64.0 1.0 1 s 32.0 1.0 1 s 16.0 1.0 1 s 8.0 1.0 1 s 4.0 1.0 1 s 2.0 1.0 1 s 1.0 1.0 1 s 0.5 1.0 1 s 0.25 1.0 1 s 0.125 1.0
All you need to do is
 copy the basis set you want to //$CAMCASP/basis/auxiliary/userdef/ //
edit it to remove all $s$ functions and include the above block.
 save
In the CamCASP input file set the path to the auxiliary basis sets as //#includecamcasp basis/auxiliary/userdef/<NAME> //
Mechanism for merging ISA and DF basis sets
We now have a mechanism for merging ISA and DF basis sets. This can be done in the CamCASP command file using:
O1 8.0 0.00000000 0.00000000 0.00000000 TYPE O1 Limit G #includecamcasp basis/auxiliary/ISA/set2/O Symmetry Exclude = S #includecamcasp basis/auxiliary/augccpVTZ/O 
Here the ISA/set2 basis is used for the $s$functions in oxygen atom O1 and the higher symmetry functions are taken from the augccpVTZ DF basis.
This can be done using the Cluster input file using commands like:
Files Properties Molecule H2O2 ! Functional PBE0 Kernel ALDX+CHF ( Suppress DALTON's CKS ) ! Basis aTZ Type DC Spherical AuxBasis aTZ Type DC Spherical ( ISA needs Spherical GTOs in the aux basis ) ISAAux set2 ... ... End
Here we have specified both the standard DF basis (augccpVTZ == aTZ) and the ISA basis (set2).
ISA basis sets
set1: less diffuse functions on the H and C atoms. Better behaved if WEPS = 0.0.
set2: All atoms contain diffuse functions. Use this only with WEPS = 0.2 (the exact value may change).
Examples
In general, these calculations proceed as follows:
 Perform a standard properties calculation.
 Modify the CamCASP input file as follows:
 Make the auxiliary basis SPHERICAL.
 Use a UserDefined auxiliary basis. These are described above.
 Use the following commands after the molecule description. Modify names as appropriate:
BEGIN GRID Molecule H2O Angular 200 Radial 40 END SET Lattice Charge 1.0 LoLim 2.0 HiLim 4.0 Random 2000 Seed 1 END SET DFINTEGRALS DFTYPEMONOMER NN END BEGIN DF Molecule H2O Type RHO Eta = 0.0 Lambda = 0.0 Gamma = 0.0 Print only normalization constraints END Begin Multipoles Molecule H2O DF Type RHO without constraints Rank 4 End BEGIN DF Molecule H2O Type RHO Eta = 0.0 Lambda = 0.0 Gamma = 1e4 DeltaZ = 0.9 Print only normalization constraints END Begin Multipoles Molecule H2O DF Type RHO with constraints Rank 4 End Begin Stockholder Molecule H2O DF = Drho WINIT = DrhoC ISAAlgorithm DF+ISA Zeta = 0.05 ISAAlphaMin 0.0 Solver GELSS Condition = 1e08 Convergence ConvergenceType Q EPSNorm = 1.0e10 EPSQ = 1.0e4 MaxIter = 160 WDamping = 0.0 WMixFraction = 0.0 SkipIterations = 20 End WTAILS Func = 1 R1Multiplier = 2.0 R2Multiplier = 3.0 FitType = 3 WTests END ! Verbose ! Debug ! Debug Extreme End Begin Multipoles Molecule H2O DF Type ISA Rank 4 End Finish
H2O
Here is a water CamCASP command file: H2OaTZISA.cks
And here is the output: H2OaTZISA_camcasp.out
The calculation will produce files //w*.dat// with the shapefunctions in them. These are often worth looking at. Plot columns 2:3 for the shape functions. Column 4 contains the function used to fix the tail. Column 5 contains the integrated charge from that point to infinity. The other columns contain various parametes I use for judging the tail fix procedure.
Here are the DFISA moments for this example:
! Multipole moments for H2O ! Based on DFtype : ISA O 0.00000000 0.00000000 0.00000000 Type O Rank 4 0.832451 0.202168 0.000000 0.000000 0.037844 0.000000 0.000000 0.453004 0.000000 0.113196 0.000000 0.000000 0.197884 0.000000 0.000000 0.000000 0.027601 0.000000 0.000000 0.084861 0.000000 0.000000 0.000000 0.008559 0.000000 H1 1.45365196 0.00000000 1.12168732 Type H Rank 4 0.415973 0.004289 0.037414 0.000000 0.014214 0.000821 0.000000 0.017303 0.000000 0.058745 0.070990 0.000000 0.042183 0.000000 0.036665 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 H2 1.45365196 0.00000000 1.12168732 Type H Rank 4 0.415973 0.004289 0.037414 0.000000 0.014214 0.000821 0.000000 0.017303 0.000000 0.058745 0.070990 0.000000 0.042183 0.000000 0.036665 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
And these are the DMA2 moments for the same molecule:
! H2O ! Basis: augccpVTZ O 0.0000000000 0.0000000000 0.0000000000 Rank 4 Type O 0.4005512259 0.3786892921 0.0000000000 0.0000000000 0.1330626405 0.0000000000 0.0000000000 1.5247365276 0.0000000000 1.2476387160 0.0000000000 0.0000000000 3.0147271572 0.0000000000 0.0000000000 0.0000000000 3.7837242942 0.0000000000 0.0000000000 5.2979111238 0.0000000000 0.0000000000 0.0000000000 3.0778615682 0.0000000000 H1 1.4536519600 0.0000000000 1.1216873200 Rank 1 Type H 0.2002756327 0.0438156156 0.0181547021 0.0000000000 H2 1.4536519600 0.0000000000 1.1216873200 Rank 1 Type H 0.2002756327 0.0438156156 0.0181547021 0.0000000000
It is interesting to see how they compare using the energy maps. The Orient file for energy maps is here.
Even more interesting is how these sets of multipoles behave as we truncate them. The DFISA moments exhibit a systematic loss in accuracy, but not the DMA2 moments. This is even more dramatic for pyridine.