<<TableOfContents()>>

= Iterated stockholder atoms (ISA) =

{{{#!wiki  warning
Warning

These notes on the ISA were begun a long time ago (pre-CamCASP 5.7!) and while some of this material is still correct, some of it is outdated. Use this page with caution!
}}}

== Theory ==

$$  \rho^{a}(r) = \rho(r) \times \frac{w^{a}(r)}{\sum_b w^{b}(r)}$$

and

$$  w^{a}(r) = \langle \rho^{a}(r) \rangle_{{\rm sph. avg}} $$

I will describe the theory of the CamCASP implementation later.

== Convergence problems with the ISA ==
Some systems have proved difficult to converge with the current ISA algorithm in CamCASP 6.0.x (this is, as yet, a pre-release version) and prior versions of the code. It is not yet possible to say for sure when we may expect a convergence issue, but there are often clues that can help decide how to enable convergence.

Possible causes for failure to converge include:
  * Iteration limit exceeded: If too few iterations have been allowed, it is sometimes possible for a slowly convergent case to fail. If you believe that the ISA is converging, albeit slowly, then consider increasing the allowed number of iterations set by the '''MAX-ITERATIONS''' command.
  * NANs appear in the tables of atomic charges: There are cases for which the ISA initially appears to converge, but after a point NANs appear in the '''W0-norm''' and '''Charge''' tables in the CamCASP output. This can happen when the tail-weighting mechanism kicks in if $\epsilon_{\rm w}$ is too large. This parameter is set by the '''W-EPS''' command and must be chosen to be smaller than twice the smallest exponent in the s-basis set used for the ISA expansions. 
  * Garbage in the ISA solution: If you see completely nonsensical charges or a complete break in symmetry, then look at the density-fitting that precedes the ISA. Check to see that the total density does integrate to the expected number of electrons. It is sometimes possible for the molecular orbitals to be inconsistent with the molecular geometry. If this happens, then the ISA (or any part of CamCASP) will result in garbage.
  * Total charge is not converged in the ISA iterations. 
    * Total charge is normally conserved to better than $0.01e$ with the ISA-A algorithm. For a good ISA solution it will be conserved to better than $0.001e$. In principle, as the ISA basis set gets better at representing the ISA-AIM densities, the total charge will be better conserved. 
    * If errors in the total charge are of the order $0.1e$ or more, then you have a serious problem. One system where we see this is H$_2$S and we discuss this example below together with a procedure for solving the problem. In short, this seems to happen when the ISA-AIMs are more complex than can be described by the ISA basis sets, or when the molecular density cannot be accurately represented by the ISA basis sets.
    * Errors  in the total charge of the order $0.1e$ can also occur for strongly delocalised systems when the default '''NEIGHBOURHOOD''' definition does not result in a large enough neighbourhood. If you suspect this may be the case, then use the '''EDIT...END''' block (usually just after the '''MOLECULE''' block in the CamCASP command file) and alter the neighbourhood definition to result in a larger neighbourhood.

=== Case study in ISA convergence : H$_2$S ===
This seemingly simple molecule posed a problem for the ISA algorithm in CamCASP. This may be related to the unusual geometry of H$_2$S with the rather small bond angle of $92^{\circ}$. In H$_2$O this angle is slightly more than $104^{\circ}$. Anyway the problem is that when you use the default basis sets for an ISA-A calculation in CamCASP 6.0.x the ISA cannot converge and charge violations are very large at around $0.8e$. The basis set used for this was:
  * Main: aug-cc-pVTZ, Spherical GTOs
  * Aux: aug-cc-pvTZ, Cartesian GTOs
  * Atom-Aux: aug-cc-pVQZ, Spherical GTOs with ISA-set2 for the s-functions.

Initially I tried increasing the size of the Atom-Aux basis to see if additional flexibility would result in convergence, but it didn't. Then, on a hunch, I reduced the size of both the Aux and Atom-Aux basis sets to include only s- and p-functions and this worked very well. So why were the higher angular functions causing a problem? 

The answer seems to be because the Aux basis used Cartesian GTOs, and if, as clearly was the case here, the additional flexibility of a Cartesian GTO set was needed to describe something crucial in the system that no Atom-Aux basis with Spherical GTOs could ever describe, no matter how large and diffuse. So I changed the Aux basis to use Spherical GTOs and this resulted in a rapidly convergent ISA-A solution. But this also led to a loss in accuracy. However this was not a big problem as there is a simple way to recover from this loss. Here's the full procedure:

  * Step (1): Get a good enough ISA-A solution using a limited flexibility Aux basis:
    * Aux: Spherical GTOs with ISA-set2 for s-functions (very important!)
    * Atom-Aux: Spherical GTOs also with ISA-set2. The basis used could be larger than the Aux basis.
  * Step (2): Re-start the ISA from the solution obtained in step (1); this time use the proper Aux basis:
    * Aux: Cartesian GTOs. No restriction on the s-function set. Any can be used.
    * Atom-Aux: Spherical GTOs. s-functions need to be the same as used in step (1), i.e., ISA-set2.

{{{#!wiki  important
Important
Solution (A) for convergence failure and large charge violations:
  * Step (1) can be achieved with '''METHOD isa-A'''. Edit the Cluster command file to make sure that basis sets are appropriate.
  * Step (2) can be achieved with '''METHOD isa-A-restart'''. 
}}}

= CamCASP 5.7 notes follow =
Most of these notes still hold for later versions of the ISA implementation in CamCASP.

== CamCASP commands ==
{{{
Begin Stockholder
  Molecule <NAME | default is first molecule>
  ISA-TYPE [DENSITY (default) | {TRANSITION-DENSITY | OV}]
  DF [=] [DOO | DOO-C | DRHO (default) | DRHO-C]
  W-INIT [ ALL-GTOS | ONE-GTO [ALPHA0 [=] <alpha0>] | DF (default) | DRHO-C ] 
  ISA-Algorithm [A1 | A2 | B | DF+ISA [ZETA [=] <zeta_ISA>]]
  ISA-Alpha-Min [ 0.0 (default)]
  SOLVER    [LU (default) [[with] ITERATIONS <iterations (default=0)>]]
    or
  SOLVER    [GELSS [[with] CONDITION <condition number (default = 1e-15)>]]
  CONVERGENCE
    CONVERGENCE-TYPE [ W | RHO | {CHARGE | Q} ]
    W-DAMPING [=] <EtaDamp>
    EPS-NORM [=] <eps-norm (default = 1e-9)>
    EPS-Q    [=] <eps-Q (default = 1e-3)>
    MAX-ITERATIONS [=] <max iterations (default = 120)>
    W-MIX-FRACTION [=] <value (default = 0.0)> +++
        [SKIP-ITERATIONS [=] <number (default = 0)>]
    W-EPS [=] <value (default = 0.2)> [COUPLE (default) | DECOUPLE] +++
       [S-BLOCK-ONLY (default) | ALL-BLOCKS] +++
       [SWITCH-ON-EPS-NORM [=] <value | default is 1.0e-5>
    TAIL-ITERATIONS [=] <number of tail iterations (default = 0)>
  END
  W-TAILS
    [NO-FIX] ( default is to fix the tails )
    R1-MULTIPLIER [=] <value (default = 1.5)>
    R2-MULTIPLIER [=] <value (default = 3.5)>
    FUNC [=] [+-1 | +-2] (default = 1)
    [TEST]
    FIT-TYPE [=] [1 | 2 | 3 (default)]
  END
  Symmetry [ON | OFF (default)]
End
}}}
A brief explanation of some of these commands:
  * '''DF''': Sets the DF solution used. 
  * '''W-INIT''': Set the starting point for the shape functions $w^a$. The solution we converge to is uneffected by the choice of starting point, but if there are convergence issues (for large basis sets), it is best to start from the '''Drho-C''' solution obtained with the Gamma-constraint (see example below).
  * '''ISA-ALGORITHM''': Should be '''DF+ISA'''. The parameter '''ZETA''' controls the weight given to the ISA compared with the DF. Modify this as needed. Usually, with a flexible $s$-basis (not the standard DF basis sets!) a value in the range $\zeta \in [0.1,0.9]$ works well. The larger $\zeta$ is the closer to the ISA solution we are, but charge violations get larger ($\mathcal{O}(10^{-3})$).
  * '''SOLVER''': Use '''GELSS''' with '''Condition = 1e-10''' or so.
  * '''CONVERGENCE..END''': Use either the '''W''' or '''Q''' types. The former will check the norm of the shape functions and the latter will check the site charges. Thresholds are given by '''EPS-NORM''' and '''EPS-Q'''.
  * '''W-TAILS..END''': Tests and fixes the shape-function tails - which can cause us headaches! I've noticed this module needs to be made more robust, but it generally works well. The '''FIT_TYPE = 3''' imposes charge conservation. This is very important.
  * '''W-EPS [=]''' $\epsilon_w$: This sets an additional tail weighting in the ISA functional. Applies to the ISA types DF-ISA and B. We must have $\epsilon_w < 2 \min{\alpha_k, k \in {\rm Aux basis}}$. The default is to use $\epsilon_w = 0.2$, apply it to the S-block only and couple it with the tail fix after the tails have converged to 1.0e-5 (no units).


A sensible set of commands is:
{{{
Begin Stockholder
  Molecule <NAME>
  DF     = Drho
  W-INIT = Drho-C
  ISA-Algorithm DF+ISA  Zeta = 0.9
  Solver GELSS Condition = 1e-10
  Convergence
    Convergence-Type W
    EPS-Norm = 1.0e-09
    EPS-Q    = 1.0e-4
    Max-Iter = 100
    W-Eps = 0.2 ( appropriate for ISA/set1 and ISA/set2 )
  END
  W-TAILS
    Func = 1
    R1-Multiplier = 2.0
    R2-Multiplier = 3.0
    Fit-Type = 3
    W-Tests
  END
  Symmetrize
End
}}}
This requires that you have done a DF (with TYPE = DRHO) without constraints (to get Drho) and a DF (also TYPE = DRHO) with the Gamma constraint to get Drho-C to initiate the shape functions. It should be possible to initiate the shape functions with Drho too.

{{{#!wiki  important
Important

Caution: at present the code assumes you have not used the lambda constraint in the DF. 
}}}
=== Auxiliary basis sets ===
  * These must use spherical GTOs. The Cartesian GTOs (the CamCASP default) will cause the entire procedure to fail as the spherical average step assumes that spherically symmetric functions are all $s$ functions.
  * The default auxiliary basis sets are optimized for energies. They do not describe the density tails well. Instead we use a user-defined set of basis functions that have a more flexible set of $s$ functions.
  * With this more flexible set, there are sometimes small instabilities in the description of the hydrogen atoms. These can often be removed using //Limit Min-S-Exp = 0.2// in the auxiliary basis description of the molecule. This limits the hydrogen basis sets to $s$ functions with exponent greater than 0.2. In extreme cases I have had to increase this to 0.4. 
  * To keep numerical noise low, always use GELSS to solve the linear equations.

Here is the current set of $s$ functions that I use:
{{{
   1  s
 256.0       1.0
   1  s
 128.0       1.0
   1  s
  64.0       1.0
   1  s
  32.0                  1.0
   1  s
  16.0                  1.0
   1  s
   8.0                  1.0
   1  s
   4.0                  1.0
   1  s
   2.0                  1.0
   1  s
   1.0                  1.0
   1  s
   0.5                  1.0
   1  s
   0.25                 1.0
   1  s
   0.125                1.0
}}}
All you need to do is 
  * copy the basis set you want to //$CAMCASP/basis/auxiliary/user-def/ //
  * edit it to remove all $s$ functions and include the above block. 
  * save
  * In the CamCASP input file set the path to the auxiliary basis sets as //#include-camcasp basis/auxiliary/user-def/<NAME> //
  
Examples: [[ajm/camcasp/isa/basis/H|H]], [[ajm/camcasp/isa/basis/O|O]]

=== Mechanism for merging ISA and DF basis sets ===
We now have a mechanism for merging ISA and DF basis sets. This can be done in the CamCASP command file using:
{{{
  O1         8.0        0.00000000       0.00000000       0.00000000  TYPE O1
    Limit G
    #include-camcasp basis/auxiliary/ISA/set2/O
    Symmetry Exclude = S
    #include-camcasp basis/auxiliary/aug-cc-pVTZ/O
  ---
}}}
Here the ISA/set2 basis is used for the $s$-functions in oxygen atom O1 and the higher symmetry functions are taken from the aug-cc-pVTZ DF basis. 

This can be done using the Cluster input file using commands like:
{{{
Files
  Properties
  Molecule H2O2
  !
  Functional PBE0
  Kernel ALDX+CHF   ( Suppress DALTON's CKS )
  !
  Basis aTZ Type DC  Spherical
  Aux-Basis aTZ Type DC Spherical  ( ISA needs Spherical GTOs in the aux basis )
  ISA-Aux set2
  ...
  ...
End
}}}
Here we have specified both the standard DF basis (aug-cc-pVTZ == aTZ) and the ISA basis (set2). 

=== ISA basis sets ===
  * set1: less diffuse functions on the H and C atoms. Better behaved if '''W-EPS = 0.0'''.
  * set2: All atoms contain diffuse functions. Use this only with '''W-EPS = 0.2''' (the exact value may change).
== Examples ==
In general, these calculations proceed as follows:
  * Perform a standard properties calculation.
  * Modify the CamCASP input file as follows:
    * Make the auxiliary basis SPHERICAL. 
    * Use a User-Defined auxiliary basis. These are described above.
    * Use the following commands after the molecule description. Modify names as appropriate:
{{{
BEGIN GRID
  Molecule H2O
  Angular 200
  Radial  40
END

SET Lattice
  Charge   1.0
  LoLim    2.0
  HiLim    4.0
  Random   2000
  Seed     1
END

SET DF-INTEGRALS
  DF-TYPE-MONOMER NN
END

BEGIN DF
  Molecule H2O
  Type RHO
  Eta = 0.0
  Lambda = 0.0
  Gamma = 0.0
  Print only normalization constraints
END

Begin Multipoles
  Molecule H2O
  DF Type RHO without constraints
  Rank 4
End
BEGIN DF
  Molecule H2O
  Type RHO
  Eta = 0.0
  Lambda = 0.0
  Gamma = 1e-4  DeltaZ = 0.9
  Print only normalization constraints
END

Begin Multipoles
  Molecule H2O
  DF Type RHO with constraints
  Rank 4
End

Begin Stockholder
  Molecule H2O
  DF = Drho
  W-INIT = Drho-C
  ISA-Algorithm DF+ISA  Zeta = 0.05
  ISA-Alpha-Min 0.0
  Solver GELSS Condition = 1e-08
  Convergence
    Convergence-Type Q
    EPS-Norm = 1.0e-10
    EPS-Q    = 1.0e-4
    Max-Iter = 160
    W-Damping = 0.0
    W-Mix-Fraction = 0.0  Skip-Iterations = 20
  End
  W-TAILS
    Func = 1
    R1-Multiplier = 2.0
    R2-Multiplier = 3.0
    Fit-Type = 3
    W-Tests
  END
  ! Verbose
  ! Debug
  ! Debug Extreme
End

Begin Multipoles
  Molecule H2O
  DF Type ISA
  Rank 4
End

Finish

}}}
=== H2O ===
Here is a water CamCASP command file: [[ajm/camcasp/isa/examples/h2o-atz|H2O-aTZ-ISA.cks]]

And here is the output: [[ajm/camcasp/isa/examples/h2o-atz-output|H2O-aTZ-ISA_camcasp.out]]

The calculation will produce files //w-*.dat// with the shape-functions in them. These are often worth looking at. Plot columns 2:3 for the shape functions. Column 4 contains the function used to fix the tail. Column 5 contains the integrated charge from that point to infinity. The other columns contain various parametes I use for judging the tail fix procedure.

Here are the DF-ISA moments for this example:
{{{
! Multipole moments for H2O
! Based on DF-type : ISA
 
   O     0.00000000     0.00000000     0.00000000     Type    O      Rank   4
      -0.832451
       0.202168      -0.000000       0.000000
      -0.037844       0.000000      -0.000000       0.453004      -0.000000
      -0.113196       0.000000       0.000000       0.197884      -0.000000       0.000000
     -0.000000
       0.027601       0.000000       0.000000      -0.084861      -0.000000      -0.000000
     -0.000000       0.008559      -0.000000
 
  H1    -1.45365196     0.00000000    -1.12168732     Type    H      Rank   4
       0.415973
      -0.004289      -0.037414      -0.000000
       0.014214      -0.000821       0.000000       0.017303       0.000000
       0.058745      -0.070990       0.000000      -0.042183       0.000000      -0.036665
      0.000000
       0.000000       0.000000       0.000000      -0.000000       0.000000      -0.000000
     -0.000000      -0.000000       0.000000
 
  H2     1.45365196     0.00000000    -1.12168732     Type    H      Rank   4
       0.415973
      -0.004289       0.037414      -0.000000
       0.014214       0.000821       0.000000       0.017303      -0.000000
       0.058745       0.070990       0.000000      -0.042183      -0.000000       0.036665
      0.000000
      -0.000000      -0.000000      -0.000000       0.000000       0.000000       0.000000
     -0.000000      -0.000000       0.000000
}}}

And these are the DMA2 moments for the same molecule:

{{{
!  H2O
!  Basis: aug-cc-pVTZ

O           0.0000000000    0.0000000000    0.0000000000    Rank  4 Type O
  -0.4005512259
  -0.3786892921   0.0000000000   0.0000000000
   0.1330626405   0.0000000000   0.0000000000   1.5247365276   0.0000000000
   1.2476387160   0.0000000000   0.0000000000  -3.0147271572   0.0000000000
   0.0000000000   0.0000000000
  -3.7837242942   0.0000000000   0.0000000000   5.2979111238   0.0000000000
   0.0000000000   0.0000000000   3.0778615682   0.0000000000

H1         -1.4536519600    0.0000000000   -1.1216873200    Rank  1 Type H
   0.2002756327
   0.0438156156   0.0181547021   0.0000000000

H2          1.4536519600    0.0000000000   -1.1216873200    Rank  1 Type H
   0.2002756327
   0.0438156156  -0.0181547021   0.0000000000

}}}

It is interesting to see how they compare using the energy maps. [[ajm/camcasp/isa/examples/h2o_display_orient|The Orient file for energy maps is here]].

Even more interesting is how these sets of multipoles behave as we truncate them. The DF-ISA moments exhibit a systematic loss in accuracy, but not the DMA2 moments. This is even more dramatic for pyridine.