diff --git a/.buildinfo b/.buildinfo
index 20699d7a3..3aa8f27d9 100644
--- a/.buildinfo
+++ b/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 4c0a59c9c313036a6175c08aac00344b
+config: 08c9dca4d1441e69552430980fb76f53
tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/_static/documentation_options.js b/_static/documentation_options.js
index 17b0e4034..7ad5ba5cc 100644
--- a/_static/documentation_options.js
+++ b/_static/documentation_options.js
@@ -1,6 +1,6 @@
var DOCUMENTATION_OPTIONS = {
URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'),
- VERSION: '1.4.1',
+ VERSION: '1.4.2',
LANGUAGE: 'en',
COLLAPSE_INDEX: false,
BUILDER: 'html',
diff --git a/citing.html b/citing.html
index 99e9012d5..0a453a1f2 100644
--- a/citing.html
+++ b/citing.html
@@ -4,7 +4,7 @@
-
FLARE_Calculator is a calculator compatible with ASE.
-You can build up ASE Atoms for your atomic structure, and use get_forces,
-get_potential_energy as general ASE Calculators, and use it in
-ASE Molecular Dynamics and our ASE OTF training module. For usage details,
-refer to the ASE Calculator module
-and the ASE Calculator tutorial.
Build FLARE as an ASE Calculator, which is compatible with ASE Atoms and
-Molecular Dynamics.
-:Parameters: * gp_model (GaussianProcess) – FLARE’s Gaussian process object
-
-
-
mgp_model (MappedGaussianProcess) – FLARE’s Mapped Gaussian Process
-object. None by default. MGP will only be used if use_mapping
-is set to True.
-
par (Bool) – set to True to parallelize the prediction. False by
-default.
-
use_mapping (Bool) – set to True to use MGP for prediction. False
-by default.
Gaussian process force field. Implementation is based on Algorithm 2.1
-(pg. 19) of “Gaussian Processes for Machine Learning” by Rasmussen and
-Williams.
-
Methods within GaussianProcess allow you to make predictions on
-AtomicEnvironment objects (see env.py) generated from
-FLARE FLARE_Atoms (see ase/atoms.py), and after data points are added,
-optimize hyperparameters based on available training data (train method).
-
-
Parameters
-
-
kernels (list, optional) – Determine the type of kernels. Example:
-[‘twobody’, ‘threebody’], [‘2’, ‘3’, ‘mb’], [‘2’]. Defaults to [
-‘twobody’, ‘threebody’]
-
component (str, optional) – Determine single- (“sc”) or multi-
-component (“mc”) kernel to use. Defaults to “mc”
-
hyps (np.ndarray, optional) – Hyperparameters of the GP.
-
cutoffs (Dict, optional) – Cutoffs of the GP kernel. For simple hyper-
-parameter setups, formatted like {“twobody”:7, “threebody”:4.5},
-etc.
-
hyp_labels (List, optional) – List of hyperparameter labels. Defaults
-to None.
-
opt_algorithm (str, optional) – Hyperparameter optimization algorithm.
-Defaults to ‘L-BFGS-B’.
-
maxiter (int, optional) – Maximum number of iterations of the
-hyperparameter optimization algorithm. Defaults to 10.
-
parallel (bool, optional) – If True, the covariance matrix K of the GP is
-computed in parallel. Defaults to False.
-
n_cpus (int, optional) – Number of cpus used for parallel
-calculations. Defaults to 1 (serial)
-
n_sample (int, optional) – Size of submatrix to use when parallelizing
-predictions.
-
output (Output, optional) – Output object used to dump hyperparameters
-during optimization. Defaults to None.
-
hyps_mask (dict, optional) – hyps_mask specifies which hyperparameter
-is used for which interaction. For details, see kernels/mc_sephyps.py
-
name (str, optional) – Name for the GP instance which dictates global
-memory access.
Add a single local environment to the training set of the GP.
-
-
Parameters
-
-
env (AtomicEnvironment) – Local environment to be added to the
-training set of the GP.
-
force (np.ndarray) – Force on the central atom of the local
-environment in the form of a 3-component Numpy array
-containing the x, y, and z components.
-
train (bool) – If True, the GP is trained after the local
-environment is added.
Loop through atomic environment objects stored in the training data,
-and re-compute cutoffs for each. Useful if you want to gauge the
-impact of cutoffs given a certain training set! Unless you know
-exactly what you are doing for some development or test purpose,
-it is highly suggested that you call set_L_alpha and
-re-optimize your hyperparameters afterwards as is default here.
-
A helpful way to update the cutoffs and kernel for an extant
-GP is to perform the following commands:
->> hyps_mask = pm.as_dict()
->> hyps = hyps_mask[‘hyps’]
->> cutoffs = hyps_mask[‘cutoffs’]
->> kernels = hyps_mask[‘kernels’]
->> gp_model.update_kernel(kernels, ‘mc’, hyps, cutoffs, hyps_mask)
Runs a series of checks to ensure that the user has not supplied
-contradictory arguments which will result in undefined behavior
-with multiple hyperparameters.
-:return:
Remove force components from the model. Convenience function which
-deletes individual data points.
-
Matrices should always be updated if you intend to use the GP to make
-predictions afterwards. This might be time consuming for large GPs,
-so, it is provided as an option, but, only do so with extreme caution.
-(Undefined behavior may result if you try to make predictions and/or
-add to the training set afterwards).
-
Returns training data which was removed akin to a pop method, in order
-of lowest to highest index passed in.
-
-
Parameters
-
-
indexes – Indexes of envs in training data to remove.
-
update_matrices – If false, will not update the GP’s matrices
-afterwards (which can be time consuming for large models).
-This should essentially always be true except for niche development
-applications.
Invert the covariance matrix, setting L (a lower triangular
-matrix s.t. L L^T = (K + sig_n^2 I)) and alpha, the inverse
-covariance matrix multiplied by the vector of training labels.
-The forces and variances are later obtained using alpha.
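The decomposition described above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not FLARE's own implementation: the function name `compute_l_and_alpha` and the toy matrix are hypothetical, and `k_mat` stands for the noise-free kernel matrix K.

```python
import numpy as np


def compute_l_and_alpha(k_mat, labels, sigma_n):
    """Return (L, alpha) for a noise-free kernel matrix and training labels."""
    # Add the noise variance to the diagonal: K + sig_n^2 I.
    ky_mat = k_mat + sigma_n ** 2 * np.eye(k_mat.shape[0])
    # Lower-triangular Cholesky factor, so that L @ L.T equals ky_mat.
    l_mat = np.linalg.cholesky(ky_mat)
    # alpha = ky_mat^{-1} @ labels, obtained from two solves against the factor.
    tmp = np.linalg.solve(l_mat, labels)
    alpha = np.linalg.solve(l_mat.T, tmp)
    return l_mat, alpha


# Toy symmetric positive-definite kernel matrix (illustrative values only).
k_mat = np.array([[2.0, 0.5], [0.5, 1.0]])
labels = np.array([0.3, -0.1])
l_mat, alpha = compute_l_and_alpha(k_mat, labels, sigma_n=0.1)
```

Solving against the factor avoids forming the explicit inverse, which is usually the more numerically stable choice.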
Train Gaussian Process model on training data. Tunes the
-hyperparameters to maximize the likelihood, then computes L and alpha
-(related to the covariance matrix of the training set).
-
-
Parameters
-
-
logger (logging.logger) – logger object specifying where to write the
-progress of the optimization.
-
custom_bounds (np.ndarray) – Custom bounds on the hyperparameters.
-
grad_tol (float) – Tolerance of the hyperparameter gradient that
-determines when hyperparameter optimization is terminated.
-
x_tol (float) – Tolerance on the x values used to decide when
-Nelder-Mead hyperparameter optimization is terminated.
-
line_steps (int) – Maximum number of line steps for L-BFGS
-hyperparameter optimization.
-:param logger_name:
-:param print_progress:
Given a structure and forces, add local environments from the
-structure to the training set of the GP. If energy is given, add the
-entire structure to the training set.
-
-
Parameters
-
-
struc (FLARE_Atoms) – Input structure. Local environments of atoms
-in this structure will be added to the training set of the GP.
-
forces (np.ndarray) – Forces on atoms in the structure.
-
custom_range (List[int]) – Indices of atoms whose local
-environments will be added to the training set of the GP.
-
energy (float) – Energy of the structure.
-
stress (np.ndarray) – Stress tensor of the structure. The stress
-tensor components should be given in the following order:
-xx, xy, xz, yy, yz, zz.
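As a small illustration of the component ordering above, the following sketch flattens a symmetric 3x3 stress tensor into the six-component order xx, xy, xz, yy, yz, zz. The helper name `stress_to_six` and the numerical values are hypothetical and are not part of FLARE.

```python
import numpy as np


def stress_to_six(stress_3x3):
    """Flatten a symmetric 3x3 tensor to [xx, xy, xz, yy, yz, zz]."""
    s = np.asarray(stress_3x3, dtype=float)
    return np.array([s[0, 0], s[0, 1], s[0, 2], s[1, 1], s[1, 2], s[2, 2]])


# Illustrative symmetric tensor.
full = np.array([[1.0, 0.1, 0.2],
                 [0.1, 2.0, 0.3],
                 [0.2, 0.3, 3.0]])
six = stress_to_six(full)
```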
Write model in a variety of formats to a file for later re-use.
-JSON files are open to visual inspection and are easier to use
-across different versions of FLARE or GP implementations. However,
-they are larger and loading them in takes longer (by setting up a
-new GP from the specifications). Pickled files can be faster to
-read & write, and they take up less memory.
-
-
Parameters
-
-
name (str) – Output name.
-
format (str) – Output format.
-
split_matrix_size_cutoff (int) – If there are more than this
-
number of training points in the set, save the matrices separately.
Returns covariances between the local energy, force components, and
-partial stresses of a test environment and the total energy labels in the
-training set.
Returns covariances between the local energy, force components, and
-partial stresses of a test environment and the force labels in the
-training set.
Compute covariance matrix element between set1 and set2
-:param hyps: list of hyper-parameters
-:param name: name of the gp instance.
-:param same: whether the row and column are the same
-:param kernel: function object of the kernel
-:param cutoffs: The cutoff values used for the atomic environments
-:type cutoffs: list of 2 float numbers
-:param hyps_mask: dictionary used for multi-group hyperparameters
-:return: covariance matrix
parallel version of get_ky_mat
-:param hyps: list of hyper-parameters
-:param name: name of the gp instance.
-:param kernel: function object of the kernel
-:param cutoffs: The cutoff values used for the atomic environments
-:type cutoffs: list of 2 float numbers
-:param hyps_mask: dictionary used for multi-group hyperparameters
Compute covariance matrix element between set1 and set2
-:param hyps: list of hyper-parameters
-:param name: name of the gp instance.
-:param same: whether the row and column are the same
-:param kernel: function object of the kernel
-:param cutoffs: The cutoff values used for the atomic environments
-:type cutoffs: list of 2 float numbers
-:param hyps_mask: dictionary used for multi-group hyperparameters
compute the log likelihood and its gradients
-:param hyps: list of hyper-parameters
-:type hyps: np.ndarray
-:param name: name of the gp instance.
-:param kernel_grad: function object of the kernel gradient
-:param output: Output object for dumping every hyper-parameter
-
-
sets computed
-
-
-
Parameters
-
-
cutoffs (list of 2 float numbers) – The cutoff values used for the atomic environments
-
hyps_mask – dictionary used for multi-group hyperparameters
-
n_cpus – number of cpus to use.
-
n_sample – the size of block for matrix to compute
parallel version of get_ky_mat
-:param hyps: list of hyper-parameters
-:param name: name of the gp instance.
-:param kernel: function object of the kernel
-:param cutoffs: The cutoff values used for the atomic environments
-:type cutoffs: list of 2 float numbers
-:param hyps_mask: dictionary used for multi-group hyperparameters
-:return: covariance matrix
Special partition method for the force/energy block. Because the number
-of environments in a structure can vary, we only split up the environment
-list, which has length size1.
-
Note that two sizes need to be specified: the size of the environment
-list and the size of the structure list.
-
-
Parameters
-
-
n_sample (int) – Number of environments per processor.
Partition the training data for matrix calculation.
-The number of blocks is close to n_cpus,
-since mp.Process does not allow changing the thread number.
Partition the training data for vector calculation.
-The number of blocks is the same as n_cpus,
-since mp.Process does not allow changing the thread number.
Helper functions which obtain forces and energies
-corresponding to atoms in structures. These functions automatically
-cast atoms into their respective atomic environments.
Return the forces/std. dev. uncertainty associated with an individual atom
-in a structure, without necessarily having cast it to a chemical
-environment. In order to work with other functions,
-all arguments are passed in as a tuple.
-
-
Parameters
-
param (Tuple(FLARE_Atoms, integer, GaussianProcess)) – tuple of FLARE FLARE_Atoms, atom index, and Gaussian Process
-object
-
-
Returns
-
3-element force array and associated uncertainties
Return the forces/std. dev. uncertainty / energy associated with an
-individual atom in a structure, without necessarily having cast it to a
-chemical environment. In order to work with other functions,
-all arguments are passed in as a tuple.
-
-
Parameters
-
param (Tuple(FLARE_Atoms, integer, GaussianProcess)) – tuple of FLARE FLARE_Atoms, atom index, and Gaussian Process
-object
-
-
Returns
-
3-element force array, associated uncertainties, and local energy
Return the forces/std. dev. uncertainty associated with each
-individual atom in a structure. Forces are stored directly to the
-structure and are also returned.
-
-
Parameters
-
-
structure – FLARE structure to obtain forces for, with N atoms
-
gp – Gaussian Process model
-
write_to_structure – Write results to structure’s forces,
-std attributes
-
selective_atoms – Only predict on these atoms; e.g. [0,1,2] will
-only predict and return for those atoms
-
skipped_atom_value – What value to use for atoms that are skipped.
-Defaults to 0 but other options could be e.g. NaN. Will NOT
-write this to the structure if write_to_structure is True.
-
-
-
Returns
-
N x 3 numpy array of forces, N x 3 numpy array of uncertainties
Return the forces/std. dev. uncertainty / local energy associated with each
-individual atom in a structure. Forces are stored directly to the
-structure and are also returned.
-
-
Parameters
-
-
structure – FLARE structure to obtain forces for, with N atoms
-
gp – Gaussian Process model
-
n_cpus – Dummy parameter passed as an argument to allow for
-flexibility when the callable may or may not be parallelized
-
-
-
Returns
-
N x 3 array of forces, N x 3 array of uncertainties,
-N-length array of energies
Return the forces/std. dev. uncertainty associated with each
-individual atom in a structure. Forces are stored directly to the
-structure and are also returned.
-
-
Parameters
-
-
structure – FLARE structure to obtain forces for, with N atoms
-
gp – Gaussian Process model
-
n_cpus – Number of cores to parallelize over
-
write_to_structure – Write results to structure’s forces,
-std attributes
-
selective_atoms – Only predict on these atoms; e.g. [0,1,2] will
-only predict and return for those atoms
-
skipped_atom_value – What value to use for atoms that are skipped.
-Defaults to 0 but other options could be e.g. NaN. Will NOT
-write this to the structure if write_to_structure is True.
-
-
-
Returns
-
N x 3 array of forces, N x 3 array of uncertainties
Return the forces/std. dev. uncertainty / local energy associated with each
-individual atom in a structure, parallelized over atoms. Forces are
-stored directly to the structure and are also returned.
-
-
Parameters
-
-
structure – FLARE structure to obtain forces for, with N atoms
-
gp – Gaussian Process model
-
n_cpus – Number of cores to parallelize over
-
-
-
Returns
-
N x 3 array of forces, N x 3 array of uncertainties,
-N-length array of energies
Build Mapped Gaussian Process (MGP)
-and automatically save coefficients for LAMMPS pair style.
-
-
Parameters
-
-
grid_params (dict) – Parameters for the mapping itself, such as
-grid size of spline fit, etc. As described below.
-
unique_species (dict) – List of all the (unique) species included during
-the training that need to be mapped
-
GP (GaussianProcess) – None or a GaussianProcess object. If a GP is input,
-and container_only is False, automatically build a mapping corresponding
-to the GaussianProcess.
-
var_map (str) – if None: only build mapping for mean (force). If ‘pca’, then
-use PCA to map the variance, based on grid_params[‘xxbody’][‘svd_rank’].
-If ‘simple’, then only map the diagonal of covariance, and predict the
-upper bound of variance. The ‘pca’ mode is much heavier in terms of
-memory, but its prediction is much closer to GP variance.
-
container_only (bool) – if True: only build splines container
-(with no coefficients); if False: Attempt to build map immediately
-
lmp_file_name (str) – LAMMPS coefficient file name
-
n_cpus (int) – Default None. Set to the number of cores needed for
-parallelization. Used in the construction of the map.
-
n_sample (int) – Default 10. The batch size for building map. Not used now.
For grid_params, the following keys and values are allowed
-
-
Parameters
-
-
‘twobody’ (dict, optional) – if 2-body is present, set as a dictionary
-of parameters for 2-body mapping. Parameters see below.
-
‘threebody’ (dict, optional) – if 3-body is present, set as a dictionary
-of parameters for 3-body mapping. Parameters see below.
-
‘load_grid’ (str, optional) – Default None. the path to the directory
-where the previously generated grids (grid_*.npy) are stored.
-If no path is specified, MGP will construct grids from scratch.
-
‘lower_bound_relax’ (float, optional) – Default 0.1. if ‘lower_bound’ is
-set to ‘auto’ this value will be used as a relaxation of lower
-bound. (see below the description of ‘lower_bound’)
-
-
-
-
For two/three body parameter dictionary, the following keys and values are allowed
-
-
Parameters
-
-
‘grid_num’ (list) – a list of integers, the number of grid points for
-interpolation. The larger the number, the better the approximation
-of MGP is compared with GP.
-
‘lower_bound’ (str or list, optional) – Default ‘auto’, the lower bound
-of the spline interpolation will be searched. First, search the
-training set of GP and find the minimal interatomic distance r_min.
-Then, the lower_bound=r_min-lower_bound_relax. The user
-can set their own lower_bound, of the same shape as ‘grid_num’.
-E.g. for threebody, the customized lower bound can be set as
-[1.2, 1.2, 1.2].
-
‘upper_bound’ (str or list, optional) – Default ‘auto’, the upper bound
-of the spline interpolation will be the cutoffs of GP. The user
-can set their own upper_bound, of the same shape as ‘grid_num’.
-E.g. for threebody, the customized upper bound can be set as
-[3.5, 3.5, 3.5].
-
‘svd_rank’ (int, optional) – Default ‘auto’. If the variance mapping is
-needed, it is set as the rank of the mapping. ‘auto’ uses full
-rank, which is the smaller one between the total number of grid
-points and training set size. i.e.
-full_rank=min(np.prod(grid_num),3*N_train)
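The two ‘auto’ rules above reduce to simple arithmetic, sketched below. The helper names `auto_lower_bound` and `full_rank` are hypothetical and only restate the formulas given in the descriptions of ‘lower_bound’ and ‘svd_rank’; they are not FLARE functions.

```python
import numpy as np


def auto_lower_bound(r_min, lower_bound_relax=0.1):
    """lower_bound = r_min - lower_bound_relax, as described for 'auto'."""
    return r_min - lower_bound_relax


def full_rank(grid_num, n_train):
    """full_rank = min(np.prod(grid_num), 3 * N_train)."""
    return min(int(np.prod(grid_num)), 3 * n_train)


# Illustrative values: minimal interatomic distance of 2.0 in the training set,
# an 8 x 8 x 8 three-body grid, and 100 training environments.
lb = auto_lower_bound(2.0)
rank_3b = full_rank([8, 8, 8], n_train=100)
```

With these numbers the grid has 512 points and the training set contributes 300 force labels, so the training set size is the limiting factor for the rank.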
predict force, variance, stress and local energy for given
-atomic environment
-
-
Parameters
-
atom_env – atomic environment (with a center atom and its neighbors)
-
-
Returns
-
3d array of atomic force
-variance: 3d array of the predictive variance
-stress: 6d array of the virial stress
-energy: the local energy (atomic energy)
Forked from Github repository: https://github.com/EconForge/interpolation.py. High-level API for cubic splines. Class representing a cubic spline interpolator on a regular cartesian grid.
-
Creates a cubic spline interpolator on a regular cartesian grid.
-
-
Parameters
-
-
a (numpy array of size d (float)) – Lower bounds of the cartesian grid.
-
b (numpy array of size d (float)) – Upper bounds of the cartesian grid.
-
orders (numpy array of size d (int)) – Number of nodes along each dimension (=(n1,…,nd) )
-
-
-
Other Parameters
-
values (numpy array (float)) – (optional, (n1 x … x nd) array). Values on the nodes of the function to interpolate.
Build splines for PCA decomposition, mainly used for the mapping of the variance
-
-
Parameters
-
-
l_bounds (numpy array) – lower bound for the interpolation. E.g. 1-d for two-body, 3-d for three-body.
-
u_bounds (numpy array) – upper bound for the interpolation.
-
orders (numpy array) – grid numbers in each dimension. E.g, 1-d for two-body, 3-d for three-body, should be positive integers.
-
svd_rank (int) – rank for decomposition of variance matrix, also equal to the number of mappings constructed for mapping variance. For two-body svd_rank<=min(grid_num, train_size*3), for three-body svd_rank<=min(grid_num_in_cube, train_size*3)
The AtomicEnvironment object stores information about the local
-environment of an atom. AtomicEnvironment objects are inputs to the
-2-, 3-, and 2+3-body kernels.
Contains information about the local environment of an atom,
-including arrays of pair and triplet distances and the chemical
-species of atoms in the environment.
-
-
Parameters
-
-
structure (FLARE_Atoms) – structure of atoms.
-
atom (int) – Index of the atom in the structure.
-
cutoffs (np.ndarray) – 2- and 3-body cutoff radii. 2-body if one cutoff is
-given, 2+3-body if two are passed.
-
cutoffs_mask (dict) – a dictionary to store multiple cutoffs if needed;
-it should be exactly the same as the hyps mask
-
-
-
-
The cutoffs_mask allows the user to define multiple cutoffs for different
-bonds, triples, and many body interaction. This dictionary should be
-consistent with the hyps_mask used in the GaussianProcess object.
-
-
-
species_mask: 118-long integer array describing which elements belong to
like groups for determining which bond hyperparameters to use.
-For instance, [0,0,1,1,0 …] assigns H to group 0, He and
-Li to group 1, and Be to group 0 (the 0th register is ignored).
-
-
-
-
-
nspecie: Integer, number of different species groups (equal to number of
unique values in species_mask).
-
-
-
-
-
ntwobody: Integer, number of different hyperparameter/cutoff sets to
associate with different 2-body pairings of atoms in groups defined in
-species_mask.
-
-
-
-
-
twobody_mask: Array of length nspecie^2, which describes the cutoff to
associate with different pairings of species types. For example, if
-there are atoms of type 0 and 1, then twobody_mask defines which cutoff
-to use for pairings [0-0, 0-1, 1-0, 1-1]: if we wanted cutoff 0 for
-0-0 pairings and cutoff 1 for 0-1 and 1-1 pairings, then we would make
-twobody_mask [0, 1, 1, 1].
-
-
-
-
-
twobody_cutoff_list: Array of length ntwobody, which stores the cutoff
used for different types of bonds defined in twobody_mask
-
-
-
-
-
ncut3b: Integer, number of different cutoff sets to associate
with different 3-body pairings of atoms in groups defined in
-species_mask.
-
-
-
-
-
cut3b_mask: Array of length nspecie^2, which describes the cutoff to
associate with different bond types in triplets. For example, in a
-triplet (C, O, H) , there are three cutoffs. Cutoffs for CH bond, CO
-bond and OH bond. If C and O are associated with atom group 1 in
-species_mask and H is associated with group 0 in species_mask, then
-cut3b_mask[1*nspecie+0] determines the C/O-H bond cutoff, and
-cut3b_mask[1*nspecie+1] determines the C-O bond cutoff. If we want the
-former one to use the 1st cutoff in threebody_cutoff_list and the latter
-to use the 2nd cutoff in threebody_cutoff_list, the cut3b_mask should
-be [0, 0, 0, 1].
-
-
-
-
-
threebody_cutoff_list: Array of length ncut3b, which stores the cutoff
used for different types of bonds in triplets.
-
-
-
-
-
nmanybody: Integer, number of different cutoff sets to associate with
different coordination numbers.
-
-
-
-
manybody_mask: Similar to twobody_mask and cut3b_mask.
-
-
manybody_cutoff_list: Array of length nmanybody, stores the cutoff used
for different many body terms
-
-
-
-
-
Examples can be found at the end of tests/test_env.py
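To make the mask lookup concrete, here is a minimal sketch of how a pair cutoff could be resolved from species_mask, twobody_mask and twobody_cutoff_list. It assumes the flat index convention `group_i * nspecie + group_j`, which matches the `cut3b_mask[1*nspecie+0]` example above. The function `pair_cutoff` and the cutoff values are hypothetical; this is not FLARE's lookup code.

```python
import numpy as np

# 118-long array indexed by atomic number: H (1) and Be (4) in group 0,
# He (2) and Li (3) in group 1, as in the species_mask example.
species_mask = np.zeros(118, dtype=int)
species_mask[2] = 1
species_mask[3] = 1
nspecie = 2

# Pairings [0-0, 0-1, 1-0, 1-1] mapped to cutoff indices.
twobody_mask = np.array([0, 1, 1, 1])
# One cutoff per distinct index in twobody_mask (illustrative values).
twobody_cutoff_list = np.array([4.0, 5.0])


def pair_cutoff(z1, z2):
    """Return the 2-body cutoff for a pair of atomic numbers."""
    g1, g2 = species_mask[z1], species_mask[z2]
    return twobody_cutoff_list[twobody_mask[g1 * nspecie + g2]]
```

For example, `pair_cutoff(1, 1)` resolves an H-H pair (groups 0-0) to the first cutoff, while `pair_cutoff(1, 2)` resolves H-He (groups 0-1) to the second.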
Returns Atomic Environment object as a dictionary for serialization
-purposes. Optional to not include the structure to avoid redundant
-information.
-:return:
Build GP model from the training frames parsed from the log file.
-The cell, hyps and gp can be reset with customized values.
-
-
Parameters
-
-
cell (np.ndarray) – Default None to use the cell from the log file.
-A customized cell can be input as a 3x3 numpy array.
-
call_no (int) – Default None to use all the DFT frames as training
-data for building GP. If not None, then the frames 0 to call_no
-will be added to GP.
-
hyps (np.ndarray) – Default None to use the hyperparameters from the
-log file. Customized hyps can be input as an array.
-
init_gp (GaussianProcess) – Default to None to use no initial settings
-or training data. an initial GP can be used, and then the
-frames parsed in the log file will add to the initial GP. Then the
-final GP uses the hyps and kernels of init_gp, and consists of
-training data from init_gp and the data from the log file.
-NOTE: if a log file from restarted OTF is parsed, then an initial
-GP needs to be parsed from the prior log file as the init_gp of the
-restarted log file.
-
hyp_no (int) – Default None to use the final optimized hyperparameters to
-build GP. If not None, then use the hyps from the `hyp_no`th
-optimization step.
-
kwargs – if a new GP setting is needed without inputing init_gp, the GP
-initial args can be input as kwargs.
parse a line in otf output.
-:param frame_line: frame line to be parsed
-:type frame_line: string
-:return: species, position, force, uncertainty, and velocity of atom
-:rtype: list, np.arrays
Class which contains various methods to print the output of different
-ways of using FLARE, such as training a GP from an AIMD run,
-or running an MD simulation updated on-the-fly.
This is an I/O class that hosts the log files for OTF and Trajectories
-class. It is also used in get_neg_like_grad and get_neg_likelihood in
-gp_algebra to print intermediate results.
-
It opens and print files with the basename prefix and different
-suffixes corresponding to different kinds of output data.
-
-
Parameters
-
-
basename (str, optional) – Base output file name, suffixes will be added
-
verbose (str, optional) – print level. The same as logging level. It can be
-CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET
A cosine cutoff that returns 1 up to r_cut - d, and assigns a cosine
-envelope to values of r between r_cut - d and r_cut. Based on Eq. 24 of
-Albert P. Bartók and Gábor Csányi. “Gaussian approximation potentials: A
-brief tutorial introduction.” International Journal of Quantum Chemistry
-115.16 (2015): 1051-1057.
-
-
Parameters
-
-
r_cut (float) – Cutoff value (in angstrom).
-
ri (float) – Interatomic distance.
-
ci (float) – Cartesian coordinate divided by the distance.
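A minimal sketch of an envelope with the behavior described above: 1 up to r_cut - d, a smooth cosine decay between r_cut - d and r_cut, and 0 beyond. The function name `cosine_cutoff`, the exact functional form, and the explicit `d` argument are assumptions made for illustration. Unlike the documented function, this sketch does not take `ci` and does not return a derivative.

```python
import math


def cosine_cutoff(ri, r_cut, d):
    """Value of a cosine envelope at interatomic distance ri."""
    if ri > r_cut:
        # Outside the cutoff sphere.
        return 0.0
    if ri <= r_cut - d:
        # Flat region: no damping.
        return 1.0
    # Cosine envelope: falls smoothly from 1 at r_cut - d to 0 at r_cut.
    return 0.5 * (math.cos(math.pi * (ri - r_cut + d) / d) + 1.0)
```

With `r_cut = 5.0` and `d = 1.0`, the envelope is 1 for distances up to 4.0 and reaches one half at 4.5, the midpoint of the transition region.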
Pairwise contribution to many-body descriptor based on number of
atoms in the environment
-
-
-
-
Parameters
-
-
rij (float) – distance between atoms i and j
-
cij (float) – Component of versor of rij along given direction
-
r_cut (float) – cutoff hyperparameter
-
cutoff_func (callable) – cutoff function
-
-
-
Returns
-
the value of the pairwise many-body contribution
-float: the value of the derivative of the pairwise many-body
-contribution w.r.t. the central atom displacement
Multicomponent kernels (simple) restrict all signal variance and length scale of hyperparameters
-to a single value. The kernels in this module allow you to have different sets of hyperparameters
-and cutoffs for different interactions, and have flexible groupings of elements. It also allows
-you to do partial hyper-parameter training, keeping some components fixed.
-
To use this set of kernels, we need a hyps_mask dictionary for GaussianProcess, MappedGaussianProcess,
-and AtomicEnvironment (if you also set up different cutoffs). A simple example is shown below.
In the example above, Parameters class generates the arrays needed
-for these kernels and store all the grouping and mapping information in the
-hyps_mask dictionary. It stores following keys and values:
-
-
-
spec_mask: 118-long integer array describing which elements belong to
like groups for determining which bond hyperparameters to use. For
-instance, [0,0,1,1,0 …] assigns H to group 0, He and Li to group 1,
-and Be to group 0 (the 0th register is ignored).
-
-
-
-
-
nspec: Integer, number of different species groups (equal to number of
unique values in spec_mask).
-
-
-
-
-
nbond: Integer, number of different hyperparameter sets to associate with
different 2-body pairings of atoms in groups defined in spec_mask.
-
-
-
-
-
bond_mask: Array of length nspec^2, which describes the hyperparameter sets to
associate with different pairings of species types. For example, if there
-are atoms of type 0 and 1, then bond_mask defines which hyperparameters
-to use for pairings [0-0, 0-1, 1-0, 1-1]: if we wanted hyperparameter set 0 for
-0-0 pairings and set 1 for 0-1 and 1-1 pairings, then we would make
-bond_mask [0, 1, 1, 1].
-
-
-
-
-
ntriplet: Integer, number of different hyperparameter sets to associate
with different 3-body pairings of atoms in groups defined in spec_mask.
-
-
-
-
-
triplet_mask: Similar to bond mask: Triplet pairings of type 0 and 1 atoms
would go {0-0-0, 0-0-1, 0-1-0, 0-1-1, 1-0-0, 1-0-1, 1-1-0, 1-1-1},
-and if we wanted hyp. set 0 for triplets with only atoms of type 0
-and hyp. set 1 for all the rest, then the triplet_mask array would
-read [0,1,1,1,1,1,1,1]. The user should make sure that the mask has
-a permutational symmetry.
-
-
-
-
-
cutoff_2b: Array of length nbond, which stores the cutoff used for different
types of bonds defined in bond_mask
-
-
-
-
-
ncut3b: Integer, number of different cutoff sets to associate
with different 3-body pairings of atoms in groups defined in spec_mask.
-
-
-
-
-
cut3b_mask: Array of length nspec^2, which describes the cutoff to
associate with different bond types in triplets. For example, in a triplet
-(C, O, H) , there are three cutoffs. Cutoffs for CH bond, CO bond and OH bond.
-If C and O are associated with atom group 1 in spec_mask and H is associated with
-group 0 in spec_mask, then cut3b_mask[1*nspec+0] determines the C/O-H bond cutoff,
-and cut3b_mask[1*nspec+1] determines the C-O bond cutoff. If we want the
-former one to use the 1st cutoff in cutoff_3b and the latter to use the 2nd cutoff
-in cutoff_3b, the cut3b_mask should be [0, 0, 0, 1].
-
-
-
-
-
cutoff_3b: Array of length ncut3b, which stores the cutoff used for different
types of bonds in triplets.
-
-
-
-
-
nmb: Integer, number of different cutoff sets to associate with different coordination
numbers.
-
-
-
-
mb_mask: similar to bond_mask and cut3b_mask.
-
cutoff_mb: Array of length nmb, stores the cutoff used for different many body terms
-
-
For selective optimization, one can define ‘map’, ‘train_noise’ and ‘original’
-to identify which elements are to be optimized. All three have to be defined.
-train_noise = Bool (True/False), whether the noise parameter can be optimized
-original: np.array. Full set of initial values for hyperparameters
-map: np.array, array to map the hyper parameter back to the full set.
-map[i]=j means the i-th element in hyps should be the j-th element in
-hyps_mask[‘original’]
-
For example, the full set of hyperparameters
-may include [ls21, ls22, sig21, sig22, ls3,
-sg3, noise] but suppose you wanted only the set 21 optimized.
-The full set of hyperparameters is defined in ‘original’; include all those
-you want to leave static, and set initial guesses for those you want to vary.
-Have the ‘map’ list contain the indices of the hyperparameters in ‘original’
-that correspond to the hyperparameters you want to vary.
-Have a hyps list which contain those which you want to vary. Below,
-ls21, ls22 etc… represent floating-point variables which correspond
-to the initial guesses / static values.
-You would then pass in:
the hyps argument should only contain the values that need to be optimized.
-If you want noise to be trained as well include noise as the
-final hyperparameter value in hyps.
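The relationship between ‘map’, ‘original’ and the reduced hyps array can be sketched as follows. The helper `expand_hyps` is hypothetical and only illustrates the rule "map[i]=j means the i-th element in hyps should be the j-th element in hyps_mask[‘original’]". It assumes the noise is not being trained, so hyps has the same length as the map; the numerical values are placeholders.

```python
import numpy as np


def expand_hyps(hyps, original, hyp_map):
    """Place the optimized values back into the full hyperparameter set."""
    full = np.array(original, dtype=float)
    # hyps[i] overrides original[hyp_map[i]]; all other entries stay fixed.
    full[np.asarray(hyp_map, dtype=int)] = hyps
    return full


# Full set ordered as [ls21, ls22, sig21, sig22, ls3, sg3, noise].
original = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 0.05])
# Optimize only the "21" set: ls21 (index 0) and sig21 (index 2).
hyp_map = np.array([0, 2])
hyps = np.array([1.5, 3.5])
full = expand_hyps(hyps, original, hyp_map)
```

Here `full` keeps the static values from `original` everywhere except at indices 0 and 2, which take the values being optimized.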
Multicomponent two-body force/force kernel accelerated with Numba’s
-njit decorator.
-Loops over bonds in two environments and adds to the kernel if bonds are
-of the same type.
3-body multi-element kernel between two local energies accelerated
-with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
c1 (int) – Species of the central atom of the first local environment.
-
etypes1 (np.ndarray) – Species of atoms in the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
c2 (int) – Species of the central atom of the second local environment.
-
etypes2 (np.ndarray) – Species of atoms in the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
sig (float) – 3-body signal variance hyperparameter.
3-body multi-element kernel between a force component and a local
-energy accelerated with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
c1 (int) – Species of the central atom of the first local environment.
-
etypes1 (np.ndarray) – Species of atoms in the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
c2 (int) – Species of the central atom of the second local environment.
-
etypes2 (np.ndarray) – Species of atoms in the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
d1 (int) – Force component of the first environment (1=x, 2=y, 3=z).
-
sig (float) – 3-body signal variance hyperparameter.
3-body multi-element kernel between two force components and its
-gradient with respect to the hyperparameters.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
c1 (int) – Species of the central atom of the first local environment.
-
etypes1 (np.ndarray) – Species of atoms in the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
c2 (int) – Species of the central atom of the second local environment.
-
etypes2 (np.ndarray) – Species of atoms in the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
d1 (int) – Force component of the first environment.
-
d2 (int) – Force component of the second environment.
-
sig (float) – 3-body signal variance hyperparameter.
-
ls (float) – 3-body length scale hyperparameter.
-
r_cut (float) – 3-body cutoff radius.
-
cutoff_func (Callable) – Cutoff function.
-
-
-
Returns
-
Value of the 3-body kernel and its gradient with respect to the
-hyperparameters.
3-body multi-element kernel between two force components accelerated
-with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
c1 (int) – Species of the central atom of the first local environment.
-
etypes1 (np.ndarray) – Species of atoms in the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
c2 (int) – Species of the central atom of the second local environment.
-
etypes2 (np.ndarray) – Species of atoms in the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
d1 (int) – Force component of the first environment.
-
d2 (int) – Force component of the second environment.
-
sig (float) – 3-body signal variance hyperparameter.
3-body multi-element kernel between a force component and a local
-energy accelerated with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
c1 (int) – Species of the central atom of the first local environment.
-
etypes1 (np.ndarray) – Species of atoms in the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
c2 (int) – Species of the central atom of the second local environment.
-
etypes2 (np.ndarray) – Species of atoms in the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
sig (float) – 3-body signal variance hyperparameter.
3-body multi-element kernel between two force components accelerated
-with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
c1 (int) – Species of the central atom of the first local environment.
-
etypes1 (np.ndarray) – Species of atoms in the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
c2 (int) – Species of the central atom of the second local environment.
-
etypes2 (np.ndarray) – Species of atoms in the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
sig (float) – 3-body signal variance hyperparameter.
3-body multi-element kernel between two force components accelerated
-with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
c1 (int) – Species of the central atom of the first local environment.
-
etypes1 (np.ndarray) – Species of atoms in the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
c2 (int) – Species of the central atom of the second local environment.
-
etypes2 (np.ndarray) – Species of atoms in the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
sig (float) – 3-body signal variance hyperparameter.
3-body single-element kernel between two local energies accelerated
-with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
sig (float) – 3-body signal variance hyperparameter.
3-body single-element kernel between a force component and a local
-energy accelerated with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
d1 (int) – Force component of the first environment (1=x, 2=y, 3=z).
-
sig (float) – 3-body signal variance hyperparameter.
3-body single-element kernel between two force components and its
-gradient with respect to the hyperparameters.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
d1 (int) – Force component of the first environment.
-
d2 (int) – Force component of the second environment.
-
sig (float) – 3-body signal variance hyperparameter.
-
ls (float) – 3-body length scale hyperparameter.
-
r_cut (float) – 3-body cutoff radius.
-
cutoff_func (Callable) – Cutoff function.
-
-
-
Returns
-
Value of the 3-body kernel and its gradient with respect to the
-hyperparameters.
3-body single-element kernel between two force components accelerated
-with Numba.
-
-
Parameters
-
-
bond_array_1 (np.ndarray) – 3-body bond array of the first local
-environment.
-
bond_array_2 (np.ndarray) – 3-body bond array of the second local
-environment.
-
cross_bond_inds_1 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the first local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_inds_2 (np.ndarray) – Two dimensional array whose row m
-contains the indices of atoms n > m in the second local
-environment that are within a distance r_cut of both atom n and
-the central atom.
-
cross_bond_dists_1 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the first
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
cross_bond_dists_2 (np.ndarray) – Two dimensional array whose row m
-contains the distances from atom m of atoms n > m in the second
-local environment that are within a distance r_cut of both atom
-n and the central atom.
-
triplets_1 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the first local environment that are
-within a distance r_cut of atom m.
-
triplets_2 (np.ndarray) – One dimensional array of integers whose entry
-m is the number of atoms in the second local environment that are
-within a distance r_cut of atom m.
-
d1 (int) – Force component of the first environment.
-
d2 (int) – Force component of the second environment.
-
sig (float) – 3-body signal variance hyperparameter.
Tool to enable the development of a GP model based on an AIMD
-trajectory with many customizable options for fine control of training.
-Contains methods to transfer the model to an OTF run or MD engine run.
The various parameters in the TrajectoryTrainer class related to
-“Seed frames” are to help you train a model which does not yet have a
-training set. Uncertainty- and force-error driven training will go better with
-a somewhat populated training set, as force and uncertainty estimates
-are better behaved with more data.
-
You may pass in a set of seed frames or atomic environments.
-All seed environments will be added to the GP model; seed frames will
-be iterated through and atoms will be added at random.
-There are a few reasons why you would want to pay special attention to an
-individual species.
-
If you are studying a system where the dynamics of one species are
-particularly important and so you want a good representation in the training
-set, then you would want to include as many as possible in the training set
-during the seed part of the training.
-
Conversely, if a system has a high representation of a species well described
-by a simple 2+3 body kernel, you may want it to be less well represented
-in the seeded training set.
-
By specifying the pre_train_atoms_per_element, you can limit the number of
-atoms of a given species which are added in. You can also limit the number
-of atoms which are added from a given seed frame.
Reads output of a TrajectoryTrainer run by frame. return_gp_data returns
-data about GP model growth useful for visualizing progress of model
-training.
-
-
Parameters
-
-
file – filename of output
-
return_gp_data – flag for returning extra GP data
-
compute_errors – Compute deviation from GP and DFT forces.
-
-
-
Returns
-
List of dictionaries with keys ‘species’, ‘positions’,
-‘gp_forces’, ‘dft_forces’, ‘gp_stds’, ‘added_atoms’, and
-‘maes_by_species’; optionally, a gp_data dictionary as well.
Takes as input the first output from the parse_trajectory_trainer_output
-function and turns it into a series of FLARE structures, with DFT forces mapped
-onto the structures.
-
-
Parameters
-
frame_dictionaries – The list of dictionaries which describe each GPFA frame.
descriptors – A list of descriptor objects, or a single descriptor (most common), e.g. B2.
-
rcut – The interaction cut-off radius.
-
type2number – The atomic numbers of all LAMMPS types.
-
dftcalc – An ASE calculator, e.g. Espresso.
-
energy_correction – Per-type corrections to the DFT potential energy.
-
dft_call_threshold – Uncertainty threshold for whether to call DFT.
-
dft_add_threshold – Uncertainty threshold for whether to add an atom to the training set.
-
dft_xyz_fname – Name of the file in which to save the DFT results.
-Should contain ‘*’, which will be replaced with the current step.
-
std_xyz_fname – Name of the file in which to save ASE Atoms with per-atom uncertainties as charges.
-Should contain ‘*’, which will be replaced with the current step.
-
model_fname – Name of the saved model, must correspond to pair_coeff.
-
hyperparameter_optimization – Boolean function that determines whether to run hyperparameter optimization, as a function of this LMPOTF
-object, the LAMMPS instance and the current step.
-
opt_bounds – Bounds for the hyperparameter optimization.
-
opt_method – Algorithm for the hyperparameter optimization.
-
opt_iterations – Max number of iterations for the hyperparameter optimization.
-
post_dft_callback – A function that is called after every DFT call. Receives this LMPOTF object and the current step.
-
wandb – The wandb object, which should already be initialized.
-
log_fname – An output file to which logging info is written.
rescale_steps (List[int], optional) – List of frames for which the
-velocities of the atoms are rescaled. Defaults to [].
-
rescale_temps (List[int], optional) – List of rescaled temperatures.
-Defaults to [].
-
write_model (int, optional) – If 0, write never. If 1, write at
-end of run. If 2, write after each training and end of run.
-If 3, write after each time atoms are added and end of run.
-If 4, write after each training and end of run, and back up
-after each write.
-
force_only (bool, optional) – If True, only use forces for training.
-Defaults to False, in which case forces, energy and stress are used for training.
-
std_tolerance_factor (float, optional) – Threshold that determines
-when DFT is called. Specifies a multiple of the current noise
-hyperparameter. If the epistemic uncertainty on a force
-component exceeds this value, DFT is called. Defaults to 1.
-
skip (int, optional) – Number of frames that are skipped when
-dumping to the output file. Defaults to 0.
-
init_atoms (List[int], optional) – List of atoms from the input
-structure whose local environments and force components are
-used to train the initial GP model. If None is specified, all
-atoms are used to train the initial GP. Defaults to None.
-
output_name (str, optional) – Name of the output file. Defaults to
-‘otf_run’.
-
max_atoms_added (int, optional) – Number of atoms added each time
-DFT is called. Defaults to 1.
-
train_hyps (tuple, optional) – Specifies the range of steps in which the
-hyperparameters of the GP are optimized. If the number of DFT
-calls is outside this range, the hyperparameters are frozen.
-Defaults to (None, None), which means always training.
-
min_steps_with_model (int, optional) – Minimum number of steps the
-model takes in between calls to DFT. Defaults to 0.
-
dft_kwargs ([type], optional) – Additional arguments which are
-passed when DFT is called; keyword arguments vary based on the
-program (e.g. ESPRESSO vs. VASP). Defaults to None.
-
store_dft_output (Tuple[Union[str,List[str]],str], optional) – After DFT calculations are called, copy the file or files
-specified in the first element of the tuple to a directory
-specified as the second element of the tuple.
-Useful when DFT calculations are expensive and their output should be kept
-for later use. The first element of the tuple can either be a
-single file name, or a list of several. Copied files will be
-prepended with the date and time with the format
-‘Year.Month.Day:Hour:Minute:Second:’.
-
build_mode (str) – default “bayesian”, run on-the-fly training.
-“direct” mode constructs GP model from a given list of frames, with
-FakeMD and FakeDFT. Each frame needs to have a global
-property called “target_atoms” specifying a list of atomic
-environments added to the GP model.
If OTF has store_dft_output set, then the specified DFT files will
-be copied with the current date and time prepended in the format
-‘Year.Month.Day:Hour:Minute:Second:’.
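As a rough illustration of the prefix described above, the timestamp can be produced with the standard library. The exact format string (zero-padded fields, four-digit year) and the helper name `timestamped_name` are assumptions for this sketch; the text only specifies the field order and separators.

```python
from datetime import datetime


def timestamped_name(filename, now):
    """Prepend 'Year.Month.Day:Hour:Minute:Second:' to a file name.

    Zero padding and a four-digit year are assumed here.
    """
    prefix = now.strftime("%Y.%m.%d:%H:%M:%S:")
    return prefix + filename


# Example with a fixed time so the result is reproducible.
example = timestamped_name("pw.out", datetime(2020, 3, 5, 14, 7, 9))
# example == "2020.03.05:14:07:09:pw.out"
```

Note that colons in file names are not accepted on every file system, which may matter when copied outputs are moved between machines.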
Calculates DFT forces on atoms in the current structure.
-
If OTF has store_dft_output set, then the specified DFT files will
-be copied with the current date and time prepended in the format
-‘Year.Month.Day:Hour:Minute:Second:’.
-
Calculates DFT forces on atoms in the current structure.
Compute the maximum cutoff compatible with a 3x3x3 supercell of a
structure. Called in the Structure constructor when
-setting the max_cutoff attribute, which is used to create local
-environments with arbitrarily large cutoff radii.
-
-
-
-
Parameters
-
cell (np.ndarray) – Bravais lattice vectors of the structure stored as
-rows of a 3x3 Numpy array.
-
-
Returns
-
-
Maximum cutoff compatible with a 3x3x3 supercell of the
Checks the forces of GP prediction assigned to the structure against a
-DFT calculation, and returns a list of atoms which meet an absolute
-threshold abs_force_tolerance.
-
Can limit the total number of target atoms via max_atoms_added, and limit
-per species by max_by_species.
-
The max_atoms_added argument will ‘overrule’ the
-max by species; e.g. if max_atoms_added is 2 and max_by_species is {“H”:3},
-then at most two atoms total will be added.
-
Because adding atoms which are in configurations which are far outside
-of the potential energy surface may not always be
-desirable, a maximum force error can be passed in; atoms with
-
-
Parameters
-
-
abs_force_tolerance – If error exceeds this value, then return
-atom index
-
predicted_forces – Force predictions made by GP model
-
label_forces – “True” forces computed by DFT
-
structure – FLARE Structure
-
max_atoms_added – Maximum atoms to return
-
max_by_species – Limit to a maximum number of atoms by species
-
max_force_error – In order to avoid counting in highly unlikely
-configurations, if the error exceeds this, do not add atom
-
-
-
Returns
-
Bool indicating if any atoms exceeded the error
-threshold, and a list of indices of atoms which did sorted by their
-error.
Given an uncertainty tolerance and a structure decorated with atoms,
-species, and associated uncertainties, return those which are above a
-given threshold, agnostic to species.
-
If std_tolerance is negative, then the threshold used is the absolute
-value of std_tolerance.
-
If std_tolerance is positive, then the threshold used is
-std_tolerance * noise.
-
If std_tolerance is 0, then do not check.
-
-
Parameters
-
-
std_tolerance – If positive, multiply by noise to get cutoff. If
-negative, use absolute value of std_tolerance as cutoff.
-
noise – Noise variance parameter
-
structure (FLARE Structure) – Input structure
-
max_atoms_added – Maximum # of atoms to add
-
update_style – A string specifying the desired strategy for
-adding atoms to the training set. Current options are ‘add_n’, which
-adds the n = max_atoms_added highest-uncertainty atoms, and
-‘threshold’, which adds all atoms with uncertainty greater than
-update_threshold.
-
update_threshold – A float specifying the update threshold. Ignored
-if update_style is not set to ‘threshold’.
-
-
-
Returns
-
(True,[-1]) if no atoms are above cutoff, (False,[…]) if at
-least one atom is above std_tolerance, with the list indicating
-which atoms have been selected for the training set.
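The sign convention for std_tolerance and the shape of the return value can be sketched as follows. This is a simplified stand-in, not FLARE's implementation: the function names are hypothetical, uncertainties are passed as a plain list of per-atom component lists, and only the ‘add_n’ style of selection is modelled.

```python
def uncertainty_threshold(std_tolerance, noise):
    """Translate std_tolerance into a cutoff, or None when checking is disabled."""
    if std_tolerance == 0:
        return None                   # 0: do not check
    if std_tolerance > 0:
        return std_tolerance * noise  # positive: relative to the noise
    return abs(std_tolerance)         # negative: absolute cutoff


def atoms_above_threshold(stds, std_tolerance, noise, max_atoms_added):
    """Return (True, [-1]) if nothing exceeds the cutoff, otherwise
    (False, indices), with the most uncertain atoms first."""
    threshold = uncertainty_threshold(std_tolerance, noise)
    if threshold is None:
        return True, [-1]
    # Largest uncertainty over the force components of each atom.
    max_per_atom = [max(abs(c) for c in row) for row in stds]
    ranked = sorted(range(len(stds)), key=lambda i: max_per_atom[i], reverse=True)
    target = [i for i in ranked if max_per_atom[i] > threshold][:max_atoms_added]
    if not target:
        return True, [-1]
    return False, target


stds = [[0.01, 0.02, 0.01], [0.2, 0.1, 0.05], [0.08, 0.3, 0.0]]
relative = atoms_above_threshold(stds, 2, 0.05, max_atoms_added=1)      # cutoff 0.1
absolute = atoms_above_threshold(stds, -0.25, 0.05, max_atoms_added=3)  # cutoff 0.25
disabled = atoms_above_threshold(stds, 0, 0.05, max_atoms_added=3)
```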
Checks the stds of GP prediction assigned to the structure, returns a
-list of atoms which either meet an absolute threshold or a relative
-threshold defined by rel_std_tolerance * noise. Can limit the
-total number of target atoms via max_atoms_added, and limit per species
-by max_by_species.
-
The max_atoms_added argument will ‘overrule’ the
-max by species; e.g. if max_atoms_added is 2 and max_by_species is {“H”:3},
-then at most two atoms will be added.
-
-
Parameters
-
-
rel_std_tolerance – Multiplied by noise to get a lower
-bound for the uncertainty threshold defined relative to the model.
-
abs_std_tolerance – Used as an absolute lower bound for the
-uncertainty threshold.
-
noise – Noise hyperparameter for model, used to define relative
-uncertainty cutoff.
-
structure – FLARE structure decorated with
-uncertainties in structure.stds.
-
max_atoms_added – Maximum number of atoms to return from structure.
-
max_by_species – Dictionary describing maximum number of atoms to
-return by species (e.g. {‘H’:1,‘He’:2} will return at most 1 H and 2 He
-atoms.)
-
-
-
Returns
-
Bool indicating if any atoms exceeded the uncertainty
-threshold, and a list of indices of atoms which did, sorted by their
-uncertainty.
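How the total cap interacts with the per-species caps can be sketched with a small selection loop. The function below is hypothetical and simplified: it takes one scalar uncertainty per atom and an already computed threshold, and it returns only the list of indices rather than the full return value described above.

```python
def select_by_species(symbols, uncertainties, threshold,
                      max_atoms_added, max_by_species):
    """Pick atoms above `threshold`, most uncertain first, respecting a
    total cap and optional per-species caps."""
    ranked = sorted(range(len(symbols)),
                    key=lambda i: uncertainties[i], reverse=True)
    counts = {}
    chosen = []
    for i in ranked:
        if uncertainties[i] <= threshold:
            break                      # remaining atoms are all below the cutoff
        if len(chosen) >= max_atoms_added:
            break                      # the total cap overrules the species caps
        species = symbols[i]
        limit = max_by_species.get(species)
        if limit is not None and counts.get(species, 0) >= limit:
            continue                   # this species is already full
        counts[species] = counts.get(species, 0) + 1
        chosen.append(i)
    return chosen


symbols = ["H", "H", "H", "O"]
uncertainties = [0.5, 0.4, 0.3, 0.2]

# Total cap of 2 wins even though up to 3 H atoms would be allowed.
capped_total = select_by_species(symbols, uncertainties, 0.1, 2, {"H": 3})
# Per-species cap of 1 H leaves room for the O atom.
capped_species = select_by_species(symbols, uncertainties, 0.1, 4, {"H": 1})
```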
Given a structure and a dictionary formatted as {“Symbol”:int,
-..} describing a number of atoms per element, return a sorted list of
-indices corresponding to a random subset of atoms by species
-:param frame:
-:param predict_atoms_by_species:
-:return:
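A minimal sketch of drawing a random subset of atoms per species is given below. It works on a plain list of chemical symbols rather than a FLARE structure, the function name is hypothetical, and two behaviours are assumptions made for the example: species missing from the dictionary are kept in full, and requests larger than the available count are clipped.

```python
import random


def random_subset_by_species(symbols, predict_atoms_by_species, rng):
    """Return sorted indices of a random subset of atoms, with the number
    drawn per species given by a {"Symbol": int} dictionary."""
    by_species = {}
    for index, symbol in enumerate(symbols):
        by_species.setdefault(symbol, []).append(index)

    chosen = []
    for symbol, indices in by_species.items():
        # Assumption: species not listed in the dictionary are kept entirely.
        wanted = predict_atoms_by_species.get(symbol, len(indices))
        wanted = min(wanted, len(indices))
        chosen.extend(rng.sample(indices, wanted))
    return sorted(chosen)


symbols = ["H", "H", "O", "H", "O"]
subset = random_subset_by_species(symbols, {"H": 2, "O": 1}, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the draw reproducible without touching the global random state.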
List of what needs to be calculated. Can be any combination
-of ‘energy’, ‘forces’, ‘stress’, ‘dipole’, ‘charges’, ‘magmom’
-and ‘magmoms’.
-
-
system_changes: list of str
List of what has changed since last calculation. Can be
-any combination of these six: ‘positions’, ‘numbers’, ‘cell’,
-‘pbc’, ‘initial_charges’ and ‘initial_magmoms’.
-
-
-
Subclasses need to implement this, but can ignore properties
-and system_changes if they want. Calculated properties should
-be inserted into results dictionary like shown in this dummy
-example:
Run MD with LAMMPS based on the ase.md.md.MolecularDynamics.
-It includes using LAMMPS_MOD to run multiple steps, and supports
-Bayesian active learning with flare.
Back up the current trajectory into an .xyz file. The atomic positions,
-velocities, forces and uncertainties are read from the LAMMPS trajectory.
-The step, potential energy and stress are read from thermo.txt.
-
-
Parameters
-
curr_trj (list[ase.Atoms]) – lammps trajectory of current run read
-by ASE.
Run LAMMPS until the uncertainty interrupts. Note that this method neither runs
-only a single MD step nor necessarily finishes all N_steps. The MD exits only when
-1) the maximal atomic uncertainty goes beyond std_tolerance, or
-2) all the N_steps are finished without uncertainty beyond std_tolerance.
-
-
Parameters
-
-
std_tolerance (float) – the threshold for atomic uncertainty, above which the
-MD will be interrupted and DFT will be called.
A modified ASE LAMMPS calculator based on ase.lammpsrun.LAMMPS,
-to allow for more flexible input parameters, including compute,
-region, dump, fix/nvt, fix/npt etc.
LAMMPS stress tensor = virial + kinetic,
-kinetic = sum(m_k * v_ki * v_kj) / V.
-We subtract the kinetic term and keep only the virial term.
-In the calculator, results[“stress”] += kinetic_atoms gives virial.
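As a worked illustration of the kinetic term, the tensor sum(m_k * v_ki * v_kj) / V can be evaluated with NumPy as below. The function name, the input values and the placeholder total stress are hypothetical, units are left implicit, and the sign with which the term is applied in practice depends on the stress sign convention in use; the sketch uses the plain "total = virial + kinetic" form from the text.

```python
import numpy as np

def kinetic_stress(masses, velocities, volume):
    """Kinetic contribution sum_k m_k * v_ki * v_kj / V as a 3x3 tensor."""
    masses = np.asarray(masses, dtype=float)
    velocities = np.asarray(velocities, dtype=float)
    return np.einsum("k,ki,kj->ij", masses, velocities, velocities) / volume

masses = [1.0, 2.0]
velocities = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
kinetic = kinetic_stress(masses, velocities, volume=2.0)

total = np.eye(3)         # placeholder for a total stress tensor
virial = total - kinetic  # keep only the virial part
```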
Returns distances, coordinates, species of atoms, and indices of neighbors
-in the 2-body local environment. This method is implemented outside
-the AtomicEnvironment class to allow for njit acceleration with Numba.
-
-
Parameters
-
-
positions (np.ndarray) – Positions of atoms in the structure.
-
atom (int) – Index of the central atom of the local environment.
-
cell (np.ndarray) – 3x3 array whose rows are the Bravais lattice vectors of the
-cell.
-
cutoff_2 (np.ndarray) – 2-body cutoff radius.
-
species (np.ndarray) – Numpy array of species represented by their atomic numbers.
-
nspecie (int) – number of atom types to define bonds
-
species_mask (np.ndarray) – mapping from atomic number to atom types
-
twobody_mask (np.ndarray) – mapping from the types of end atoms to bond types
-
-
Returns
-
Tuple of arrays describing pairs of atoms in the 2-body local
-environment.
-
bond_array_2: Array containing the distances and relative
-coordinates of atoms in the 2-body local environment. First column
-contains distances, remaining columns contain Cartesian coordinates
-divided by the distance (with the origin defined as the position of the
-central atom). The rows are sorted by distance from the central atom.
-
bond_positions_2: Coordinates of atoms in the 2-body local environment.
-
etypes: Species of atoms in the 2-body local environment represented by
-their atomic number.
-
bond_indices: Structure indices of atoms in the local environment.
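The layout of bond_array_2 can be illustrated with a small, unaccelerated NumPy sketch. It is a simplification under stated assumptions: it ignores species and masks, uses a single scalar cutoff, and only visits the 27 nearest periodic images, so the cutoff must not exceed the cell size. The function name `two_body_environment` is hypothetical.

```python
import itertools
import numpy as np

def two_body_environment(positions, atom, cell, cutoff):
    """Return (bond_array, bond_indices) for the neighbors of one atom."""
    positions = np.asarray(positions, dtype=float)
    cell = np.asarray(cell, dtype=float)
    rows, indices = [], []
    for shift in itertools.product((-1, 0, 1), repeat=3):
        # Rows of `cell` are lattice vectors, so this is a lattice translation.
        offset = np.asarray(shift, dtype=float) @ cell
        for j, pos in enumerate(positions):
            rel = pos + offset - positions[atom]
            dist = np.linalg.norm(rel)
            if 0.0 < dist < cutoff:
                # First column: distance; remaining columns: unit vector.
                rows.append(np.concatenate(([dist], rel / dist)))
                indices.append(j)
    if not rows:
        return np.zeros((0, 4)), np.zeros(0, dtype=int)
    order = np.argsort([r[0] for r in rows], kind="stable")
    return np.array(rows)[order], np.array(indices)[order]

cell = 10.0 * np.eye(3)
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [9.5, 0.0, 0.0]]
bond_array, bond_indices = two_body_environment(positions, 0, cell, cutoff=2.5)
```

Atom 3 is the nearest neighbor here because its periodic image lies 0.5 away from the central atom along the negative x direction.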
Returns distances and coordinates of triplets of atoms in the
-3-body local environment.
-
-
Parameters
-
-
bond_array_2 (np.ndarray) – 2-body bond array.
-
bond_positions_2 (np.ndarray) – Coordinates of atoms in the 2-body local
-environment.
-
ctype (int) – atomic number of the center atom
-
cutoff_3 (np.ndarray) – 3-body cutoff radius.
-
nspecie (int) – number of atom types to define bonds
-
species_mask (np.ndarray) – mapping from atomic number to atom types
-
cut3b_mask (np.ndarray) – mapping from the types of end atoms to bond types
-
-
Returns
-
Tuple of 4 arrays describing triplets of atoms in the 3-body local
-environment.
-
bond_array_3: Array containing the distances and relative
-coordinates of atoms in the 3-body local environment. First column
-contains distances, remaining columns contain Cartesian coordinates
-divided by the distance (with the origin defined as the position of the
-central atom). The rows are sorted by distance from the central atom.
-
cross_bond_inds: Two-dimensional array whose row m contains the indices
-of atoms n > m that are within a distance cutoff_3 of both atom m and the
-central atom.
-
cross_bond_dists: Two-dimensional array whose row m contains the
-distances from atom m of atoms n > m that are within a distance cutoff_3
-of both atom m and the central atom.
-
triplet_counts: One dimensional array of integers whose entry m is the
-number of atoms that are within a distance cutoff_3 of atom m.
For multi-component systems, the configurational space can be highly complicated.
-One may want to use different hyper-parameters and cutoffs for different interactions,
-or perform constrained optimization of the hyper-parameters.
-
To use more hyper-parameters, we need a special kernel function that can distinguish between
-pairs, triplets and other descriptors and determine which value to use for which interaction.
-
This kernel can be enabled by using the hyps_mask argument of the GaussianProcess class.
-It contains multiple arrays that describe how to break down the array of hyper-parameters and
-apply them when computing the kernel. Detailed descriptions of this argument can be found in
-kernel/mc_sephyps.py.
-
The ParameterHelper class generates the hyps_mask through a more human-readable interface.
In this example, four atomic species are involved. There are many kinds
-of twobodys and threebodys, but we only want to use eight different signal
-variances and length scales.
-
In order to do so, we first define all the twobodys to be group “twobody0”, by
-listing “*-*” as the first element in the twobody argument. The second
-element O-O is then defined to be group “twobody1”. Note that the order
-matters here: a later element overrides an earlier one. If
-twobodys=[[‘O’, ‘O’], [‘*’, ‘*’]], then all twobodys belong to group “twobody1”.
-
Similarly, O-O-O is defined as threebody1, while all remaining ones
-are left as threebody0.
-
The hyperparameters for each group are listed in the order
-[sig, ls, cutoff] in the parameters argument. So in this example, the
-O-O interaction will use [2, 0.2, 2] as its sigma, length scale, and
-cutoff.
-
For threebody, the parameter arrays only come with two elements. So there
-is no cutoff associated with threebody0 or threebody1; instead, a universal
-cutoff is used, which is defined as ‘cutoff_threebody’.
-
The constraints argument defines which hyper-parameters will be optimized:
-True means the hyper-parameter is optimized, False means it is kept fixed.
-
Here are a couple more simple examples.
-
Define a 5-parameter 2+3 kernel (1, 0.5, 1, 0.5, 0.05)
name (str) – the name used for indexing; can be anything but “*”
-
element_list (list) – list of elements
-
parameters (list) – corresponding parameters for this group
-
atomic_str (bool) – whether the elements in element_list are
-specified by group names or periodic table element names.
-
-
-
-
This function defines groups for specie/twobody/threebody
-/3b cutoff/manybody terms. It can be called multiple times.
-A later call always overrides an earlier one.
-
The name of the group has to be a unique string (but not “*”) that
-defines a group of species or twobodys, etc. If the same name is used
-in two function calls, the definitions of the group will be merged.
-Both calls will be effective.
-
element_list has to be a list of atomic elements, or a list of
-specie group names (which should be defined in previous calls), or “*”.
-“*” will loop the function over all previously defined species.
-It has to be two elements for twobody/3b cutoff/manybody term, or
-three elements for threebody. For specie group definition, it can be
-as many elements as you want.
-
If multiple define_group calls conflict for the same elements, the later one
-has higher priority. For example, if twobody 1-2 is defined as group1 in
-the first call and as group2 in the second call, the twobody
-ends up in group2.
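The priority rule can be illustrated with a toy resolver. This is not the ParameterHelper implementation: `resolve_twobody_groups` is a hypothetical helper that only mimics the "later call wins" behavior and the “*” wildcard described above.

```python
def resolve_twobody_groups(calls, species):
    """calls: ordered list of (group_name, [elem_a, elem_b]); "*" matches any species."""
    assignment = {}
    for name, (a, b) in calls:
        firsts = species if a == "*" else [a]
        seconds = species if b == "*" else [b]
        for x in firsts:
            for y in seconds:
                # Pairs are unordered, so store them under a sorted key.
                # A later call simply overwrites any earlier assignment.
                assignment[tuple(sorted((x, y)))] = name
    return assignment

species = ["H", "O"]
groups = resolve_twobody_groups(
    [("group0", ["*", "*"]), ("group1", ["H", "O"]), ("group2", ["H", "O"])],
    species,
)
```

Here H-O is first swept into group0 by the wildcard, then claimed by group1, and finally left in group2, while H-H and O-O stay in group0.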
Separate all possible types of twobodys, threebodys and manybody terms,
-one type per group. Fill in either the universal ls and sigma from the
-pre-defined parameters set by set_parameters(“sigma”, ..) and set_parameters(“ls”, ..),
-or random parameters if random is True.
So the first twobody defined will be group twobody0, second one will be
-group twobody1. For specie, it will define all the listed elements as
-groups with only one element with their original name.
-
If the definition_list is a dictionary, it is equivalent to
It is not recommended to use the dictionary mode, especially when
-the group definitions conflict with each other. There is no
-guarantee that the priority order is the one you intend.
-
Unlike ParameterHelper.define_group(), it can only be called once for each
-group_type, and not after any ParameterHelper.define_group() calls.
The name of parameters can be the group name previously defined in
-define_group or list_groups function. Aside from the group name,
-noise, cutoff_twobody, cutoff_threebody, and
-cutoff_manybody are reserved for the noise parameter
-and universal cutoffs, while sigma and lengthscale are
-reserved for universal signal variances and length scales.
-
For non-reserved keys, the value should be a list of 2 to 3 elements,
-corresponding to the sigma, lengthscale and cutoff (if the third one
-is defined). For reserved keys, the value should be a float number.
-
The parameter_dict and constraints should use the same set of keys.
-If a key in constraints is not used in parameter_dict, it will be ignored.
-
The value in the constraints can be either a single bool, which applies
-to all parameters, or a list of bools that apply to each parameter.
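The two accepted forms of a constraint value can be normalized as in the sketch below. `expand_constraints` is a hypothetical helper, not part of ParameterHelper, and the default of True for keys missing from constraints is an assumption made for illustration.

```python
def expand_constraints(parameter_dict, constraints):
    """Expand each constraint into one bool per parameter value."""
    expanded = {}
    for key, value in parameter_dict.items():
        n = len(value) if isinstance(value, (list, tuple)) else 1
        flag = constraints.get(key, True)  # assumption: optimize by default
        if isinstance(flag, bool):
            # A single bool applies to every parameter under this key.
            expanded[key] = [flag] * n
        else:
            if len(flag) != n:
                raise ValueError(f"{key}: expected {n} flags, got {len(flag)}")
            expanded[key] = list(flag)
    # Keys that appear only in `constraints` are ignored, as described above.
    return expanded

parameter_dict = {"twobody0": [1.0, 0.5, 4.0], "noise": 0.05}
constraints = {"twobody0": [True, True, False], "noise": False, "unused": True}
flags = expand_constraints(parameter_dict, constraints)
```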
opt (bool, list) – whether to optimize the parameter or not
-
-
-
-
The name of parameters can be the group name previously defined in
-define_group or list_groups function. Aside from the group name,
-noise, cutoff_twobody, cutoff_threebody, and
-cutoff_manybody are reserved for the noise parameter
-and universal cutoffs, while sigma and lengthscale are
-reserved for universal signal variances and length scales.
-
The optimization flag can be a single bool, which applies to all
-parameters under that name, or a list of bools that apply to each
-parameter.
parameters (list) – the sigma, lengthscale, and cutoff of each group.
-
opt (bool, list) – whether to optimize the parameter or not
-
-
-
-
The name of parameters can be the group name previously defined in
-define_group or list_groups function. Aside from the group name,
-noise, cutoff_twobody, cutoff_threebody, and
-cutoff_manybody are reserved for the noise parameter
-and universal cutoffs, while sigma and lengthscale are
-reserved for universal signal variances and length scales.
-
The parameter should be a list of 2-3 elements, for sigma,
-lengthscale (and cutoff if the third one is defined).
-
The optimization flag can be a single bool, which applies to all
-parameters, or a list of bools that apply to each parameter.