invcryrep package¶
Submodules¶
invcryrep.config module¶
invcryrep.invcryrep module¶
- class invcryrep.invcryrep.InvCryRep(atom_types=None, edge_indices=None, to_jimages=None, graph_method='econnn', check_results=False, optimizer='BFGS', fmax=0.2, steps=100)¶
Bases:
object
Invertible Crystal Representation (SLICES and labeled quotient graph)
- SLICES2formula(SLICES)¶
Convert a SLICES string to its chemical formula (to facilitate composition screening before SLICES2structure).
- Parameters:
SLICES (str) – A SLICES string.
- Returns:
Chemical formula.
- Return type:
str
- SLICES2structure(SLICES)¶
Convert a SLICES string back to its original crystal structure.
- Parameters:
SLICES (str) – A SLICES string.
- Returns:
A pymatgen Structure object. float: Energy per atom predicted with M3GNet.
- Return type:
Structure
- static all_distances(coords1, coords2)¶
Returns the distances between two lists of coordinates
- Parameters:
coords1 – First set of Cartesian coordinates.
coords2 – Second set of Cartesian coordinates.
- Returns:
- 2d array of Cartesian distances. E.g the distance between
coords1[i] and coords2[j] is distances[i,j]
- Return type:
np.array
- check_SLICES(SLICES, dupli_check=True)¶
Check if a slices string conforms to the proper syntax.
- Parameters:
SLICES (str) – A SLICES string.
dupli_check (bool, optional) – Flag to indicate whether to check if a SLICES has duplicate edges. Defaults to True.
- Returns:
Return True if a SLICES is syntaxlly valid.
- Return type:
bool
- check_element()¶
Make sure no atoms with atomic numbers higher than 86 (due to GFN-FF’s limitation).
- Returns:
Return True if all atoms with Z < 87.
- Return type:
bool
- static check_structural_validity(str1)¶
Check the structural validity of a Structure with minimal distance > 0.5 Ang.
- Parameters:
str1 (Structure) – Input Structure.
- Returns:
Return True if Structure is structurally valid.
- Return type:
bool
- cif2structure_graph(string)¶
Convert a cif string to a structure_graph.
- Parameters:
string (str) – String of a cif file.
- Returns:
Pymatgen structure_graph object.
- Return type:
StructureGraph
- convert_graph()¶
Convert self.edge_indices, self.to_jimages into networkx format.
- Returns:
x_dat. list: net_voltage(edge labels).
- Return type:
list
- static convert_params(x, ndim, cocycle_size, lattice_type, metric_tensor_std)¶
Extract metric tensor and cocycle rep from x vector.
- Parameters:
x (np.array) – Ndarray of metric tensor components and colattice vectors.
ndim (int) – Dimensionality of crystal structure corresponding to the labeled quotient graph.
cocycle_size (int) – Size of the cocycle vectors.
lattice_type (int) – Lattice type. 1: a=b=c, 21: a!=b=c, 22: b!=a=c, 23: c!=a=b , 3: a!=b!=c.
metric_tensor_std (np.array) – Metric tensor of the barycentric embedding.
- Returns:
Updated metric tensor based on colattice vectors, x. np.array: Cocycle rep, the bottom n-1 rows of the alpha matrix.
- Return type:
np.array
- file2structure_graph(filename)¶
Convert a file to a pymatgen structure_graph object.
- Parameters:
filename (str) – Filename.
- Returns:
Pymatgen structure_graph object.
- Return type:
StructureGraph
- from_SLICES(SLICES, fix_duplicate_edge=False)¶
Extract edge_indices, to_jimages and atom_types from decoding a SLICES string.
- Parameters:
SLICES (due to RNN's difficulty in learning long SLICES) – SLICES string.
fix_duplicate_edge (bool, optional) – Flag to indicate whether to fix duplicate edges in
SLICES –
- Raises:
Exception – Error: wrong edge indices.
Exception – Error: wrong edge label.
- from_cif(string)¶
Extract edge_indices, to_jimages and atom_types from a cif string.
- Parameters:
string (str) – String of a cif file.
- from_file(filename)¶
Extract edge_indices, to_jimages and atom_types from a file.
- Parameters:
filename (str) – Filename.
- from_structure(structure)¶
Extract edge_indices, to_jimages and atom_types from a pymatgen structure object.
- Parameters:
structure (Structure) – A pymatgen Structure.
- func(x, ndim, order, mat_target, colattice_inds, colattice_weights, cycle_rep, cycle_cocycle_I, num_nodes, shortest_path, spanning, uncovered_pair, uncovered_pair_lj, covered_pair_lj, vbond_param_ave_covered, vbond_param_ave, lattice_vectors_scaled, structure_species, angle_weight, repul, lattice_type, metric_tensor_std)¶
Objective function: sum squared differences between the inner products of the GFN-FF predicted geometry and the associated inner products (gjk) of the edges in the non-barycentric embedded net.
- Parameters:
x (np.array) – Ndarray of metric tensor components and colattice vectors.
ndim (int) – Dimensionality of crystal structure corresponding to the labeled quotient graph.
order (int) – Number of nodes of the labeled quotient graph.
mat_target (np.array) – Inner product matrix target calculated with GFNFF predicted geometry.
colattice_inds (list) – keep track of all the valid colattice dot indices.
colattice_weights (list) – Colattice weights for bond or angle.
cycle_cocycle_I (np.array) – The inverse of B matrix.
num_nodes (int) – Number of nodes of the labeled quotient graph(duplicate! not deleted due to compatibility of HTS scripts, will be deleted in future).
shortest_path (list) – Shortest path of the spanning graph of the labeled quotient graph.
spanning (list) – Spanning graph of the labeled quotient graph.
uncovered_pair (list) – Atom pairs not covered by edges of the structure graph.
covered_pair_lj (list) – lj parameters for atom pairs covered by edges of the structure graph.
vbond_param_ave_covered (float) – Repulsive potential well depth of atom pairs covered by edges of the structure graph.
vbond_param_ave (float) – Repulsive potential well depth of atom pairs not covered by edges of the structure graph.
structure_species (list) – Atom symbols of the labeled quotient graph.
angle_weight (float) – Weight of angular terms in the object function.
repul (bool) – Flag to indicate whether repulsive potential is considered in the object function.
lattice_type (int) – Lattice type. 1: a=b=c, 21: a!=b=c, 22: b!=a=c, 23: c!=a=b , 3: a!=b!=c.
metric_tensor_std (np.array) – Metric tensor of the barycentric embedding.
- Returns:
Value of the object function.
- Return type:
float
- func_check(x, ndim, order, mat_target, colattice_inds, colattice_weights, cycle_rep, cycle_cocycle_I, num_nodes, shortest_path, spanning, uncovered_pair, uncovered_pair_lj, covered_pair_lj, vbond_param_ave_covered, vbond_param_ave, lattice_vectors_scaled, structure_species, angle_weight, repul, lattice_type, metric_tensor_std)¶
Objective function: sum squared differences between the inner products of the GFN-FF predicted geometry and the associated inner products (gjk) of the edges in the non-barycentric embedded net. This version of func() will output debug info.
- Parameters:
x (np.array) – Ndarray of metric tensor components and colattice vectors.
ndim (int) – Dimensionality of crystal structure corresponding to the labeled quotient graph.
order (int) – Number of nodes of the labeled quotient graph.
mat_target (np.array) – Inner product matrix target calculated with GFNFF predicted geometry.
colattice_inds (list) – keep track of all the valid colattice dot indices.
colattice_weights (list) – Colattice weights for bond or angle.
cycle_cocycle_I (np.array) – The inverse of B matrix.
num_nodes (int) – Number of nodes of the labeled quotient graph(duplicate! not deleted due to compatibility of HTS scripts, will be deleted in future).
shortest_path (list) – Shortest path of the spanning graph of the labeled quotient graph.
spanning (list) – Spanning graph of the labeled quotient graph.
uncovered_pair (list) – Atom pairs not covered by edges of the structure graph.
covered_pair_lj (list) – lj parameters for atom pairs covered by edges of the structure graph.
vbond_param_ave_covered (float) – Repulsive potential well depth of atom pairs covered by edges of the structure graph.
vbond_param_ave (float) – Repulsive potential well depth of atom pairs not covered by edges of the structure graph.
structure_species (list) – Atom symbols of the labeled quotient graph.
angle_weight (float) – Weight of angular terms in the object function.
repul (bool) – Flag to indicate whether repulsive potential is considered in the object function.
lattice_type (int) – Lattice type. 1: a=b=c, 21: a!=b=c, 22: b!=a=c, 23: c!=a=b , 3: a!=b!=c.
metric_tensor_std (np.array) – Metric tensor of the barycentric embedding.
- Returns:
Value of the object function.
- Return type:
float
- get_canonical_SLICES(SLICES)¶
Convert a SLICES to its canonical form.
- Parameters:
SLICES (str) – A SLICES string.
- Returns:
The canonical SLICES string.
- Return type:
str
- static get_coordinates(arc_coord, num_nodes, shortest_path_spanning_graph, spanning)¶
Get fractional coordinates of atoms from fractional coordinates of edge vectors.
- Parameters:
arc_coord (np.array) – Edge vectors (fractional coords) of a labeled quotient graph.
num_nodes (int) – Number of atoms(nodes) of a labeled quotient graph.
shortest_path_spanning_graph (list) – Shortest path of the spanning graph of a labeled quotient graph.
spanning (list) – Spanning graph of a labeled quotient graph.
- Returns:
Fractional coordinates of atoms.
- Return type:
np.array
- get_covered_pair_lj()¶
Get atom pairs covered by edges of the structure graph stored in self.atom_types.
- Returns:
Atom pairs covered by edges of the structure graph.
- Return type:
list
- get_dim(structure)¶
Get the dimension of a Structure.
- Parameters:
structure (Structure) – A pymatgen Structure.
- Returns:
The dimension of a Structure.
- Return type:
int
- get_inner_p_target(bond_scaling=1.05)¶
Get inner product matrix, colattice indices, colattice weights.
Get inner_p_target(inner_p matrix obtained by gfnff).
Get colattice_inds(keep track of all the valid colattice dot indices).
Get colattice_weights(colattice weights for bond or angle).
- Parameters:
bond_scaling (float, optional) – Bond scaling factor. Defaults to 1.05.
- Returns:
Inner product matrix. list: Colattice indices. list: Colattice weights.
- Return type:
np.array
- get_inner_p_target_debug(bond_scaling=1.05)¶
Get inner product matrix, colattice indices, colattice weights with debug outputs.
Get inner_p_target(inner_p matrix obtained by gfnff).
Get colattice_inds(keep track of all the valid colattice dot indices).
Get colattice_weights(colattice weights for bond or angle).
- Parameters:
bond_scaling (float, optional) – Bond scaling factor. Defaults to 1.05.
- Returns:
Inner product matrix. list: Colattice indices. list: Colattice weights.
- Return type:
np.array
- get_nbf_blist()¶
Get nbf(neighbor list with atom types for xtb_mod). (2) Get blist(node indexes of the central unit cell edges in the 3*3*3 supercell).
- Returns:
nbf. np.array: blist.
- Return type:
str
- get_rescaled_lattice_vectors(inner_p_target, inner_p_std, lattice_vectors_std, arc_coord_std)¶
- Get rescaled_lattice_vectors based on the GFNFF bond lengths calculated using the
topological neighbor list as input.
- Parameters:
inner_p_target (np.array) – Inner product matrix obtained by GFNFF.
inner_p_std (np.array) – Inner product matrix of the barycentric embedding.
lattice_vectors_std (np.array) – Lattice vectors of the barycentric embedding.
arc_coord_std (np.array) – Edge vectors (fractional coords) of the barycentric embedding.
- Returns:
Rescaled lattice vectors.
- Return type:
np.array
- static get_uncovered_pair(graph)¶
- Get atom pairs not covered by edges of the structure graph. Assuming that all atoms
has been included in graph.
- Parameters:
graph (Graph) – Networkx graph.
- Returns:
Atom pairs not covered by edges of the structure graph.
- Return type:
list
- get_uncovered_pair_lj(uncovered_pair)¶
Get the lj parameters for atom pairs not covered by edges of the structure graph.
- Parameters:
uncovered_pair (list) – Atom pairs not covered by edges of the structure graph.
- Returns:
lj parameters for atom pairs not covered by edges of the structure graph.
- Return type:
list
- static initialize_x_bounds(ndim, cocycle_rep, metric_tensor_std, lattice_type, delta_theta, delta_x, lattice_expand, lattice_shrink)¶
Initialize x vectors and bounds based on metric_tensor_std, lattice_type and other settings.
- Parameters:
ndim (int) – Dimensionality of crystal structure corresponding to the labeled quotient graph.
cocycle_rep (np.array) – Cocycle rep, the bottom n-1 rows of the alpha matrix.
metric_tensor_std (np.array) – Metric tensor of the barycentric embedding.
lattice_type (int) – Lattice type. 1: a=b=c, 21: a!=b=c, 22: b!=a=c, 23: c!=a=b , 3: a!=b!=c.
delta_theta (float) – Angle change limit(deprecated! not deleted due to compatibility of HTS scripts, will be deleted in future).
delta_x (float) – Maximum x change allowed.
lattice_expand (float) – Maximum lattice expansion allowed.
lattice_shrink (float) – Maximum lattice shrinkage allowed.
- Returns:
Intitial value of x. list: Upper and lower bounds of x.
- Return type:
np.array
- m3gnet_relax(struc)¶
Cell optimization using M3GNet IAPs (time limit is set to XX seconds to prevent buggy cell optimization that takes fovever to finish).
- Parameters:
struc (Structure) – A pymatgen Structure object.
- Returns:
Optimized pymatgen Structure object with M3GNet IAP. float: Energy per atom predicted with M3GNet.
- Return type:
Structure
- match_check(ori, opt, std, ltol=0.2, stol=0.3, angle_tol=5)¶
Calculate match rates of structure (2) and (1) with respect to the original structure.
Calculate topological(Jaccard) distances of structuregraph of structure (2) and (1) with respect to structuregraph of the original structure.
- match_check2(ori, opt, std, ltol=0.2, stol=0.3, angle_tol=5)¶
Calculate match rates of structure (2) and (1) with respect to the original structure.
- match_check3(ori, opt2, opt, std, ltol=0.2, stol=0.3, angle_tol=5)¶
Calculate match rates of structure (3), (2) and (1) with respect to the original structure.
Calculate topological distances of structuregraph of structure (3), (2) and (1) with respect to structuregraph of the original structure.
- match_check4(ori, opt2, opt, std2, std, ltol=0.2, stol=0.3, angle_tol=5)¶
- Calculate match rates of structure (3), (2), (4) and (1) with respect to the
original structure.
Calculate topological distances of structuregraph of structure (3), (2), (4) and (1) with respect to structuregraph of the original structure.
- Parameters:
ori (Structure) – Original Structure.
opt2 (Structure) – IAP-optimized Structure.
opt (Structure) – ZL*-Optimized Structure.
std2 (Structure) – IAP-optimized rescaled Structure.
std (Structure) – Rescaled Structure.
ltol (float, optional) – Fractional length tolerance. Default is 0.2.
stol (float, optional) – Site tolerance. Defined as the fraction of the average free length per atom := ( V / Nsites ) ** (1/3). Default is 0.3.
angle_tol (int, optional) – Angle tolerance in degrees. Default is 5.
- Returns:
“1” if IAP-optimized Structure matches original Structure. “0” otherwise. str: “1” if ZL*-Optimized Structure matches original Structure. “0” otherwise. str: “1” if IAP-optimized rescaled Structure matches original Structure. “0” otherwise. str: “1” if Rescaled Structure matches original Structure. “0” otherwise. str: The topological distance between IAP-optimized Structure and original Structure. str: The topological distance between ZL*-Optimized Structure and original Structure. str: The topological distance between IAP-optimized Rescaled Structure and original Structure. str: The topological distance between Rescaled Structure and original Structure.
- Return type:
str
- structure2SLICES(structure, strategy=3)¶
- Extract edge_indices, to_jimages and atom_types from a pymatgen structure object
then encode them into a SLICES string.
- Parameters:
structure (Structure) – A pymatgen Structure.
strategy (int, optional) – Strategy number. Defaults to 3.
- Returns:
A SLICES string.
- Return type:
str
- structure2SLICESAug(structure, strategy=3, num=200)¶
Convert Structure to SLICES and conduct data augmentation.
extract edge_indices, to_jimages and atom_types from a pymatgen structure object
encoding edge_indices, to_jimages and atom_types into multiple equalivent SLICES strings with a data augmentation scheme
- Parameters:
structure (Structure) – A pymatgen Structure.
strategy (int, optional) – Strategy number. Defaults to 3.
num (int, optional) – Increase the dataset size by a magnitude of num. Defaults to 200.
- Returns:
A list of num SLICES strings.
- Return type:
list
- structure2crystal_graph_rep(structure)¶
- convert a pymatgen structure object into the crystal graph representation:
atom_types, edge_indices, to_jimages.
- Parameters:
structure (_type_) – _description_
- Returns:
Atom types. np.array: Edge indices. np.array: Edge labels.
- Return type:
np.array
- structure2structure_graph(structure)¶
Convert a pymatgen structure to a structure_graph.
- Parameters:
structure (Structure) – A pymatgen Structure.
- Returns:
A Pymatgen StructureGraph object.
- Return type:
StructureGraph
- to_4structures(bond_scaling=1.05, delta_theta=0.005, delta_x=0.45, lattice_shrink=1, lattice_expand=1.25, angle_weight=0.5, vbond_param_ave_covered=0.0, vbond_param_ave=0.01, repul=True)¶
Designed for benchmark.
- to_SLICES()¶
Output a SLICES string based on self.atom_types & self.edge_indices & self.to_jimages. :returns: SLICES string. :rtype: str
- to_relaxed_structure(bond_scaling=1.05, delta_theta=0.005, delta_x=0.45, lattice_shrink=1, lattice_expand=1.25, angle_weight=0.5, vbond_param_ave_covered=0.0, vbond_param_ave=0.01, repul=True)¶
Convert edge_indices, to_jimages and atom_types stored in the InvCryRep instance back to a pymatgen structure object: non-barycentric net embedding undergone cell optimization using M3GNet IAPs. If cell optimization failed, then raise error.
- to_structure(bond_scaling=1.05, delta_theta=0.005, delta_x=0.45, lattice_shrink=1, lattice_expand=1.25, angle_weight=0.5, vbond_param_ave_covered=0.0, vbond_param_ave=0.01, repul=True)¶
Convert edge_indices, to_jimages and atom_types stored in the InvCryRep instance back to a pymatgen structure object: non-barycentric net embedding undergone cell optimization using M3GNet IAPs. If cell optimization failed, then output non-barycentric net embedding that matches bond lengths and bond angles predicted with modified GFN-FF.
- to_structures(bond_scaling=1.05, delta_theta=0.005, delta_x=0.45, lattice_shrink=1, lattice_expand=1.25, angle_weight=0.5, vbond_param_ave_covered=0.0, vbond_param_ave=0.01, repul=True)¶
The inverse transform of the crystal graph of a SLICES string to its crystal structure. Convert edge_indices, to_jimages and atom_types stored in the InvCryRep instance back to 3 pymatgen structure objects and the energy per atom predicted with M3GNet.
barycentric embedding net with rescaled lattice based on the average bond scaling factors
calculated with modified GFN-FF predicted geometry
non-barycentric net embedding that matches bond lengths and bond angles predicted with
modified GFN-FF
(3) non-barycentric net embedding undergone cell optimization using M3GNet IAPs if cell optimization failed, then output (1) and (2)
- Parameters:
bond_scaling (float, optional) – Bond scaling factor. Defaults to 1.05.
delta_theta (float) – Angle change limit(deprecated! not deleted due to compatibility of HTS scripts, will be deleted in future).
delta_x (float, optional) – Maximum x change allowed. Defaults to 0.45.
lattice_shrink (int, optional) – Maximum lattice shrinkage allowed. Defaults to 1.
lattice_expand (float, optional) – Maximum lattice expansion allowed. Defaults to 1.25.
angle_weight (float, optional) – Weight of angular terms in the object function. Defaults to 0.5.
vbond_param_ave_covered (float, optional) – Repulsive potential well depth of atom pairs covered by edges of the structure graph. Defaults to 0.00.
vbond_param_ave (float, optional) – Repulsive potential well depth of atom pairs not covered by edges of the structure graph. Defaults to 0.01.
repul (bool, optional) – Flag to indicate whether repulsive potential is considered in the object function. Defaults to True.
- Returns:
[Rescaled Structure, ZL*-optimized Structure, IAP-optimized Structure] float: Energy per atom predicted with M3GNet.
- Return type:
list
- invcryrep.invcryrep.function_timeout(seconds: int)¶
Define a decorator that sets a time limit for a function call.
- Parameters:
seconds (int) – Time limit.
- Raises:
SystemExit – Timeout exception.
- Returns:
Timeout Decorator.
- Return type:
Decorator
invcryrep.tobascco_net module¶
- class invcryrep.tobascco_net.Net(graph=None, dim=3, options=None)¶
Bases:
object
- add_edge(v1, v2, name)¶
- add_edges_between(edge, N)¶
- add_name()¶
- add_to_array(vect, rep)¶
Works assuming the dimensions are the same
- add_vertex(v)¶
- all_edges()¶
- assign_ip_matrix(mat, inds)¶
Get the colattice dot matrix from Builder.py. This is an inner product matrix of all the SBUs assigned to particular nodes.
- barycentric_embedding()¶
- check_linear_dependency(vect, vset)¶
- convert_params(x, ndim, angle_inds, cocycle_size)¶
- property cycle_cocycle¶
- property cycle_cocycle_I¶
- cycle_cocycle_check(vect)¶
- debug_print(val, msg)¶
- delete_edge(e)¶
- edges_iter(data=True)¶
- property eon_projection¶
- get_2d_params()¶
- get_3d_params()¶
- get_cocycle_basis()¶
The orientation is important here!
- get_cycle_basis()¶
Find the basis for the cycle vectors. The total number of cycle vectors in the basis is E - V + 1 (see n below). Once this number of cycle vectors is found, the program returns.
NB: Currently the cycle vectors associated with the lattice basis are included in the cycle basis - this is so that the embedding of the barycentric placement of the net works out properly. Thus the function self.get_lattice_basis() should be called prior to this.
- get_index(edge)¶
- get_lattice_basis()¶
- get_metric_tensor()¶
- get_voltage(cycle)¶
- property graph¶
- in_edges(vertex)¶
- indices_with_voltage(volt)¶
- insert_and_join(vfrom, vto, edge_label=None)¶
- is_integral(vect)¶
- iter_cycles(node=None, edge=None, cycle=[], used=[], nodes_visited=[], cycle_baggage=[], counter=0)¶
Recursive method to iterate over all cycles of a graph. NB: Not tested to ensure completeness, however it does find cycles. NB: Likely produces duplicate cycles along different starting points last point fixed but not tested
- property kernel¶
- property lattice_arcs¶
- linear_independent_vectors(R, dim)¶
- loop_edges()¶
- property minimal¶
- neighbours(vertex)¶
- nodes_iter(data=True)¶
Oh man, fixing to networkx 2.0
This probably breaks a lot of stuff in the code. THANKS NETWORKX!!!!!!!1
- property order¶
- out_edges(vertex)¶
- print_edge_count()¶
- property projection¶
- report_errors(fit)¶
- report_errors_nlopt()¶
- return_coeff(edges)¶
- return_indices(edges)¶
- property shape¶
- simple_cycle_basis()¶
Cycle basis is constructed using a minimum spanning tree. This tree is traversed, and all the remaining edges are added to obtain the basis.
- to_ind(str_obj)¶
- vertex_positions(edges, used, pos={}, bad_ones={})¶
Recursive function to find the nodes in the unit cell. How it should be done:
Create a growing tree around the init placed vertex. Evaluate which vertices wind up in the unit cell and place them. Continue growing from those vertices in the unit cell until all are found.
- vertices(vertex=None)¶
- class invcryrep.tobascco_net.SystreDB(filename=None)¶
Bases:
dict
A dictionary which reads a file of the same format read by Systre
- Nd_chunks(list, dim)¶
- gen_networkx_graph_format(edges, dim=3)¶
Take the edges from a systre db file and convert to a networkx graph readable format.
Assumes that the direction of the edge goes from [node1] —> [node2]
- gen_sage_graph_format(edges)¶
Take the edges from a systre db file and convert to sage graph readable format.
Assumes that the direction of the edge goes from [node1] —> [node2]
- get_key(block)¶
- get_name(block)¶
- read_store_file(file=None)¶
Reads and stores the nets in the self.file file. Note, this is specific to a systre.arc file and may be subject to change in the future depending on the developments ODF makes on Systre.