invcryrep package

Submodules

invcryrep.config module

invcryrep.invcryrep module

class invcryrep.invcryrep.InvCryRep(atom_types=None, edge_indices=None, to_jimages=None, graph_method='econnn', check_results=False, optimizer='BFGS', fmax=0.2, steps=100)

Bases: object

Invertible Crystal Representation (SLICES and labeled quotient graph)

SLICES2formula(SLICES)

Convert a SLICES string to its chemical formula (to facilitate composition screening before SLICES2structure).

Parameters:

SLICES (str) – A SLICES string.

Returns:

Chemical formula.

Return type:

str

SLICES2structure(SLICES)

Convert a SLICES string back to its original crystal structure.

Parameters:

SLICES (str) – A SLICES string.

Returns:

A pymatgen Structure object. float: Energy per atom predicted with M3GNet.

Return type:

Structure

static all_distances(coords1, coords2)

Returns the distances between two lists of coordinates

Parameters:
  • coords1 – First set of Cartesian coordinates.

  • coords2 – Second set of Cartesian coordinates.

Returns:

2d array of Cartesian distances. E.g the distance between

coords1[i] and coords2[j] is distances[i,j]

Return type:

np.array

check_SLICES(SLICES, dupli_check=True)

Check if a slices string conforms to the proper syntax.

Parameters:
  • SLICES (str) – A SLICES string.

  • dupli_check (bool, optional) – Flag to indicate whether to check if a SLICES has duplicate edges. Defaults to True.

Returns:

Return True if a SLICES is syntaxlly valid.

Return type:

bool

check_element()

Make sure no atoms with atomic numbers higher than 86 (due to GFN-FF’s limitation).

Returns:

Return True if all atoms with Z < 87.

Return type:

bool

static check_structural_validity(str1)

Check the structural validity of a Structure with minimal distance > 0.5 Ang.

Parameters:

str1 (Structure) – Input Structure.

Returns:

Return True if Structure is structurally valid.

Return type:

bool

cif2structure_graph(string)

Convert a cif string to a structure_graph.

Parameters:

string (str) – String of a cif file.

Returns:

Pymatgen structure_graph object.

Return type:

StructureGraph

convert_graph()

Convert self.edge_indices, self.to_jimages into networkx format.

Returns:

x_dat. list: net_voltage(edge labels).

Return type:

list

static convert_params(x, ndim, cocycle_size, lattice_type, metric_tensor_std)

Extract metric tensor and cocycle rep from x vector.

Parameters:
  • x (np.array) – Ndarray of metric tensor components and colattice vectors.

  • ndim (int) – Dimensionality of crystal structure corresponding to the labeled quotient graph.

  • cocycle_size (int) – Size of the cocycle vectors.

  • lattice_type (int) – Lattice type. 1: a=b=c, 21: a!=b=c, 22: b!=a=c, 23: c!=a=b , 3: a!=b!=c.

  • metric_tensor_std (np.array) – Metric tensor of the barycentric embedding.

Returns:

Updated metric tensor based on colattice vectors, x. np.array: Cocycle rep, the bottom n-1 rows of the alpha matrix.

Return type:

np.array

file2structure_graph(filename)

Convert a file to a pymatgen structure_graph object.

Parameters:

filename (str) – Filename.

Returns:

Pymatgen structure_graph object.

Return type:

StructureGraph

from_SLICES(SLICES, fix_duplicate_edge=False)

Extract edge_indices, to_jimages and atom_types from decoding a SLICES string.

Parameters:
  • SLICES (due to RNN's difficulty in learning long SLICES) – SLICES string.

  • fix_duplicate_edge (bool, optional) – Flag to indicate whether to fix duplicate edges in

  • SLICES

Raises:
  • Exception – Error: wrong edge indices.

  • Exception – Error: wrong edge label.

from_cif(string)

Extract edge_indices, to_jimages and atom_types from a cif string.

Parameters:

string (str) – String of a cif file.

from_file(filename)

Extract edge_indices, to_jimages and atom_types from a file.

Parameters:

filename (str) – Filename.

from_structure(structure)

Extract edge_indices, to_jimages and atom_types from a pymatgen structure object.

Parameters:

structure (Structure) – A pymatgen Structure.

func(x, ndim, order, mat_target, colattice_inds, colattice_weights, cycle_rep, cycle_cocycle_I, num_nodes, shortest_path, spanning, uncovered_pair, uncovered_pair_lj, covered_pair_lj, vbond_param_ave_covered, vbond_param_ave, lattice_vectors_scaled, structure_species, angle_weight, repul, lattice_type, metric_tensor_std)

Objective function: sum squared differences between the inner products of the GFN-FF predicted geometry and the associated inner products (gjk) of the edges in the non-barycentric embedded net.

Parameters:
  • x (np.array) – Ndarray of metric tensor components and colattice vectors.

  • ndim (int) – Dimensionality of crystal structure corresponding to the labeled quotient graph.

  • order (int) – Number of nodes of the labeled quotient graph.

  • mat_target (np.array) – Inner product matrix target calculated with GFNFF predicted geometry.

  • colattice_inds (list) – keep track of all the valid colattice dot indices.

  • colattice_weights (list) – Colattice weights for bond or angle.

  • cycle_cocycle_I (np.array) – The inverse of B matrix.

  • num_nodes (int) – Number of nodes of the labeled quotient graph(duplicate! not deleted due to compatibility of HTS scripts, will be deleted in future).

  • shortest_path (list) – Shortest path of the spanning graph of the labeled quotient graph.

  • spanning (list) – Spanning graph of the labeled quotient graph.

  • uncovered_pair (list) – Atom pairs not covered by edges of the structure graph.

  • covered_pair_lj (list) – lj parameters for atom pairs covered by edges of the structure graph.

  • vbond_param_ave_covered (float) – Repulsive potential well depth of atom pairs covered by edges of the structure graph.

  • vbond_param_ave (float) – Repulsive potential well depth of atom pairs not covered by edges of the structure graph.

  • structure_species (list) – Atom symbols of the labeled quotient graph.

  • angle_weight (float) – Weight of angular terms in the object function.

  • repul (bool) – Flag to indicate whether repulsive potential is considered in the object function.

  • lattice_type (int) – Lattice type. 1: a=b=c, 21: a!=b=c, 22: b!=a=c, 23: c!=a=b , 3: a!=b!=c.

  • metric_tensor_std (np.array) – Metric tensor of the barycentric embedding.

Returns:

Value of the object function.

Return type:

float

func_check(x, ndim, order, mat_target, colattice_inds, colattice_weights, cycle_rep, cycle_cocycle_I, num_nodes, shortest_path, spanning, uncovered_pair, uncovered_pair_lj, covered_pair_lj, vbond_param_ave_covered, vbond_param_ave, lattice_vectors_scaled, structure_species, angle_weight, repul, lattice_type, metric_tensor_std)

Objective function: sum squared differences between the inner products of the GFN-FF predicted geometry and the associated inner products (gjk) of the edges in the non-barycentric embedded net. This version of func() will output debug info.

Parameters:
  • x (np.array) – Ndarray of metric tensor components and colattice vectors.

  • ndim (int) – Dimensionality of crystal structure corresponding to the labeled quotient graph.

  • order (int) – Number of nodes of the labeled quotient graph.

  • mat_target (np.array) – Inner product matrix target calculated with GFNFF predicted geometry.

  • colattice_inds (list) – keep track of all the valid colattice dot indices.

  • colattice_weights (list) – Colattice weights for bond or angle.

  • cycle_cocycle_I (np.array) – The inverse of B matrix.

  • num_nodes (int) – Number of nodes of the labeled quotient graph(duplicate! not deleted due to compatibility of HTS scripts, will be deleted in future).

  • shortest_path (list) – Shortest path of the spanning graph of the labeled quotient graph.

  • spanning (list) – Spanning graph of the labeled quotient graph.

  • uncovered_pair (list) – Atom pairs not covered by edges of the structure graph.

  • covered_pair_lj (list) – lj parameters for atom pairs covered by edges of the structure graph.

  • vbond_param_ave_covered (float) – Repulsive potential well depth of atom pairs covered by edges of the structure graph.

  • vbond_param_ave (float) – Repulsive potential well depth of atom pairs not covered by edges of the structure graph.

  • structure_species (list) – Atom symbols of the labeled quotient graph.

  • angle_weight (float) – Weight of angular terms in the object function.

  • repul (bool) – Flag to indicate whether repulsive potential is considered in the object function.

  • lattice_type (int) – Lattice type. 1: a=b=c, 21: a!=b=c, 22: b!=a=c, 23: c!=a=b , 3: a!=b!=c.

  • metric_tensor_std (np.array) – Metric tensor of the barycentric embedding.

Returns:

Value of the object function.

Return type:

float

get_canonical_SLICES(SLICES)

Convert a SLICES to its canonical form.

Parameters:

SLICES (str) – A SLICES string.

Returns:

The canonical SLICES string.

Return type:

str

static get_coordinates(arc_coord, num_nodes, shortest_path_spanning_graph, spanning)

Get fractional coordinates of atoms from fractional coordinates of edge vectors.

Parameters:
  • arc_coord (np.array) – Edge vectors (fractional coords) of a labeled quotient graph.

  • num_nodes (int) – Number of atoms(nodes) of a labeled quotient graph.

  • shortest_path_spanning_graph (list) – Shortest path of the spanning graph of a labeled quotient graph.

  • spanning (list) – Spanning graph of a labeled quotient graph.

Returns:

Fractional coordinates of atoms.

Return type:

np.array

get_covered_pair_lj()

Get atom pairs covered by edges of the structure graph stored in self.atom_types.

Returns:

Atom pairs covered by edges of the structure graph.

Return type:

list

get_dim(structure)

Get the dimension of a Structure.

Parameters:

structure (Structure) – A pymatgen Structure.

Returns:

The dimension of a Structure.

Return type:

int

get_inner_p_target(bond_scaling=1.05)

Get inner product matrix, colattice indices, colattice weights.

  1. Get inner_p_target(inner_p matrix obtained by gfnff).

  2. Get colattice_inds(keep track of all the valid colattice dot indices).

  3. Get colattice_weights(colattice weights for bond or angle).

Parameters:

bond_scaling (float, optional) – Bond scaling factor. Defaults to 1.05.

Returns:

Inner product matrix. list: Colattice indices. list: Colattice weights.

Return type:

np.array

get_inner_p_target_debug(bond_scaling=1.05)

Get inner product matrix, colattice indices, colattice weights with debug outputs.

  1. Get inner_p_target(inner_p matrix obtained by gfnff).

  2. Get colattice_inds(keep track of all the valid colattice dot indices).

  3. Get colattice_weights(colattice weights for bond or angle).

Parameters:

bond_scaling (float, optional) – Bond scaling factor. Defaults to 1.05.

Returns:

Inner product matrix. list: Colattice indices. list: Colattice weights.

Return type:

np.array

get_nbf_blist()
  1. Get nbf(neighbor list with atom types for xtb_mod). (2) Get blist(node indexes of the central unit cell edges in the 3*3*3 supercell).

Returns:

nbf. np.array: blist.

Return type:

str

get_rescaled_lattice_vectors(inner_p_target, inner_p_std, lattice_vectors_std, arc_coord_std)
Get rescaled_lattice_vectors based on the GFNFF bond lengths calculated using the

topological neighbor list as input.

Parameters:
  • inner_p_target (np.array) – Inner product matrix obtained by GFNFF.

  • inner_p_std (np.array) – Inner product matrix of the barycentric embedding.

  • lattice_vectors_std (np.array) – Lattice vectors of the barycentric embedding.

  • arc_coord_std (np.array) – Edge vectors (fractional coords) of the barycentric embedding.

Returns:

Rescaled lattice vectors.

Return type:

np.array

static get_uncovered_pair(graph)
Get atom pairs not covered by edges of the structure graph. Assuming that all atoms

has been included in graph.

Parameters:

graph (Graph) – Networkx graph.

Returns:

Atom pairs not covered by edges of the structure graph.

Return type:

list

get_uncovered_pair_lj(uncovered_pair)

Get the lj parameters for atom pairs not covered by edges of the structure graph.

Parameters:

uncovered_pair (list) – Atom pairs not covered by edges of the structure graph.

Returns:

lj parameters for atom pairs not covered by edges of the structure graph.

Return type:

list

static initialize_x_bounds(ndim, cocycle_rep, metric_tensor_std, lattice_type, delta_theta, delta_x, lattice_expand, lattice_shrink)

Initialize x vectors and bounds based on metric_tensor_std, lattice_type and other settings.

Parameters:
  • ndim (int) – Dimensionality of crystal structure corresponding to the labeled quotient graph.

  • cocycle_rep (np.array) – Cocycle rep, the bottom n-1 rows of the alpha matrix.

  • metric_tensor_std (np.array) – Metric tensor of the barycentric embedding.

  • lattice_type (int) – Lattice type. 1: a=b=c, 21: a!=b=c, 22: b!=a=c, 23: c!=a=b , 3: a!=b!=c.

  • delta_theta (float) – Angle change limit(deprecated! not deleted due to compatibility of HTS scripts, will be deleted in future).

  • delta_x (float) – Maximum x change allowed.

  • lattice_expand (float) – Maximum lattice expansion allowed.

  • lattice_shrink (float) – Maximum lattice shrinkage allowed.

Returns:

Intitial value of x. list: Upper and lower bounds of x.

Return type:

np.array

m3gnet_relax(struc)

Cell optimization using M3GNet IAPs (time limit is set to XX seconds to prevent buggy cell optimization that takes fovever to finish).

Parameters:

struc (Structure) – A pymatgen Structure object.

Returns:

Optimized pymatgen Structure object with M3GNet IAP. float: Energy per atom predicted with M3GNet.

Return type:

Structure

match_check(ori, opt, std, ltol=0.2, stol=0.3, angle_tol=5)
  1. Calculate match rates of structure (2) and (1) with respect to the original structure.

  2. Calculate topological(Jaccard) distances of structuregraph of structure (2) and (1) with respect to structuregraph of the original structure.

match_check2(ori, opt, std, ltol=0.2, stol=0.3, angle_tol=5)

Calculate match rates of structure (2) and (1) with respect to the original structure.

match_check3(ori, opt2, opt, std, ltol=0.2, stol=0.3, angle_tol=5)
  1. Calculate match rates of structure (3), (2) and (1) with respect to the original structure.

  2. Calculate topological distances of structuregraph of structure (3), (2) and (1) with respect to structuregraph of the original structure.

match_check4(ori, opt2, opt, std2, std, ltol=0.2, stol=0.3, angle_tol=5)
  1. Calculate match rates of structure (3), (2), (4) and (1) with respect to the

    original structure.

    1. Calculate topological distances of structuregraph of structure (3), (2), (4) and (1) with respect to structuregraph of the original structure.

Parameters:
  • ori (Structure) – Original Structure.

  • opt2 (Structure) – IAP-optimized Structure.

  • opt (Structure) – ZL*-Optimized Structure.

  • std2 (Structure) – IAP-optimized rescaled Structure.

  • std (Structure) – Rescaled Structure.

  • ltol (float, optional) – Fractional length tolerance. Default is 0.2.

  • stol (float, optional) – Site tolerance. Defined as the fraction of the average free length per atom := ( V / Nsites ) ** (1/3). Default is 0.3.

  • angle_tol (int, optional) – Angle tolerance in degrees. Default is 5.

Returns:

“1” if IAP-optimized Structure matches original Structure. “0” otherwise. str: “1” if ZL*-Optimized Structure matches original Structure. “0” otherwise. str: “1” if IAP-optimized rescaled Structure matches original Structure. “0” otherwise. str: “1” if Rescaled Structure matches original Structure. “0” otherwise. str: The topological distance between IAP-optimized Structure and original Structure. str: The topological distance between ZL*-Optimized Structure and original Structure. str: The topological distance between IAP-optimized Rescaled Structure and original Structure. str: The topological distance between Rescaled Structure and original Structure.

Return type:

str

structure2SLICES(structure, strategy=3)
Extract edge_indices, to_jimages and atom_types from a pymatgen structure object

then encode them into a SLICES string.

Parameters:
  • structure (Structure) – A pymatgen Structure.

  • strategy (int, optional) – Strategy number. Defaults to 3.

Returns:

A SLICES string.

Return type:

str

structure2SLICESAug(structure, strategy=3, num=200)

Convert Structure to SLICES and conduct data augmentation.

  1. extract edge_indices, to_jimages and atom_types from a pymatgen structure object

  2. encoding edge_indices, to_jimages and atom_types into multiple equalivent SLICES strings with a data augmentation scheme

Parameters:
  • structure (Structure) – A pymatgen Structure.

  • strategy (int, optional) – Strategy number. Defaults to 3.

  • num (int, optional) – Increase the dataset size by a magnitude of num. Defaults to 200.

Returns:

A list of num SLICES strings.

Return type:

list

structure2crystal_graph_rep(structure)
convert a pymatgen structure object into the crystal graph representation:

atom_types, edge_indices, to_jimages.

Parameters:

structure (_type_) – _description_

Returns:

Atom types. np.array: Edge indices. np.array: Edge labels.

Return type:

np.array

structure2structure_graph(structure)

Convert a pymatgen structure to a structure_graph.

Parameters:

structure (Structure) – A pymatgen Structure.

Returns:

A Pymatgen StructureGraph object.

Return type:

StructureGraph

to_4structures(bond_scaling=1.05, delta_theta=0.005, delta_x=0.45, lattice_shrink=1, lattice_expand=1.25, angle_weight=0.5, vbond_param_ave_covered=0.0, vbond_param_ave=0.01, repul=True)

Designed for benchmark.

to_SLICES()

Output a SLICES string based on self.atom_types & self.edge_indices & self.to_jimages. :returns: SLICES string. :rtype: str

to_relaxed_structure(bond_scaling=1.05, delta_theta=0.005, delta_x=0.45, lattice_shrink=1, lattice_expand=1.25, angle_weight=0.5, vbond_param_ave_covered=0.0, vbond_param_ave=0.01, repul=True)

Convert edge_indices, to_jimages and atom_types stored in the InvCryRep instance back to a pymatgen structure object: non-barycentric net embedding undergone cell optimization using M3GNet IAPs. If cell optimization failed, then raise error.

to_structure(bond_scaling=1.05, delta_theta=0.005, delta_x=0.45, lattice_shrink=1, lattice_expand=1.25, angle_weight=0.5, vbond_param_ave_covered=0.0, vbond_param_ave=0.01, repul=True)

Convert edge_indices, to_jimages and atom_types stored in the InvCryRep instance back to a pymatgen structure object: non-barycentric net embedding undergone cell optimization using M3GNet IAPs. If cell optimization failed, then output non-barycentric net embedding that matches bond lengths and bond angles predicted with modified GFN-FF.

to_structures(bond_scaling=1.05, delta_theta=0.005, delta_x=0.45, lattice_shrink=1, lattice_expand=1.25, angle_weight=0.5, vbond_param_ave_covered=0.0, vbond_param_ave=0.01, repul=True)

The inverse transform of the crystal graph of a SLICES string to its crystal structure. Convert edge_indices, to_jimages and atom_types stored in the InvCryRep instance back to 3 pymatgen structure objects and the energy per atom predicted with M3GNet.

  1. barycentric embedding net with rescaled lattice based on the average bond scaling factors

calculated with modified GFN-FF predicted geometry

  1. non-barycentric net embedding that matches bond lengths and bond angles predicted with

modified GFN-FF

(3) non-barycentric net embedding undergone cell optimization using M3GNet IAPs if cell optimization failed, then output (1) and (2)

Parameters:
  • bond_scaling (float, optional) – Bond scaling factor. Defaults to 1.05.

  • delta_theta (float) – Angle change limit(deprecated! not deleted due to compatibility of HTS scripts, will be deleted in future).

  • delta_x (float, optional) – Maximum x change allowed. Defaults to 0.45.

  • lattice_shrink (int, optional) – Maximum lattice shrinkage allowed. Defaults to 1.

  • lattice_expand (float, optional) – Maximum lattice expansion allowed. Defaults to 1.25.

  • angle_weight (float, optional) – Weight of angular terms in the object function. Defaults to 0.5.

  • vbond_param_ave_covered (float, optional) – Repulsive potential well depth of atom pairs covered by edges of the structure graph. Defaults to 0.00.

  • vbond_param_ave (float, optional) – Repulsive potential well depth of atom pairs not covered by edges of the structure graph. Defaults to 0.01.

  • repul (bool, optional) – Flag to indicate whether repulsive potential is considered in the object function. Defaults to True.

Returns:

[Rescaled Structure, ZL*-optimized Structure, IAP-optimized Structure] float: Energy per atom predicted with M3GNet.

Return type:

list

invcryrep.invcryrep.function_timeout(seconds: int)

Define a decorator that sets a time limit for a function call.

Parameters:

seconds (int) – Time limit.

Raises:

SystemExit – Timeout exception.

Returns:

Timeout Decorator.

Return type:

Decorator

invcryrep.tobascco_net module

class invcryrep.tobascco_net.Net(graph=None, dim=3, options=None)

Bases: object

add_edge(v1, v2, name)
add_edges_between(edge, N)
add_name()
add_to_array(vect, rep)

Works assuming the dimensions are the same

add_vertex(v)
all_edges()
assign_ip_matrix(mat, inds)

Get the colattice dot matrix from Builder.py. This is an inner product matrix of all the SBUs assigned to particular nodes.

barycentric_embedding()
check_linear_dependency(vect, vset)
convert_params(x, ndim, angle_inds, cocycle_size)
property cycle_cocycle
property cycle_cocycle_I
cycle_cocycle_check(vect)
debug_print(val, msg)
delete_edge(e)
edges_iter(data=True)
property eon_projection
get_2d_params()
get_3d_params()
get_cocycle_basis()

The orientation is important here!

get_cycle_basis()

Find the basis for the cycle vectors. The total number of cycle vectors in the basis is E - V + 1 (see n below). Once this number of cycle vectors is found, the program returns.

NB: Currently the cycle vectors associated with the lattice basis are included in the cycle basis - this is so that the embedding of the barycentric placement of the net works out properly. Thus the function self.get_lattice_basis() should be called prior to this.

get_index(edge)
get_lattice_basis()
get_metric_tensor()
get_voltage(cycle)
property graph
in_edges(vertex)
indices_with_voltage(volt)
insert_and_join(vfrom, vto, edge_label=None)
is_integral(vect)
iter_cycles(node=None, edge=None, cycle=[], used=[], nodes_visited=[], cycle_baggage=[], counter=0)

Recursive method to iterate over all cycles of a graph. NB: Not tested to ensure completeness, however it does find cycles. NB: Likely produces duplicate cycles along different starting points last point fixed but not tested

property kernel
property lattice_arcs
linear_independent_vectors(R, dim)
loop_edges()
property minimal
neighbours(vertex)
nodes_iter(data=True)

Oh man, fixing to networkx 2.0

This probably breaks a lot of stuff in the code. THANKS NETWORKX!!!!!!!1

property order
out_edges(vertex)
print_edge_count()
property projection
report_errors(fit)
report_errors_nlopt()
return_coeff(edges)
return_indices(edges)
property shape
simple_cycle_basis()

Cycle basis is constructed using a minimum spanning tree. This tree is traversed, and all the remaining edges are added to obtain the basis.

to_ind(str_obj)
vertex_positions(edges, used, pos={}, bad_ones={})

Recursive function to find the nodes in the unit cell. How it should be done:

Create a growing tree around the init placed vertex. Evaluate which vertices wind up in the unit cell and place them. Continue growing from those vertices in the unit cell until all are found.

vertices(vertex=None)
class invcryrep.tobascco_net.SystreDB(filename=None)

Bases: dict

A dictionary which reads a file of the same format read by Systre

Nd_chunks(list, dim)
gen_networkx_graph_format(edges, dim=3)

Take the edges from a systre db file and convert to a networkx graph readable format.

Assumes that the direction of the edge goes from [node1] —> [node2]

gen_sage_graph_format(edges)

Take the edges from a systre db file and convert to sage graph readable format.

Assumes that the direction of the edge goes from [node1] —> [node2]

get_key(block)
get_name(block)
read_store_file(file=None)

Reads and stores the nets in the self.file file. Note, this is specific to a systre.arc file and may be subject to change in the future depending on the developments ODF makes on Systre.

Module contents