Geometry and Topology¶
CalcFlow provides a layered geometry API: a simple Geometry for XYZ data, a Trajectory for multi-frame sequences, and an AnnotatedGeometry that co-locates atom coordinates with calculation results.
Geometry¶
Geometry is an immutable collection of atoms with a comment line — the canonical representation of a molecular structure.
Loading from file¶
from calcflow import Geometry
geom = Geometry.from_xyz_file("molecule.xyz")
print(geom.num_atoms) # int
print(geom.unique_elements) # set[str] — e.g. {"C", "H", "O"}
Loading from strings¶
lines = [
"O 0.000000 0.000000 0.119748",
"H 0.000000 0.756950 -0.478993",
"H 0.000000 -0.756950 -0.478993",
]
geom = Geometry.from_lines(lines, source="water")
Accessing atoms¶
Atom fields: symbol (str), x, y, z (float).
Exporting¶
xyz_str = geom.to_xyz_str()
geom.to_xyz_file("output.xyz")
# Just the atom coordinate block (no count line or comment)
block = geom.get_coordinate_block()
Energy from comment line¶
ORCA writes the energy into the comment line of optimized geometry XYZ files. CalcFlow parses this automatically:
Trajectory¶
Trajectory wraps a sequence of Geometry frames from a multi-frame XYZ file (e.g. an optimization trajectory).
from calcflow import Trajectory
traj = Trajectory.from_xyz_file("optimization.xyz")
print(len(traj)) # number of frames
first = traj[0] # Geometry
last = traj[-1] # Geometry
for frame in traj:
print(frame.energy) # energy per frame if present in comment line
All frames in a Trajectory must have the same number and order of atoms — this is validated on construction.
AnnotatedGeometry¶
AnnotatedGeometry is a read-only view that pairs a Geometry with a CalculationResult. It gives atom-centric access to charges, spin densities, and excited-state populations without duplicating data.
from pathlib import Path
from calcflow import parse_qchem_output, AnnotatedGeometry
result = parse_qchem_output(Path("calculation.out").read_text())
ag = AnnotatedGeometry.from_result(result)
Per-atom ground-state properties¶
for i, atom in enumerate(ag):
print(f"Atom {i} ({atom.symbol}):")
print(f" Mulliken charge: {atom.charges.get('Mulliken'):+.4f}")
print(f" Hirshfeld charge: {atom.charges.get('Hirshfeld'):+.4f}")
if atom.spins:
print(f" Mulliken spin: {atom.spins.get('Mulliken'):+.4f}")
atom.charges and atom.spins are dicts keyed by method name ("Mulliken", "Hirshfeld", "CM5", "Loewdin").
Using the input geometry¶
By default AnnotatedGeometry.from_result() uses the final (optimized) geometry. To use the input geometry instead:
Excited-state views¶
For TDDFT results, you can get a per-state view that exposes hole/electron populations and transition charges per atom:
sv = ag.get_transition_state(state_number=1) # AnnotatedStateView | None
if sv:
for i, atom in enumerate(sv):
print(f"Atom {i} ({atom.symbol}):")
print(f" hole population: {atom.hole_population:.4f}")
print(f" electron population: {atom.electron_population:.4f}")
print(f" transition charge: {atom.transition_charge:.4f}")
For unrelaxed difference density matrix data:
sv = ag.get_unrelaxed_state(state_number=1)
if sv:
for i, atom in enumerate(sv):
print(f"Atom {i}: Δq = {atom.delta_charge:.4f}")
For ADC states:
AnnotatedTrajectory¶
AnnotatedTrajectory is a sequence of AnnotatedGeometry frames — useful for tracking how charges or populations evolve along a geometry optimization when you have a separate CalculationResult per step.
from calcflow import AnnotatedTrajectory
# results: list[CalculationResult], one per optimization step
at = AnnotatedTrajectory.from_results(results)
for frame in at:
print(frame[0].charges.get("Mulliken")) # Mulliken charge on atom 0 at this step
Topology¶
CalcFlow includes bond detection, molecule identification, and aromaticity analysis based on covalent radii from Pyykko (2009).
from calcflow import build_bond_graph, detect_molecules, classify_all_bonds, find_aromatic_atoms
atoms = geom.atoms
Bond graph¶
bond_graph = build_bond_graph(atoms, tolerance=0.4)
# bond_graph[i] = list of atom indices bonded to atom i
Molecule detection¶
Useful for multi-component systems (e.g. solute + counter-ions):
molecules = detect_molecules(atoms, tolerance=0.4)
# molecules = list of frozensets, each containing the atom indices of one molecule
for i, mol in enumerate(molecules):
symbols = [atoms[j].symbol for j in sorted(mol)]
print(f"Fragment {i}: {''.join(symbols)}")
Bond classification¶
bonds = classify_all_bonds(atoms, tolerance=0.4)
# bonds = dict mapping (i, j) tuple -> "single" | "double" | "triple" | "aromatic"
for (i, j), order in bonds.items():
print(f"Atoms {i}-{j}: {order}")
Aromatic atoms¶
aromatic = find_aromatic_atoms(atoms, bond_graph)
# aromatic = set of atom indices participating in aromatic rings
Element Data¶
CalcFlow bundles a periodic table (all 118 elements) with atomic masses, van der Waals radii, and Pyykko covalent radii. The topology functions use this internally, but you can access it directly:
from calcflow.constants.ptable import ELEMENT_DATA
carbon = ELEMENT_DATA["C"]
carbon.atomic_number # 6
carbon.atomic_mass # 12.011
carbon.atomic_radius_vdw_pm # van der Waals radius in pm
carbon.covalent_radius_single_pm # single-bond covalent radius (Pyykko 2009)
carbon.covalent_radius_double_pm
carbon.covalent_radius_triple_pm