lacan.mutate
mutate.py — Atom-level mutation operations for LACAN.
This module provides fine-grained single-step mutations: adding, removing, or changing individual atoms, bonds, or ring sizes. These operations can be regarded as the exploitation step of the adaptive GA.
Each reaction in mutate_smarts targets a specific chemotype change.
The dictionary is compiled into mutate_ops (RDKit Reaction objects) at
import time.
Protection
Before running any reaction, apply_mutations() calls
reaction_touches_protected() to check whether the
reaction’s SMARTS template would match a protected atom. If so, the entire
reaction is skipped for that molecule.
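The skip rule can be illustrated with a minimal sketch. It assumes that both the reaction template and the protection pattern have already been matched against the molecule, and that each match is a tuple of atom indices. The function name and arguments below are hypothetical stand-ins, not the signature of reaction_touches_protected().

```python
def touches_protected(template_matches, protected_atoms):
    """Return True if any template match includes a protected atom index.

    template_matches: iterable of atom-index tuples (assumed input format).
    protected_atoms: iterable of protected atom indices (assumed input format).
    """
    protected = set(protected_atoms)
    # A single overlapping match is enough to skip the whole reaction.
    return any(protected.intersection(match) for match in template_matches)


matches = [(0, 1), (3, 4)]              # two places where the template matches
print(touches_protected(matches, {4}))  # True: second match hits atom 4
print(touches_protected(matches, {2}))  # False: atom 2 is in no match
```

Note that the check is all-or-nothing per reaction: one overlapping match excludes the reaction even if other matches lie entirely outside the protected region.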
Multiprocessing
Molecule atom/bond properties (including protection marks stored under the
_lp property) would normally be lost when RDKit Mol objects are pickled
for multiprocessing.Pool. apply_mutations_mols() works around this
by serialising mols to SDF block strings (which preserve properties) before
dispatching to worker processes.
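The serialise-then-dispatch pattern can be sketched without RDKit. In this toy version a molecule is a plain dict, JSON text stands in for the SDF block, and the worker simply appends a suffix to a string; all names (to_block, from_block, mutate_one, mutate_batch) are hypothetical and only the overall shape mirrors the description above.

```python
import json
import multiprocessing


def to_block(record):
    """Serialise a toy molecule record to text, properties included."""
    return json.dumps(record, sort_keys=True)


def from_block(block):
    """Rebuild the toy molecule record from its text form."""
    return json.loads(block)


def mutate_one(block):
    """Worker: takes text, returns text, so nothing is lost in transit."""
    record = from_block(block)
    # Toy "mutations": each variant keeps the parent's other properties.
    variants = [dict(record, smiles=record["smiles"] + s) for s in ("F", "Cl")]
    return [to_block(v) for v in variants]


def mutate_batch(records, n_jobs=-1):
    """Apply mutate_one to every record and return one flat list of results."""
    blocks = [to_block(r) for r in records]
    if n_jobs == 1:
        per_input = [mutate_one(b) for b in blocks]
    else:
        # processes=None lets Pool use every available CPU core.
        processes = None if n_jobs == -1 else n_jobs
        with multiprocessing.Pool(processes=processes) as pool:
            per_input = pool.map(mutate_one, blocks)
    # Flatten: all products from all inputs, in input order.
    return [from_block(b) for outs in per_input for b in outs]
```

Because only strings cross the process boundary, any annotation stored in the text form (here the `_lp` key of the toy dict) reaches the worker intact. On platforms that start workers by spawning, calls with `n_jobs != 1` belong under an `if __name__ == "__main__":` guard.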
- lacan.mutate.mutate_smarts = {'addBr': '[H1,H2,H3:0]>>[*:0][Br]', 'addC': '[H1,H2,H3:0]>>[*:0][CH3]', 'addCO': '[CH2:0]>>[*:0]=[O]', 'addCl': '[H1,H2,H3:0]>>[*:0][Cl]', 'addF': '[H1,H2,H3:0]>>[*:0][F]', 'addN': '[H1,H2,H3:0]>>[*:0][NH2]', 'addO': '[H1,H2,H3:0]>>[*:0][OH]', 'aroCtoN': '[cH:0]>>[nH0:0]', 'aroNtoC': '[nH0X2:0]>>[cH:0]', 'arofuse5': '[aH1:0]:[a:1]-[*:2][*:3]-[!R;H1,H2,H3:4]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]1', 'arofuse6na1': '[aH1:0]:[a:1]-[*:2][*:3][!a:4]-[!R;H1,H2,H3:5]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]~[*:5]1', 'arofuse6na2': '[aH1:0]:[a:1]-[!a:2][*:3][*:4]-[!R;H1,H2,H3:5]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]~[*:5]1', 'bond1to2': '[H1,H2,H3:0]-[H1,H2,H3:1]>>[*:0]=[*:1]', 'bond2to1': '[A:0]=[A:1]>>[*:0]-[*:1]', 'bond2to3': '[!R;H1,H2:0]=[!R;H1,H2:1]>>[*:0]#[*:1]', 'bond3to2': '[*:0]#[*:1]>>[*:0]=[*:1]', 'close3ring': '[!R;H1,H2,H3:0][*:1][!R;H1,H2,H3:2]>>[*:0]1~[*:1]~[*:2]1', 'close4ring': '[!R;H1,H2,H3:0][*:1][*:2][!R;H1,H2,H3:3]>>[*:0]1~[*:1]~[*:2]~[*:3]1', 'close5ring': '[!R;H1,H2,H3:0][*:1][*:2][!a:3][!R;H1,H2,H3:4]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]1', 'close6ring1': '[!R;H1,H2,H3:0][*:1][*:2][!a:3][!a:4][!R;H1,H2,H3:5]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]~[*:5]1', 'close6ring2': '[!R;H1,H2,H3:0][!a:1][*:2][*:3][!a:4][!R;H1,H2,H3:5]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]~[*:5]1', 'contractAroNH': '[ar6:0]:[cH,nH0;r6:1]:[cH,nH0;r6:2]:[ar6:3]>>[*:0]:[nH]:[*:3].[*:1][*:2]', 'contractAroO': '[ar6:0]:[cH,nH0;r6:1]:[cH,nH0;r6:2]:[ar6:3]>>[*:0]:[o]:[*:3].[*:1][*:2]', 'contractAroS': '[ar6:0]:[cH,nH0;r6:1]:[cH,nH0;r6:2]:[ar6:3]>>[*:0]:[s]:[*:3].[*:1][*:2]', 'deleteD1': '[!$([nX3]):0][d1:1]>>[*:0].[*:1]', 'deleteD2': '[*:0][d2;A:1][*:2]>>[*:0][*:2].[*:1]', 'expandAroCC': '[ar5:0]:[nH,o,s;r5:1]:[ar5:2]>>[*:0]:[cH]:[cH]:[*:2].[*:1]', 'expandAroCN': '[ar5:0]:[nH,o,s;r5:1]:[ar5:2]>>[*:0]:[cH]:[n]:[*:2].[*:1]', 'insertC': '[*:0]-[*:1]>>[*:0]-[CH2]-[*:1]', 'insertN': '[*:0]-[*:1]>>[*:0]-[NH]-[*:1]', 'insertO': '[*:0]-[*:1]>>[*:0]-[O]-[*:1]', 'insertS': 
'[*:0]-[*:1]>>[*:0]-[S]-[*:1]', 'openring': '[R:0]@!:[R:1]>>([*:0].[*:1])', 'replaceC': '[!C;A;d4,d3,d2,d1;v4,v3,v2,v1:0]>>[C:0]', 'replaceN': '[!N;!$([CH0]);d3,d2,d1;A:0]>>[N:0]', 'replaceO': '[!O;!$([CH0]);$([*](-[*])(-[*]));d2,d1;A:0]>>[O:0]', 'replaceS': '[!S;!$([CH0]);$([*](-[*])(-[*]));d2,d1;A:0]>>[S:0]'}
Dictionary of named reaction SMARTS for single-step atom-level mutations.
Covers: atom addition (C/O/N/F/Cl/Br), ring contraction/expansion (5↔6, heteroatom swaps), chain insertion (C/N/O/S), element replacement, aromatic C↔N swaps, ring opening/closure (3–6-membered), ring fusion, bond order changes (single/double/triple), and degree-1/2 atom deletion.
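As a reading aid, each value in the dictionary is a reaction SMARTS of the form `reactant-template>>product-template`. The sketch below splits a few entries, copied verbatim from the dictionary above, into their two sides using plain string handling; split_reaction is a hypothetical helper and no RDKit parsing is involved.

```python
# A few entries copied verbatim from mutate_smarts.
sample = {
    "addF": "[H1,H2,H3:0]>>[*:0][F]",
    "aroCtoN": "[cH:0]>>[nH0:0]",
    "insertO": "[*:0]-[*:1]>>[*:0]-[O]-[*:1]",
    "bond3to2": "[*:0]#[*:1]>>[*:0]=[*:1]",
}


def split_reaction(smarts):
    """Split a reaction SMARTS into (reactant template, product template)."""
    reactant, product = smarts.split(">>")
    return reactant, product


for name, smarts in sample.items():
    reactant, product = split_reaction(smarts)
    print(f"{name}: match {reactant}  then write {product}")
```

The left side describes what must be present in the molecule, and the right side describes what replaces it; the `:0`, `:1` labels tie atoms on one side to atoms on the other.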
- lacan.mutate.mutate_ops = {'addBr': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addCO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addCl': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addF': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addN': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'aroCtoN': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'aroNtoC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'arofuse5': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'arofuse6na1': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'arofuse6na2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'bond1to2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'bond2to1': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'bond2to3': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'bond3to2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close3ring': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close4ring': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close5ring': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close6ring1': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close6ring2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'contractAroNH': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'contractAroO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'contractAroS': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'deleteD1': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'deleteD2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'expandAroCC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'expandAroCN': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'insertC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'insertN': 
<rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'insertO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'insertS': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'openring': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'replaceC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'replaceN': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'replaceO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'replaceS': <rdkit.Chem.rdChemReactions.ChemicalReaction object>}
Pre-compiled RDKit Reaction objects for all mutations in
mutate_smarts.
- lacan.mutate.apply_mutations_mols(mols, p, score_threshold, n_jobs=-1)[source]
Apply all mutations to a list of molecules, optionally in parallel.
This is the batch version of
apply_mutations(). When n_jobs != 1 it uses multiprocessing.Pool and serialises mols via SDF blocks to preserve atom/bond properties.
- Parameters:
mols (list of RDKit Mol objects)
p (LACAN profile dict)
score_threshold (minimum LACAN score for output molecules)
n_jobs (number of parallel workers; -1 uses all CPU cores)
Returns a flat list of RDKit Mol objects (all products from all input mols).
- lacan.mutate.apply_mutations(mol, p, score_threshold, mode='all', protect_smarts=None)[source]
Apply single-step atom-level mutations to a molecule and return passing variants.
For each reaction in mutate_ops (or a randomly chosen one if mode="random"):
1. Skip the reaction if it would touch an atom matching protect_smarts (checked via reaction_touches_protected()).
2. Run the reaction on the molecule; sanitize each product.
3. Score each product with score_mol_ignoring_protected_bonds().
4. Keep products whose score exceeds score_threshold.
Deduplication is by InChIKey.
- Parameters:
mol (RDKit Mol (may have protected bonds))
p (LACAN profile dict)
score_threshold (minimum LACAN score; set 0.0 to accept all LACAN-passing products, 0.8 for stricter drug-likeness filtering)
mode ("all" (try every reaction) or "random" (one random reaction))
protect_smarts (SMARTS string; reactions touching any matching atom are skipped entirely. None = no exclusion (default).)
Returns a deduplicated list of RDKit Mol objects.
Note
Products of reactions do not inherit the parent’s bond protection marks. This is intentional for normal use (the score filter selects good products), but
mol_cleaner() bypasses this function in favour of _raw_mutations() to avoid discarding partial fixes.
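The control flow that apply_mutations() is described as following (skip, run, score, filter, deduplicate) can be sketched in plain Python. Everything here is a stand-in: molecules are toy strings rather than RDKit Mol objects, the operations are lambdas rather than compiled reactions, and the `key` callable plays the role of the InChIKey. The strict `>` comparison follows the wording "exceeds" and is an assumption about the real threshold test.

```python
import random


def apply_mutations_sketch(mol, ops, touches_protected, score, key,
                           score_threshold, mode="all", rng=None):
    """Toy control flow: skip, run, score, filter, deduplicate."""
    names = list(ops)
    if mode == "random":
        # Try a single randomly chosen operation instead of all of them.
        names = [(rng or random).choice(names)]
    kept = {}  # dedup key -> first product seen with that key
    for name in names:
        if touches_protected(name, mol):
            continue  # the whole operation is skipped for this molecule
        for product in ops[name](mol):
            if score(product) > score_threshold:
                kept.setdefault(key(product), product)
    return list(kept.values())


# Stand-ins: each op maps a toy string to a list of candidate strings.
ops = {
    "addF": lambda m: [m + "F"],
    "addCl": lambda m: [m + "Cl"],
    "addF_again": lambda m: [m + "F"],  # yields a duplicate on purpose
    "delete": lambda m: [m[:-1]],
}
blocked = {"delete"}  # pretend this op would touch a protected atom

result = apply_mutations_sketch(
    "CCO", ops,
    touches_protected=lambda name, m: name in blocked,
    score=lambda m: 0.9 if m.endswith("F") else 0.5,
    key=lambda m: m,
    score_threshold=0.8,
)
print(result)
```

Keeping the first product per key in a dict preserves the order in which variants were generated while discarding later duplicates.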