lacan.mutate

mutate.py — Atom-level mutation operations for LACAN.

This module provides fine-grained single-step mutations: adding, removing, or changing individual atoms, bonds, or ring sizes. These operations serve as the exploitation step in the adaptive genetic algorithm (GA).

Each reaction in mutate_smarts targets a specific chemotype change. The dictionary is compiled into mutate_ops (RDKit Reaction objects) at import time.

Protection

Before running any reaction, apply_mutations() calls reaction_touches_protected() to check whether the reaction’s SMARTS template would match a protected atom. If so, the entire reaction is skipped for that molecule.
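The protection check described above amounts to an overlap test between the atoms matched by a reaction template and the set of protected atoms. The following is a minimal sketch of that idea only; the helper name touches_protected and its signature are hypothetical and are not the signature of reaction_touches_protected(), which is not shown here. It assumes matches are tuples of atom indices, such as those RDKit's Mol.GetSubstructMatches() returns.

```python
from typing import Iterable, Sequence, Set


def touches_protected(matches: Iterable[Sequence[int]], protected: Set[int]) -> bool:
    """Return True if any matched atom index belongs to the protected set.

    Hypothetical helper: `matches` stands in for the atom-index tuples
    produced by matching a reaction's reactant template against a molecule.
    """
    return any(idx in protected for match in matches for idx in match)


# Hypothetical data: two template matches on a small molecule.
matches = [(0, 1), (2, 3)]
print(touches_protected(matches, {3}))  # True: atom 3 is matched and protected
print(touches_protected(matches, {7}))  # False: atom 7 is never matched
```

Because a single overlapping atom is enough to return True, this mirrors the all-or-nothing behaviour in which the whole reaction is skipped for that molecule.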

Multiprocessing

Molecule atom/bond properties (including protection marks stored under the _lp property) would normally be lost when RDKit Mol objects are pickled for multiprocessing.Pool. apply_mutations_mols() works around this by serialising mols to SDF block strings (which preserve properties) before dispatching to worker processes.

lacan.mutate.mutate_smarts = {'addBr': '[H1,H2,H3:0]>>[*:0][Br]', 'addC': '[H1,H2,H3:0]>>[*:0][CH3]', 'addCO': '[CH2:0]>>[*:0]=[O]', 'addCl': '[H1,H2,H3:0]>>[*:0][Cl]', 'addF': '[H1,H2,H3:0]>>[*:0][F]', 'addN': '[H1,H2,H3:0]>>[*:0][NH2]', 'addO': '[H1,H2,H3:0]>>[*:0][OH]', 'aroCtoN': '[cH:0]>>[nH0:0]', 'aroNtoC': '[nH0X2:0]>>[cH:0]', 'arofuse5': '[aH1:0]:[a:1]-[*:2][*:3]-[!R;H1,H2,H3:4]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]1', 'arofuse6na1': '[aH1:0]:[a:1]-[*:2][*:3][!a:4]-[!R;H1,H2,H3:5]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]~[*:5]1', 'arofuse6na2': '[aH1:0]:[a:1]-[!a:2][*:3][*:4]-[!R;H1,H2,H3:5]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]~[*:5]1', 'bond1to2': '[H1,H2,H3:0]-[H1,H2,H3:1]>>[*:0]=[*:1]', 'bond2to1': '[A:0]=[A:1]>>[*:0]-[*:1]', 'bond2to3': '[!R;H1,H2:0]=[!R;H1,H2:1]>>[*:0]#[*:1]', 'bond3to2': '[*:0]#[*:1]>>[*:0]=[*:1]', 'close3ring': '[!R;H1,H2,H3:0][*:1][!R;H1,H2,H3:2]>>[*:0]1~[*:1]~[*:2]1', 'close4ring': '[!R;H1,H2,H3:0][*:1][*:2][!R;H1,H2,H3:3]>>[*:0]1~[*:1]~[*:2]~[*:3]1', 'close5ring': '[!R;H1,H2,H3:0][*:1][*:2][!a:3][!R;H1,H2,H3:4]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]1', 'close6ring1': '[!R;H1,H2,H3:0][*:1][*:2][!a:3][!a:4][!R;H1,H2,H3:5]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]~[*:5]1', 'close6ring2': '[!R;H1,H2,H3:0][!a:1][*:2][*:3][!a:4][!R;H1,H2,H3:5]>>[*:0]1~[*:1]~[*:2]~[*:3]~[*:4]~[*:5]1', 'contractAroNH': '[ar6:0]:[cH,nH0;r6:1]:[cH,nH0;r6:2]:[ar6:3]>>[*:0]:[nH]:[*:3].[*:1][*:2]', 'contractAroO': '[ar6:0]:[cH,nH0;r6:1]:[cH,nH0;r6:2]:[ar6:3]>>[*:0]:[o]:[*:3].[*:1][*:2]', 'contractAroS': '[ar6:0]:[cH,nH0;r6:1]:[cH,nH0;r6:2]:[ar6:3]>>[*:0]:[s]:[*:3].[*:1][*:2]', 'deleteD1': '[!$([nX3]):0][d1:1]>>[*:0].[*:1]', 'deleteD2': '[*:0][d2;A:1][*:2]>>[*:0][*:2].[*:1]', 'expandAroCC': '[ar5:0]:[nH,o,s;r5:1]:[ar5:2]>>[*:0]:[cH]:[cH]:[*:2].[*:1]', 'expandAroCN': '[ar5:0]:[nH,o,s;r5:1]:[ar5:2]>>[*:0]:[cH]:[n]:[*:2].[*:1]', 'insertC': '[*:0]-[*:1]>>[*:0]-[CH2]-[*:1]', 'insertN': '[*:0]-[*:1]>>[*:0]-[NH]-[*:1]', 'insertO': '[*:0]-[*:1]>>[*:0]-[O]-[*:1]', 'insertS': '[*:0]-[*:1]>>[*:0]-[S]-[*:1]', 
'openring': '[R:0]@!:[R:1]>>([*:0].[*:1])', 'replaceC': '[!C;A;d4,d3,d2,d1;v4,v3,v2,v1:0]>>[C:0]', 'replaceN': '[!N;!$([CH0]);d3,d2,d1;A:0]>>[N:0]', 'replaceO': '[!O;!$([CH0]);$([*](-[*])(-[*]));d2,d1;A:0]>>[O:0]', 'replaceS': '[!S;!$([CH0]);$([*](-[*])(-[*]));d2,d1;A:0]>>[S:0]'}

Dictionary of named reaction SMARTS for single-step atom-level mutations.

Covers: atom addition (C/O/N/F/Cl/Br), ring contraction/expansion (5↔6, heteroatom swaps), chain insertion (C/N/O/S), element replacement, aromatic C↔N swaps, ring opening/closure (3–6-membered), ring fusion, bond order changes (single/double/triple), and degree-1/2 atom deletion.

lacan.mutate.mutate_ops = {'addBr': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addCO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addCl': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addF': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addN': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'addO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'aroCtoN': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'aroNtoC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'arofuse5': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'arofuse6na1': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'arofuse6na2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'bond1to2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'bond2to1': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'bond2to3': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'bond3to2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close3ring': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close4ring': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close5ring': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close6ring1': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'close6ring2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'contractAroNH': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'contractAroO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'contractAroS': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'deleteD1': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'deleteD2': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'expandAroCC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'expandAroCN': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'insertC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'insertN': 
<rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'insertO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'insertS': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'openring': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'replaceC': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'replaceN': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'replaceO': <rdkit.Chem.rdChemReactions.ChemicalReaction object>, 'replaceS': <rdkit.Chem.rdChemReactions.ChemicalReaction object>}

Pre-compiled RDKit Reaction objects for all mutations in mutate_smarts.

lacan.mutate.apply_mutations_mols(mols, p, score_threshold, n_jobs=-1)[source]

Apply all mutations to a list of molecules, optionally in parallel.

This is the batch version of apply_mutations(). When n_jobs != 1 it uses multiprocessing.Pool and serialises mols via SDF blocks to preserve atom/bond properties.

Parameters:
  • mols (list of RDKit Mol objects)

  • p (LACAN profile dict)

  • score_threshold (minimum LACAN score for output molecules)

  • n_jobs (number of parallel workers; -1 uses all CPU cores)

Returns:
  A flat list of RDKit Mol objects (all products from all input mols).
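The dispatch pattern behind this function can be sketched with the standard library alone. In the sketch below, plain strings stand in for SDF block strings, and run_batch, resolve_n_jobs, and toy_worker are hypothetical names chosen for illustration, not part of lacan.mutate. It assumes each worker call returns a list of products, which are then flattened into one list.

```python
import multiprocessing as mp
from itertools import chain
from typing import Callable, List, Sequence


def resolve_n_jobs(n_jobs: int) -> int:
    """Map the n_jobs convention to a worker count: -1 means all CPU cores."""
    return mp.cpu_count() if n_jobs == -1 else max(1, n_jobs)


def run_batch(
    worker: Callable[[str], List[str]],
    blocks: Sequence[str],
    n_jobs: int = -1,
) -> List[str]:
    """Apply `worker` to every serialised molecule and flatten the results."""
    if n_jobs == 1:
        # Serial path: no pool and no pickling involved.
        per_input = [worker(block) for block in blocks]
    else:
        # Parallel path: only strings cross the process boundary.
        with mp.Pool(resolve_n_jobs(n_jobs)) as pool:
            per_input = pool.map(worker, blocks)
    return list(chain.from_iterable(per_input))


def toy_worker(block: str) -> List[str]:
    """Hypothetical stand-in for 'deserialise, mutate, reserialise'."""
    return [block + "-a", block + "-b"]


if __name__ == "__main__":
    # n_jobs=1 keeps the demo in-process; other values go through the pool.
    print(run_batch(toy_worker, ["mol1", "mol2"], n_jobs=1))
```

Passing strings rather than Mol objects to the pool is the point of the design: the string form is what carries the properties across the process boundary, and the worker rebuilds the molecule on the other side.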

lacan.mutate.apply_mutations(mol, p, score_threshold, mode='all', protect_smarts=None)[source]

Apply single-step atom-level mutations to a molecule and return passing variants.

For each reaction in mutate_ops (or a randomly chosen one if mode="random"):

  1. Skip the reaction if it would touch an atom matching protect_smarts (checked via reaction_touches_protected()).

  2. Run the reaction on the molecule; sanitize each product.

  3. Score each product with score_mol_ignoring_protected_bonds().

  4. Keep products whose score exceeds score_threshold.

Deduplication is by InChIKey.
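The control flow of the steps above can be sketched independently of RDKit. The function below is an illustration under stated assumptions, not the library's implementation: mutate_and_filter and its parameters ops, score, key, and is_blocked are hypothetical, each op is assumed to return already-sanitised products, and key stands in for the InChIKey computation. Toy strings play the role of molecules.

```python
import random
from typing import Callable, Dict, Hashable, Iterable, List, TypeVar

M = TypeVar("M")


def mutate_and_filter(
    mol: M,
    ops: Dict[str, Callable[[M], Iterable[M]]],
    score: Callable[[M], float],
    key: Callable[[M], Hashable],
    score_threshold: float,
    mode: str = "all",
    is_blocked: Callable[[str, M], bool] = lambda name, mol: False,
) -> List[M]:
    """Run the selected ops, keep products scoring above the threshold, dedupe by key."""
    if mode == "all":
        names = list(ops)
    elif mode == "random":
        names = [random.choice(list(ops))]
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    seen = set()
    kept: List[M] = []
    for name in names:
        if is_blocked(name, mol):  # step 1: skip reactions touching protected atoms
            continue
        for product in ops[name](mol):  # step 2: run the reaction
            if score(product) <= score_threshold:  # steps 3-4: score and filter
                continue
            k = key(product)
            if k in seen:  # deduplicate (InChIKey in the real function)
                continue
            seen.add(k)
            kept.append(product)
    return kept


# Toy stand-ins: strings as molecules, suffixing as "reactions".
toy_ops = {
    "addF": lambda s: [s + "F"],
    "addCl": lambda s: [s + "Cl"],
    "addF_again": lambda s: [s + "F"],  # yields a duplicate product
}
toy_score = lambda s: 0.9 if s.endswith("F") else 0.1

print(mutate_and_filter("CC", toy_ops, toy_score, key=str, score_threshold=0.5))  # ['CCF']
```

Here the chlorinated product is dropped by the score filter and the second fluorinated product is dropped as a duplicate, so only one variant survives.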

Parameters:
  • mol (RDKit Mol (may have protected bonds))

  • p (LACAN profile dict)

  • score_threshold (minimum LACAN score; set 0.0 to accept all LACAN-passing products, 0.8 for stricter drug-likeness filtering)

  • mode ("all" to try every reaction, or "random" to try one random reaction)

  • protect_smarts (SMARTS string; reactions touching any matching atom are skipped entirely; None, the default, disables exclusion)

Returns:
  A deduplicated list of RDKit Mol objects.

Note

Products of reactions do not inherit the parent’s bond protection marks. This is intentional for normal use (the score filter selects good products), but mol_cleaner() bypasses this function in favour of _raw_mutations() to avoid discarding partial fixes.