Submitted: 28 February 2026
Posted: 03 March 2026
Abstract
Keywords:
1. Introduction
Scope and Limitations
2. Methods
2.1. AI-Assisted Workflow
2.2. Custom SMILES Descriptor Calculator
2.3. Training Sets and Ensemble ML Models
2.4. Bioisosteric Modifications
2.5. External In Silico Validation
2.6. Iterative Lead Optimization
2.7. Property-Space MIC Estimation
3. Results
3.1. Model Performance and Screening
3.2. External Validation of the Initial Lead
3.3. Iterative Lead Optimization (Table 1)

3.4. Comprehensive Profile of M6-12
3.4.1. ML scoring
3.4.2. SwissADME
3.4.3. RDKit Cross-Validation (Table 2)
3.4.4. pkCSM ADMET
3.4.5. ChEMBL
3.5. Property-Space MIC Estimation
4. Discussion
4.1. The “Signaling First, Killing Later” Paradigm
4.2. Comparison with Existing AI-Driven Antibiotic Discovery Approaches
4.3. Contextualizing the pkCSM Hepatotoxicity Prediction
4.4. Scaffold Simplification and the Piperidine Pharmacophore
4.5. Why M6-12 Passes Both ML Models: A Structural Analysis
4.6. Cross-Platform Concordance
4.7. Interpretive Limitations of the Antibiotic-Likeness Model
5. Limitations
6. Conclusions
Author Contributions
Funding
Declaration of generative AI and AI-assisted technologies in the research process
Data Availability Statement
Conflicts of Interest
Abbreviations
Appendix A. ML Pipeline Code
#!/usr/bin/env python3
"""
ML Pipeline for “Signaling First, Killing Later” Antibiotic Discovery
Author: Maxwel Adriano Abegg
Environment: Python 3.12, scikit-learn 1.6, NumPy
"""
import re
import hashlib
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# ================================================================
# SECTION A.1: CUSTOM SMILES DESCRIPTOR CALCULATOR (170+ features)
# ================================================================
# A.1a. Functional Group Patterns (~40 features)
FUNCTIONAL_GROUP_PATTERNS = {
    'hydroxyl': r'[^O]O[^C=]|\(O\)',
    'methoxy': r'OC[^(=)]',
    'amine_primary': r'[^C]N[^(C+)]',
    'amine_secondary': r'CN[^(C+)]',
    'amine_tertiary': r'CN\(C\)C',
    'carboxyl': r'C\(=O\)O[^C]',
    'nitro': r'\[N\+\]\(=O\)\[O-\]',
    'nitrile': r'C#N',
    'fluorine': r'F',
    'chlorine': r'Cl',
    'bromine': r'Br',
    'iodine': r'I',
    'ester': r'C\(=O\)OC',
    'amide': r'C\(=O\)N',
    'sulfonamide': r'S\(=O\)\(=O\)N',
    'sulfone': r'S\(=O\)\(=O\)[^N]',
    'sulfoxide': r'S\(=O\)[^(=O)]',
    'ketone': r'CC\(=O\)C',
    'aldehyde': r'C\(=O\)[^CNOS]',
    'thiol': r'[^=]S[^(=)]',
    'ether': r'COC',
    'phenol': r'c.*O[^C=]',
    'aniline': r'c.*N[^(=)]',
    'phosphate': r'P\(=O\)',
    'lactam': r'C1.*C\(=O\)N.*1',
    'lactone': r'C1.*C\(=O\)O.*1',
    'urea': r'NC\(=O\)N',
    'carbamate': r'OC\(=O\)N',
    'guanidine': r'NC\(=N\)N',
    'imide': r'C\(=O\)NC\(=O\)',
    'hydrazine': r'NN',
    'azo': r'N=N',
    'isocyanate': r'N=C=O',
    'azide': r'\[N-\]=\[N\+\]=N',
    'epoxide': r'C1OC1',
    'peroxide': r'OO',
    'acyl_chloride': r'C\(=O\)Cl',
    'anhydride': r'C\(=O\)OC\(=O\)',
    'enol': r'C=CO',
    'vinyl': r'C=C[^=]',
}

# A.1e. QS-Relevant Pharmacophore Motifs (~15 features)
QS_MOTIF_PATTERNS = {
    'lactone_ring': r'C1.*C\(=O\)O.*1',
    'furanone': r'c1cc\(=O\)o1',
    'phenolic_OH': r'c.*O[^C]',
    'ahl_chain': r'CC\(=O\)N.*CCCC',
    'ahl_short': r'CC\(=O\)N.*CC',
    'quinolone_like': r'c1.*C\(=O\).*c.*N.*1',
    'thiolactone': r'C1.*C\(=O\)S.*1',
    'indole': r'c1ccc2\[nH\]ccc2c1',
    'catechol': r'c1cc\(O\)c\(O\)cc1',
    'diketopiperazine': r'C1NC\(=O\)CNC1=O',
    'hydroxamic_acid': r'C\(=O\)NO',
    'butyrolactone': r'C1CCC\(=O\)O1',
    'homoserine_like': r'OC.*C\(=O\)N',
    'fimbrolide_like': r'C=C1OC\(=O\)',
    'pqs_like': r'c1.*\[nH\].*c\(=O\).*1',
}

# Atomic weights for MW calculation
ATOMIC_WEIGHTS = {
    'C': 12.011, 'N': 14.007, 'O': 15.999, 'H': 1.008,
    'F': 18.998, 'Cl': 35.453, 'Br': 79.904, 'S': 32.065,
    'P': 30.974, 'I': 126.904, 'Na': 22.990, 'K': 39.098,
    'Si': 28.086,
}

# Wildman-Crippen LogP fragment contributions (simplified)
CRIPPEN_FRAGMENTS = {
    'C_aromatic': 0.1441,
    'C_aliphatic': 0.1441,
    'N_amine': -0.7567,
    'N_aromatic': -0.3239,
    'O_hydroxyl': -0.3567,
    'O_ether': -0.2893,
    'F': 0.4118,
    'Cl': 0.6895,
    'Br': 0.8813,
    'S': 0.6237,
    'H_polar': -0.2677,
    'H_nonpolar': 0.1230,
}

# TPSA fragment contributions (Ertl et al., 2000)
TPSA_FRAGMENTS = {
    'OH': 20.23,
    'NH': 26.02,
    'NH2': 26.02,
    'O_ether': 9.23,
    'O_carbonyl': 17.07,
    'N_amine': 26.02,
    'N_aromatic': 12.89,
    'N_amide': 25.46,
    'S_thiol': 25.30,
    'P_phosphate': 34.14,
}

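def parse_atoms_from_smiles(smiles):
    """Parse a SMILES string into a list of atom tokens."""
    # Two-letter symbols are matched against an explicit set so that adjacent
    # one-letter atoms (e.g. 'cn' in pyridine, 'Cn' in an N-methyl azole)
    # are not merged into a single token.
    two_letter = {'Cl', 'Br'}
    atoms = []
    i = 0
    while i < len(smiles):
        if smiles[i] == '[':
            # Bracket atom: keep everything up to the closing bracket
            j = smiles.index(']', i)
            atoms.append(smiles[i:j+1])
            i = j + 1
        elif smiles[i].isalpha():
            if smiles[i:i+2] in two_letter:
                atoms.append(smiles[i:i+2])
                i += 2
            else:
                atoms.append(smiles[i])
                i += 1
        else:
            # Bonds, branches, and ring-closure digits are skipped
            i += 1
    return atoms

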
def count_atoms(smiles):
    """Count heavy atoms from SMILES string."""
    counts = {'C': 0, 'N': 0, 'O': 0, 'F': 0, 'S': 0,
              'Cl': 0, 'Br': 0, 'P': 0, 'I': 0, 'Si': 0}
    atoms = parse_atoms_from_smiles(smiles)
    for atom in atoms:
        clean = atom.strip('[]+-0123456789@Hh')
        # capitalize() maps aromatic 'c', 'n', 'o', 's' to their element
        # symbols and leaves 'Cl', 'Br', 'Si' matchable against the keys
        # above (upper() would turn them into 'CL', 'BR', 'SI').
        symbol = clean.capitalize()
        if symbol in counts:
            counts[symbol] += 1
    return counts


def estimate_hydrogens(atom_counts, smiles):
    """Estimate hydrogen count using valence rules."""
    c, n, o, f, s, cl, br = (atom_counts.get(x, 0)
                             for x in ['C', 'N', 'O', 'F', 'S', 'Cl', 'Br'])
    double_bonds = smiles.count('=')
    triple_bonds = smiles.count('#')
    ring_closures = len(set(re.findall(r'(?<!%)([1-9])', smiles)))
    h = ((2 * c + 2) + n - f - cl - br
         - (2 * double_bonds) - (4 * triple_bonds) - (2 * ring_closures))
    return max(0, h)


def estimate_mw(atom_counts, h_count):
    """Estimate molecular weight from atom counts."""
    mw = sum(atom_counts.get(a, 0) * ATOMIC_WEIGHTS.get(a, 0)
             for a in atom_counts)
    mw += h_count * ATOMIC_WEIGHTS['H']
    return mw


def estimate_logp(atom_counts, smiles, h_count):
    """Estimate LogP via simplified Wildman-Crippen fragments."""
    logp = 0.0
    aromatic_c = sum(1 for ch in smiles if ch == 'c')
    aliphatic_c = atom_counts.get('C', 0) - aromatic_c
    logp += aromatic_c * CRIPPEN_FRAGMENTS['C_aromatic']
    logp += aliphatic_c * CRIPPEN_FRAGMENTS['C_aliphatic']
    logp += atom_counts.get('N', 0) * CRIPPEN_FRAGMENTS['N_amine']
    logp += atom_counts.get('O', 0) * CRIPPEN_FRAGMENTS['O_hydroxyl']
    logp += atom_counts.get('F', 0) * CRIPPEN_FRAGMENTS['F']
    logp += atom_counts.get('Cl', 0) * CRIPPEN_FRAGMENTS['Cl']
    logp += atom_counts.get('Br', 0) * CRIPPEN_FRAGMENTS['Br']
    logp += atom_counts.get('S', 0) * CRIPPEN_FRAGMENTS['S']
    polar_h = (len(re.findall(r'O[^=]', smiles))
               + len(re.findall(r'N[^=]', smiles)))
    nonpolar_h = max(0, h_count - polar_h)
    logp += polar_h * CRIPPEN_FRAGMENTS['H_polar']
    logp += nonpolar_h * CRIPPEN_FRAGMENTS['H_nonpolar']
    return logp


def estimate_tpsa(smiles):
    """Estimate TPSA from fragment contributions."""
    tpsa = 0.0
    tpsa += len(re.findall(r'[^O]O[^C=]|\(O\)', smiles)) * TPSA_FRAGMENTS['OH']
    tpsa += len(re.findall(r'[^C]N[^(C+)]', smiles)) * TPSA_FRAGMENTS['NH2']
    tpsa += len(re.findall(r'CN[^(C+)]', smiles)) * TPSA_FRAGMENTS['NH']
    tpsa += len(re.findall(r'COC', smiles)) * TPSA_FRAGMENTS['O_ether']
    tpsa += len(re.findall(r'C\(=O\)', smiles)) * TPSA_FRAGMENTS['O_carbonyl']
    tpsa += len(re.findall(r'S[^(=)]', smiles)) * TPSA_FRAGMENTS.get('S_thiol', 0)
    return tpsa


def count_hbd(smiles):
    """Count hydrogen bond donors (OH + NH groups)."""
    oh = len(re.findall(r'[^O]O[^C=]|\(O\)', smiles))
    nh = len(re.findall(r'N[^(=)]', smiles))
    return oh + nh


def count_hba(smiles):
    """Count hydrogen bond acceptors (N + O atoms)."""
    atoms = count_atoms(smiles)
    return atoms.get('N', 0) + atoms.get('O', 0)


def count_rotatable_bonds(smiles):
    """Estimate rotatable bonds from SMILES."""
    clean = re.sub(r'\[.*?\]', 'X', smiles)
    clean = re.sub(r'[0-9]', '', clean)
    singles = clean.count('-') + len(re.findall(r'[A-Za-z][A-Za-z]', clean))
    ring_bonds = len(set(re.findall(r'(?<!%)([1-9])', smiles)))
    double_bonds = smiles.count('=')
    triple_bonds = smiles.count('#')
    return max(0, singles - ring_bonds - double_bonds - triple_bonds)


def compute_ring_features(smiles):
    """Compute ring-related features."""
    features = []
    ring_digits = set(re.findall(r'(?<!%)([1-9])', smiles))
    total_rings = len(ring_digits)
    features.append(total_rings)
    aromatic_atoms = sum(1 for c in smiles
                         if c.islower() and c.isalpha() and c not in 'lrn')
    features.append(min(aromatic_atoms // 4, total_rings))  # est. aromatic rings
    features.append(max(0, total_rings - features[-1]))  # aliphatic rings
    hetero_in_ring = sum(1 for c in smiles if c in 'nos')
    features.append(hetero_in_ring)
    has_fused = 1 if total_rings > 1 and any(
        smiles.count(str(d)) > 2 for d in range(1, 10)) else 0
    features.append(has_fused)
    # Ring "size" is approximated by the string distance between the first
    # and last occurrence of each ring-closure digit.
    ring_sizes = []
    for d in ring_digits:
        positions = [i for i, c in enumerate(smiles) if c == d]
        if len(positions) >= 2:
            ring_sizes.append(positions[-1] - positions[0])
    features.append(min(ring_sizes) if ring_sizes else 0)
    features.append(max(ring_sizes) if ring_sizes else 0)
    features.append(np.mean(ring_sizes) if ring_sizes else 0)
    features.append(len(ring_sizes))
    features.append(1 if any(s <= 5 for s in ring_sizes) else 0)
    features.append(1 if any(s >= 8 for s in ring_sizes) else 0)
    features.append(1 if any(s == 6 for s in ring_sizes) else 0)
    return features  # 12 features


def compute_morgan_hash(smiles, nbits=128, radius=2):
    """Compute Morgan-like hash fingerprint from SMILES."""
    bits = [0] * nbits
    atoms = parse_atoms_from_smiles(smiles)
    for i, atom in enumerate(atoms):
        for r in range(radius + 1):
            # Environment = atom tokens within r positions in the SMILES string
            start = max(0, i - r)
            end = min(len(atoms), i + r + 1)
            env = ''.join(atoms[start:end])
            hash_val = int(hashlib.md5(env.encode()).hexdigest(), 16) % nbits
            bits[hash_val] = 1
    return bits


def compute_descriptors(smiles):
    """
    Compute 170+ molecular descriptors from a SMILES string.
    Returns a numpy array of features.
    """
    features = []

    # A.1a. Functional Group Counts (~40 features)
    for name, pattern in FUNCTIONAL_GROUP_PATTERNS.items():
        try:
            features.append(len(re.findall(pattern, smiles)))
        except re.error:
            features.append(0)

    # A.1b. Ring Analysis (~12 features)
    ring_feats = compute_ring_features(smiles)
    features.extend(ring_feats)

    # A.1c. Atom Counts (~15 features)
    atom_counts = count_atoms(smiles)
    for elem in ['C', 'N', 'O', 'F', 'S', 'Cl', 'Br', 'P', 'I']:
        features.append(atom_counts.get(elem, 0))
    heavy = sum(atom_counts.values())
    features.append(heavy)
    h_count = estimate_hydrogens(atom_counts, smiles)
    features.append(h_count)
    features.append(heavy + h_count)  # total atoms
    features.append(atom_counts.get('C', 0) / max(heavy, 1))  # carbon fraction
    features.append(atom_counts.get('N', 0) / max(heavy, 1))  # nitrogen fraction
    features.append(atom_counts.get('O', 0) / max(heavy, 1))  # oxygen fraction

    # A.1d. Topological Features (~10 features)
    mw = estimate_mw(atom_counts, h_count)
    logp = estimate_logp(atom_counts, smiles, h_count)
    tpsa = estimate_tpsa(smiles)
    hbd = count_hbd(smiles)
    hba = count_hba(smiles)
    rotbonds = count_rotatable_bonds(smiles)
    features.extend([mw, logp, tpsa, hbd, hba, rotbonds])
    features.append(mw / max(heavy, 1))  # MW per heavy atom
    features.append(hbd + hba)  # total H-bond capacity
    aromatic_c = sum(1 for c in smiles if c == 'c')
    sp3_c = max(0, atom_counts.get('C', 0) - aromatic_c)
    fsp3 = sp3_c / max(atom_counts.get('C', 0), 1)
    features.append(fsp3)  # Fraction Csp3
    features.append(len(smiles))  # SMILES string length

    # A.1e. QS-Relevant Pharmacophore Motifs (~15 features)
    for name, pattern in QS_MOTIF_PATTERNS.items():
        try:
            features.append(1 if re.search(pattern, smiles) else 0)
        except re.error:
            features.append(0)

    # A.1f. Morgan-like Hash Features (128 features)
    morgan_bits = compute_morgan_hash(smiles, nbits=128, radius=2)
    features.extend(morgan_bits)

    return np.array(features, dtype=np.float64)


# ================================================================
# SECTION A.2: TRAINING AND ENSEMBLE
# ================================================================
def build_ensemble():
    """Build four-model ensemble with balanced class weights."""
    models = {
        'RF-500': RandomForestClassifier(
            n_estimators=500, class_weight='balanced',
            random_state=42, n_jobs=-1
        ),
        'GBM-200': GradientBoostingClassifier(
            n_estimators=200, max_depth=4,
            learning_rate=0.05, random_state=42
        ),
        'SVM-RBF': Pipeline([
            ('scaler', StandardScaler()),
            ('svm', SVC(
                kernel='rbf', class_weight='balanced',
                probability=True, random_state=42
            ))
        ]),
        'LogReg': Pipeline([
            ('scaler', StandardScaler()),
            ('lr', LogisticRegression(
                class_weight='balanced', max_iter=1000, random_state=42
            ))
        ]),
    }
    return models


def cross_validate_ensemble(models, X, y, n_splits=5):
    """Run stratified cross-validation for each model."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    results = {}
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring='roc_auc')
        results[name] = {'mean': scores.mean(), 'std': scores.std(),
                         'folds': scores}
        print(f"{name}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
    return results


def train_ensemble(models, X, y):
    """Train all models on the full training set."""
    trained = {}
    for name, model in models.items():
        model.fit(X, y)
        trained[name] = model
    return trained


def predict_consensus(trained_models, X):
    """
    Consensus prediction: arithmetic mean of individual probabilities.
    Returns per-sample consensus P(positive class).
    """
    probs = []
    for name, model in trained_models.items():
        probs.append(model.predict_proba(X)[:, 1])
    return np.mean(probs, axis=0)


def predict_individual(trained_models, X):
    """Return dict of individual model predictions."""
    return {name: model.predict_proba(X)[:, 1]
            for name, model in trained_models.items()}


# ================================================================
# SECTION A.3: SCREENING AND TRIAGE
# ================================================================
def screen_compounds(smiles_list, trained_qs_models, trained_abx_models,
                     qs_threshold=0.5, abx_threshold=0.5):
    """
    Screen a list of SMILES against both QS and ABX ensembles.
    Returns tiered results with individual and consensus scores.
    """
    results = []
    for smi in smiles_list:
        desc = compute_descriptors(smi).reshape(1, -1)
        p_qs = predict_consensus(trained_qs_models, desc)[0]
        p_abx = predict_consensus(trained_abx_models, desc)[0]
        joint = p_qs * p_abx
        qs_indiv = predict_individual(trained_qs_models, desc)
        abx_indiv = predict_individual(trained_abx_models, desc)
        # Number of individual ABX models scoring above 0.5
        abx_over_05 = sum(1 for v in abx_indiv.values() if v[0] > 0.5)
        tier = 1 if p_qs >= qs_threshold else 2
        results.append({
            'smiles': smi,
            'P_QS': round(p_qs, 4),
            'P_Abx': round(p_abx, 4),
            'Joint': round(joint, 4),
            'ABX_over_05': f"{abx_over_05}/4",
            'Tier': tier,
            'QS_individual': {k: round(v[0], 4) for k, v in qs_indiv.items()},
            'ABX_individual': {k: round(v[0], 4) for k, v in abx_indiv.items()},
        })
    # Highest joint score first
    return sorted(results, key=lambda x: -x['Joint'])


# ================================================================
# SECTION A.4: EXAMPLE USAGE
# ================================================================
if __name__ == '__main__':
    # --- Example: QS training set structure ---
    # qs_positives: 87 known QS modulators (AHL analogs, furanone inhibitors,
    #               phenolic modulators, PQS quinolone analogs, dietary polyphenols)
    # qs_negatives: 63 non-QS compounds (aliphatics, cardiovascular/CNS drugs,
    #               amino acids, vitamins, herbicides)
    # X_qs = np.array([compute_descriptors(smi) for smi in all_qs_smiles])
    # y_qs = np.array([1]*87 + [0]*63)

    # --- Example: ABX training set structure ---
    # abx_positives: 44 marketed antibiotics (14+ structural classes)
    # abx_negatives: 49 non-antibiotic drugs
    # X_abx = np.array([compute_descriptors(smi) for smi in all_abx_smiles])
    # y_abx = np.array([1]*44 + [0]*49)

    # --- Build and validate ---
    # qs_models = build_ensemble()
    # qs_cv = cross_validate_ensemble(qs_models, X_qs, y_qs)
    # trained_qs = train_ensemble(qs_models, X_qs, y_qs)
    # abx_models = build_ensemble()
    # abx_cv = cross_validate_ensemble(abx_models, X_abx, y_abx)
    # trained_abx = train_ensemble(abx_models, X_abx, y_abx)

    # --- Screen M6-12 ---
    m6_12 = 'CNCc1c(F)cc(OC)c2c(OC)c3C(O)CNCC3c(O)c12'
    # results = screen_compounds([m6_12], trained_qs, trained_abx)
    # print(results)

    # --- Validate descriptor computation ---
    desc = compute_descriptors(m6_12)
    print(f"M6-12 descriptor vector: {len(desc)} features")
    print("Feature vector computed successfully.")

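The descriptor blocks of Section A.1 can be tallied to check the length of the vector that `compute_descriptors` returns and to locate each block inside it. The sketch below is not part of the author's pipeline. The block sizes are counted from the appendix listing (40 functional-group patterns, 12 ring features, 15 atom-count features, 10 topological features, 15 QS motifs, 128 hash bits), and the names `BLOCK_SIZES` and `block_slices` are hypothetical helpers introduced here for illustration.

```python
# Block sizes counted from the Appendix A listing (an assumption of this
# sketch; recount if the pattern dictionaries change).
BLOCK_SIZES = [
    ("functional_groups", 40),
    ("ring_features", 12),
    ("atom_counts", 15),
    ("topological", 10),
    ("qs_motifs", 15),
    ("morgan_hash", 128),
]


def block_slices(block_sizes):
    """Map each block name to its slice of the feature vector.

    Returns the mapping and the total vector length.
    """
    slices = {}
    start = 0
    for name, size in block_sizes:
        slices[name] = slice(start, start + size)
        start += size
    return slices, start


SLICES, TOTAL = block_slices(BLOCK_SIZES)

for name, sl in SLICES.items():
    print(f"{name:18s} columns {sl.start:3d} to {sl.stop - 1:3d}")
print(f"total features: {TOTAL}")
```

Under these counts the vector has 220 entries, consistent with the "170+" stated in the listing. A slice such as `SLICES["topological"]` could be used to pull the MW, LogP, and TPSA columns out of a descriptor matrix when inspecting feature importances.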

Table 1. Iterative lead optimization.

| Compound | Description | P(QS) | P(Abx) | MW | LogP | HBD | Lip. | PAINS | Brenk | SA | Issue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Parent | 6×OH | 0.969 | 0.813 | 274 | 1.60 | 6 | 1 | 0 | 2 | 1.44 | Catechol+HBD |
| Lead | 4×OAc | 0.849 | 0.876 | 442 | 2.72 | 2 | 0 | 0 | 2 | 2.66 | High TPSA |
| Mod 4c | CH₂NH₂+3OMe | 0.990 | 0.234 | 329 | 2.41 | 3 | 0 | 0 | 1† | 1.98 | PAH+AMES+ |
| Mod 5 | NHMe+sat.ring | 0.964 | 0.227 | ~305 | — | — | — | — | 0‡ | — | Low P_Abx |
| M6-12 | NHMe+F+pip.NH | 0.928 | 0.792 | 340 | 0.48* | 4 | 0 | 0 | 0 | 4.50 | ✓ Final |
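
The joint score used for ranking in `screen_compounds` (Appendix A.3) is the product of the two consensus probabilities, and the tier depends only on P(QS). As a worked check, the sketch below applies the same arithmetic to the P(QS) and P(Abx) values in the table above. The function name `triage` and the `abx_pass` flag are hypothetical additions for illustration; the original function accepts `abx_threshold` but does not use it.

```python
# (P_QS, P_Abx) consensus values copied from the lead-optimization table.
TABLE_1 = {
    "Parent": (0.969, 0.813),
    "Lead": (0.849, 0.876),
    "Mod 4c": (0.990, 0.234),
    "Mod 5": (0.964, 0.227),
    "M6-12": (0.928, 0.792),
}


def triage(scores, qs_threshold=0.5, abx_threshold=0.5):
    """Rank compounds by joint score = P_QS * P_Abx (highest first)."""
    rows = []
    for name, (p_qs, p_abx) in scores.items():
        rows.append({
            "compound": name,
            "joint": round(p_qs * p_abx, 4),
            "tier": 1 if p_qs >= qs_threshold else 2,
            # Hypothetical extra flag, not in the original function
            "abx_pass": p_abx >= abx_threshold,
        })
    return sorted(rows, key=lambda r: -r["joint"])


RANKED = triage(TABLE_1)
for row in RANKED:
    print(row)
```

All five compounds fall in tier 1 because every P(QS) exceeds 0.5. Mod 4c and Mod 5 drop to the bottom of the ranking because their low P(Abx) pulls the product down, which matches the "Low P_Abx" issue noted for Mod 5.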

Table 2. RDKit cross-validation of M6-12 descriptors.

| Property | Custom Desc. | SwissADME | RDKit | Concordance |
|---|---|---|---|---|
| MW (Da) | 340.4 | 340.39 | 340.3950 | ✔ |
| LogP | — | 0.63 (cons.) | 0.4847 (Crippen) | ≈¹ |
| TPSA (Ų) | — | 82.98 | 82.98 | ✔ |
| HBD | 4 | 4 | 4 | ✔ |
| HBA | 7 | 7 | 6² | ≈² |
| Fraction Csp3 | 0.65 | 0.65 | 0.6471 | ✔ |
| Rotatable bonds | 4 | 4 | 4 | ✔ |
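
As a worked check on the values above, the sketch below counts Lipinski rule-of-five violations (MW > 500 Da, LogP > 5, HBD > 5, HBA > 10) for M6-12, using the SwissADME column. The function name `lipinski_violations` is a hypothetical helper, and reading the "Lip." column of the lead-optimization table as a Lipinski violation count is an assumption made here, not something the tables state.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count rule-of-five violations: MW > 500, LogP > 5, HBD > 5, HBA > 10."""
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


# SwissADME values for M6-12 copied from the cross-validation table.
M6_12 = {"mw": 340.39, "logp": 0.63, "hbd": 4, "hba": 7}

N_VIOLATIONS = lipinski_violations(**M6_12)
print(f"M6-12 rule-of-five violations: {N_VIOLATIONS}")
```

The count is zero with either HBA value in the table (7 from SwissADME, 6 from RDKit), since both sit below the threshold of 10.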

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
