The Nanolix molecular generation engine.
From target constraints to ranked candidates. A computational pipeline built for medicinal chemistry teams who need candidates that can actually be made — not just candidates that score well in silico.
Target constraints in. Ranked candidates out.
Property prediction models run in parallel with the generative engine, evaluating each candidate across 11 ADMET dimensions during generation rather than after it completes. That co-optimization step is what separates useful candidate sets from structures that score well but cannot be synthesized.
11 ADMET properties. Predicted before synthesis.
Ensemble prediction with calibrated confidence
Binding affinity prediction combines docking-score rescoring with a graph neural network (GNN) ensemble trained on ChEMBL activity data. The GNN encoder learns directly from molecular topology rather than from precomputed fingerprints, which gives better performance on novel scaffold classes.
ADMET models use a multi-task architecture that learns property correlations jointly. This matters: a model that learns solubility and permeability together captures the physical relationship between them. Training data spans ~2.1M compounds from ChEMBL plus data from proprietary assay partnerships.
Every prediction includes a calibrated confidence interval. We report uncertainty honestly — wide intervals tell you where to focus the first synthesis round rather than falsely narrowing the decision space.
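To make the idea of an ensemble prediction with an interval concrete, here is a minimal sketch. It assumes the interval is derived from the spread of the ensemble members, which is one common approach and not necessarily the Nanolix method. The function `ensemble_interval`, the `calibration_scale` parameter, and the member values are all hypothetical and for illustration only.

```python
from statistics import mean, stdev


def ensemble_interval(predictions, z=1.96, calibration_scale=1.0):
    """Return (center, half_width) for a list of ensemble member predictions.

    calibration_scale is a hypothetical factor that would be fitted on
    held-out data so that nominal coverage matches observed coverage.
    """
    if len(predictions) < 2:
        raise ValueError("need at least two ensemble members")
    center = mean(predictions)
    half_width = z * calibration_scale * stdev(predictions)
    return center, half_width


# Illustrative values only, not real model output.
members = [8.1, 8.4, 8.6, 8.3, 8.7]
center, half_width = ensemble_interval(members)
print(f"pIC50 = {center:.2f} ± {half_width:.2f}")
```

When the members disagree more, `stdev` grows and the reported interval widens, which is the behavior the paragraph above describes.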
Full methodology details
We navigate, not enumerate
The generative engine explores a learned latent chemical space by gradient-guided sampling. It does not enumerate: rather than cataloging exhaustive libraries, it navigates toward multi-property optima. This is the key difference from virtual screening: sampling follows the gradient of the property landscape instead of covering whatever regions are convenient to enumerate.
A variational autoencoder maps molecules to a continuous latent space. Navigation in that space is guided by multi-property gradients — the engine moves toward regions expected to satisfy all your constraints simultaneously, not just binding affinity.
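The navigation step can be pictured with a toy example. The sketch below assumes a 2-D latent space and replaces the learned property predictors with simple quadratic surrogates, each with its own optimum. The names `multi_property_gradient` and `navigate` are hypothetical, and decoding the final latent point back to a molecule is not shown.

```python
def multi_property_gradient(z, optima, weights):
    """Gradient of sum_k -w_k * ||z - c_k||^2 with respect to z."""
    grad = [0.0] * len(z)
    for center, w in zip(optima, weights):
        for i, (zi, ci) in enumerate(zip(z, center)):
            grad[i] += -2.0 * w * (zi - ci)
    return grad


def navigate(z0, optima, weights, step=0.1, n_steps=200):
    """Plain gradient ascent from z0 toward the joint optimum."""
    z = list(z0)
    for _ in range(n_steps):
        g = multi_property_gradient(z, optima, weights)
        z = [zi + step * gi for zi, gi in zip(z, g)]
    return z


# Two toy "properties" whose individual optima sit at different latent points.
optima = [(1.0, 0.0), (0.0, 1.0)]
weights = [1.0, 1.0]
z_final = navigate((3.0, -2.0), optima, weights)
print(z_final)
```

With equal weights the search settles between the two individual optima rather than at either one, which is the point of following a combined gradient instead of optimizing a single property.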
Synthetic accessibility is a first-class constraint during navigation — not a post-generation filter. Candidates are scored on SA throughout the sampling process, so the output skews toward structures your CRO can actually quote.
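The difference between constraining during the search and filtering afterward can be shown in a few lines. This is a sketch under stated assumptions: a 1-D latent coordinate, toy `potency` and `sa_score` functions, a hypothetical penalty form (`sa_weight`, `sa_limit`), and a simple greedy stochastic search standing in for the real sampler.

```python
import random


def potency(z):
    """Toy stand-in for a predicted pIC50 surface; peaks at z = 2."""
    return 9.0 - (z - 2.0) ** 2


def sa_score(z):
    """Toy stand-in for synthetic accessibility; rises (gets harder) with z."""
    return 1.0 + 2.0 * z


def objective(z, sa_weight=0.5, sa_limit=3.0):
    """Potency minus a penalty applied whenever SA exceeds sa_limit."""
    return potency(z) - sa_weight * max(0.0, sa_score(z) - sa_limit)


def sample(score, z0=0.0, n_steps=2000, sigma=0.1, seed=0):
    """Greedy stochastic search: keep a proposal only if it scores higher."""
    rng = random.Random(seed)
    z = z0
    for _ in range(n_steps):
        proposal = z + rng.gauss(0.0, sigma)
        if score(proposal) > score(z):
            z = proposal
    return z


z_constrained = sample(objective)   # SA penalty active during the search
z_unconstrained = sample(potency)   # potency only, SA ignored
print(sa_score(z_constrained), sa_score(z_unconstrained))
```

In this toy setup the constrained search stops short of the potency peak and lands on a more accessible point, whereas a post-generation filter could only accept or reject whatever the unconstrained search had already produced.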
What you receive at delivery
| rank | smiles | pIC50 | sol_uM | hERG | SA_score | CI_width |
|---|---|---|---|---|---|---|
| 001 | CC1CC(N…)C(=O) | 8.42 | 142 | 0.08 | 2.1 | ±0.31 |
| 002 | COc1cc(C…)nc2 | 8.19 | 89 | 0.12 | 2.4 | ±0.28 |
| 003 | FC(F)(F)c1cccc | 8.11 | 204 | 0.06 | 1.9 | ±0.42 |
| 004 | O=C(Nc1cccc)n2 | 7.98 | 61 | 0.21 | 2.7 | ±0.35 |
| 005 | Cc1ccc(F)cc1NC | 7.84 | 317 | 0.05 | 1.7 | ±0.29 |
pIC50 = predicted binding affinity; sol_uM = predicted aqueous solubility (µM); hERG = predicted probability of hERG inhibition; SA_score = synthetic accessibility (lower = more accessible); CI_width = prediction confidence interval, shown as ± around the predicted value
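A ranked table like this is easy to shortlist programmatically. The sketch below parses the sample rows shown above (SMILES strings are left truncated exactly as displayed) and keeps candidates under two thresholds. The helper names `parse_table` and `shortlist` and the cutoffs `max_herg` and `max_ci` are hypothetical; your own triage criteria will differ.

```python
TABLE = """\
| rank | smiles | pIC50 | sol_uM | hERG | SA_score | CI_width |
|---|---|---|---|---|---|---|
| 001 | CC1CC(N…)C(=O) | 8.42 | 142 | 0.08 | 2.1 | ±0.31 |
| 002 | COc1cc(C…)nc2 | 8.19 | 89 | 0.12 | 2.4 | ±0.28 |
| 003 | FC(F)(F)c1cccc | 8.11 | 204 | 0.06 | 1.9 | ±0.42 |
| 004 | O=C(Nc1cccc)n2 | 7.98 | 61 | 0.21 | 2.7 | ±0.35 |
| 005 | Cc1ccc(F)cc1NC | 7.84 | 317 | 0.05 | 1.7 | ±0.29 |
"""


def parse_table(text):
    """Turn a pipe-delimited table into a list of dicts keyed by header."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header row and the separator row
        cells = [cell.strip() for cell in ln.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def shortlist(rows, max_herg=0.10, max_ci=0.35):
    """Keep low-hERG, narrow-interval candidates, best predicted pIC50 first."""
    keep = [
        r for r in rows
        if float(r["hERG"]) <= max_herg
        and float(r["CI_width"].lstrip("±")) <= max_ci
    ]
    return sorted(keep, key=lambda r: float(r["pIC50"]), reverse=True)


picks = shortlist(parse_table(TABLE))
print([r["rank"] for r in picks])
```

Candidates dropped only for a wide interval are not necessarily poor; as noted earlier, a wide interval can mark where a first synthesis round would be most informative.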
Fits into Maestro, LIMS, and CRO handoff without reformatting
All output SDF files are formatted for direct import into Maestro for visualization, docking validation, and further analysis. No conversion step required.
Program-tier engagements include REST API delivery. The output schema is configurable to match your LIMS ingest format, with a JSON or CSV endpoint per your platform requirements.
Synthesis routes are pre-checked for reagent availability against Enamine REAL Space and the WuXi AppTec standard catalog. The result: candidates your CRO can quote without custom reagent sourcing.
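As an illustration of what a configurable output schema can mean in practice, the sketch below renames result fields to match a target system and serializes them as JSON or CSV. It does not describe the actual Nanolix API: the field map, the target field names, and the helpers `remap`, `to_json`, and `to_csv` are all hypothetical. The record values are taken from the sample table above.

```python
import csv
import io
import json

# Hypothetical mapping from result columns to a LIMS's field names.
FIELD_MAP = {
    "rank": "candidate_rank",
    "pIC50": "predicted_pic50",
    "hERG": "herg_probability",
}

records = [
    {"rank": "001", "pIC50": 8.42, "hERG": 0.08},
    {"rank": "002", "pIC50": 8.19, "hERG": 0.12},
]


def remap(rows, field_map):
    """Rename and select fields according to field_map (old name -> new name)."""
    return [{new: row[old] for old, new in field_map.items()} for row in rows]


def to_json(rows):
    """Serialize rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


mapped = remap(records, FIELD_MAP)
json_text = to_json(mapped)
csv_text = to_csv(mapped)
print(csv_text)
```

Keeping the mapping in a single dictionary means the same result set can feed different ingest formats by swapping the map rather than rewriting the export code.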
Run a sample against your target.
Give us your target constraints. We'll return a sample candidate set with predicted properties within 5 business days, before any contract. We start with a 30-minute call to align on parameters, then run the generation.
Request a Target Briefing