Analysis that reasons about the science, then defends the result
Cortex reasons about your data, chooses methods that fit its biological and statistical structure, and assembles an executable plan. It runs in your environment, flags assumption violations as they emerge, and hands back every decision, parameter, and intermediate output that produced the result.
Natural-language plans compile to executable DAGs
You describe the question. Cortex reasons about which methods fit the data, then shows you exactly what it proposes to run before a single byte is processed.
Tell Cortex what you want to investigate. It inspects the data, reviews current best practice for your modality, and proposes an analysis plan as a versioned DAG: tools, parameters, checkpoints, and dependencies, each chosen because it fits your study design. Edit any node, swap methods, inject your own code. Approve the plan and Cortex runs it in your compute environment, streaming intermediate outputs and logs as each step completes.
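As a rough illustration of what a versioned plan DAG can look like, here is a minimal sketch using only the Python standard library. The `Node` fields, the step names, the tool names, and the hash-based versioning scheme are all assumptions made for this example, not Cortex's actual plan format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Node:
    """One step in the plan. Field names are illustrative."""
    tool: str
    params: dict = field(default_factory=dict)
    depends_on: tuple = ()


def plan_version(plan: dict) -> str:
    """Hash a canonical JSON form of the plan so any edit yields a new version."""
    canonical = json.dumps({k: asdict(v) for k, v in plan.items()}, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]


def execution_order(plan: dict) -> list:
    """Return a valid run order; graphlib raises CycleError on circular dependencies."""
    graph = {name: set(node.depends_on) for name, node in plan.items()}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical four-step plan.
plan = {
    "qc": Node(tool="quality_control"),
    "normalize": Node(tool="vst", depends_on=("qc",)),
    "diff_expr": Node(tool="deseq2", params={"fdr": 0.05}, depends_on=("normalize",)),
    "enrichment": Node(tool="gsea", depends_on=("diff_expr",)),
}
v1 = plan_version(plan)

# Swap the method on one node, as a user editing the plan might.
plan["diff_expr"] = Node(tool="limma_voom", params={"fdr": 0.05}, depends_on=("normalize",))
v2 = plan_version(plan)

order = execution_order(plan)
print(v1 != v2, order)
```

Hashing a canonical serialization is one simple way to make every edit to a node produce a distinct, comparable plan version.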
Parallelism fans out as dependencies unlock
Cortex agents fan out across independent steps and methods. A run that would take six hours sequentially completes in well under one, without dropping the diagnostic checks, normalization choices, or sensitivity passes a reviewer would expect. Breadth is paid for in parallel compute, not in reasoning quality.
Each step writes back a structured synthesis as it finishes: overview, method, key numbers, caveats, citations. Reviewing a run reads like reading a paper, not tailing logs.
Parallelism cuts wall-clock time. Each step writes back its own interpretation, so reviewing a run reads like reading a paper.
- 1. Differential Expression Analysis · done
- 2. Gene Symbol Intersection · done
- 3. Disease Progression Trend · running
- 4. Pathway Enrichment (ORA + GSEA) · done
- 5. PPI Network + Hub Genes · done
- 6. NASH Disease Signature · running
- 7. Connectivity Scoring (822 cmpds) · queued
- 8. Collapse to Best-per-Compound · queued
- 9. Integrated Drug Repurposing · queued
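The fan-out behaviour can be sketched with the standard library's `graphlib` and `concurrent.futures`. The dependency map between the nine steps listed above is a guess made for illustration (the page does not state it), and `run_step` is a placeholder that returns a stub synthesis instead of doing real work.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

# Hypothetical dependencies: step -> prerequisite steps.
DEPS = {1: set(), 2: {1}, 3: {1}, 4: {2}, 5: {2}, 6: {1}, 7: {6}, 8: {7}, 9: {4, 5, 8}}


def run_step(step: int) -> dict:
    """Placeholder for real work; returns a structured synthesis for the step."""
    return {"step": step, "overview": f"step {step} finished", "caveats": []}


def run_plan(deps: dict, max_workers: int = 4) -> list:
    """Submit each step as soon as all of its prerequisites have completed."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the plan is not a DAG
    finished = []
    pending = {}  # future -> step number
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            for step in sorter.get_ready():
                pending[pool.submit(run_step, step)] = step
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                step = pending.pop(future)
                finished.append(future.result())
                sorter.done(step)  # may unlock downstream steps
    return finished


syntheses = run_plan(DEPS)
completion_order = [s["step"] for s in syntheses]
print(completion_order)
```

Under this assumed dependency map, steps 2, 3, and 6 become eligible together once step 1 finishes, which is the sense in which work fans out as dependencies unlock.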
Three sequential analyses were performed on NASH vs healthy differential expression results from T1S1: (1) STRING protein-protein interaction network construction and hub gene identification, (2) WGCNA-style weighted co-expression network analysis on VST-normalized counts, and (3) druggability annotation of hub genes via DGIdb. All analyses used the same 57-sample GSE126848 cohort (NASH=16, NAFLD=15, healthy=14, obese=12).
- Input: 1,295 unique gene symbols from the 1,306 significant DE genes (NASH vs healthy, FDR < 0.05)
- STRING v12 API queried in batches of 400 genes; combined score threshold ≥ 700 (high confidence)
- Undirected weighted NetworkX graph constructed from returned interactions
- Self-loops removed; duplicate undirected edges deduplicated
- Largest connected component (LCC) extracted for centrality analysis
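The graph-construction steps above can be sketched as follows. To stay dependency-free, this version swaps plain dictionaries in for NetworkX, and it uses degree centrality as a stand-in because the summary does not say which centrality measure was applied. The gene labels and scores are toy values, not STRING output.

```python
from collections import defaultdict

MIN_SCORE = 700  # high-confidence combined-score threshold quoted above


def build_graph(interactions):
    """Undirected weighted graph stored as {node: {neighbor: score}}."""
    graph = defaultdict(dict)
    for a, b, score in interactions:
        if a == b or score < MIN_SCORE:
            continue  # drop self-loops and low-confidence edges
        # Storing both directions under one key dedupes (a, b) and (b, a).
        best = max(score, graph[a].get(b, 0))
        graph[a][b] = best
        graph[b][a] = best
    return dict(graph)


def largest_component(graph):
    """Return the node set of the largest connected component (iterative DFS)."""
    seen, best = set(), set()
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node].keys() - component)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def degree_centrality(graph, nodes):
    """Fraction of the other nodes in `nodes` that each node touches."""
    denom = max(len(nodes) - 1, 1)
    return {n: len(graph[n].keys() & nodes) / denom for n in nodes}


# Toy interactions: one duplicate edge, one self-loop, one sub-threshold edge.
toy = [("G1", "G2", 900), ("G2", "G1", 900), ("G2", "G3", 850),
       ("G3", "G3", 999), ("G1", "G3", 400), ("G4", "G5", 720)]
graph = build_graph(toy)
lcc = largest_component(graph)
centrality = degree_centrality(graph, lcc)
hub = max(centrality, key=centrality.get)
print(sorted(lcc), hub)
```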
Independent lines of evidence, ranked honestly
Cortex combines orthogonal scores: connectivity (signature reversal), PPI hub coverage, pathway enrichment. Each score probes a different aspect of the biology and uses different math, so a candidate that wins across all three is robust to single-method bias, not a happy accident.
When the signal-leader and the composite-leader disagree, Cortex surfaces the disagreement, explains what each score is measuring, and shows you the trade-off. You decide what matters for your program, with the reasoning on the page, not buried in a supplementary figure.
A compound that wins the composite isn't always the one with the loudest single signal, and the difference is what makes the ranking defensible.
Metformin
- Reversal: rank 16/128 · −0.272
- PPI hubs: 6 hubs · CDKN1A, CSF1R, FGF2, MMP9, SERPINE1, VCAM1
- Pathways: 3 hits · TP53, AMPK, Hippo
Theophylline
- Reversal: rank 1/128 · −0.313 ← strongest single
- PPI hubs: 2 hubs · CDKN1A, SERPINE1
- Pathways: 3 hits · GPCR, p53, caffeine
Theophylline wins raw reversal. Metformin wins composite: it converges across more orthogonal scores.
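To make the signal-leader versus composite-leader distinction concrete, here is a small sketch that uses the rank, hub, and pathway counts shown above. The composite formula (an unweighted mean of three scores, each scaled to the range 0 to 1) is an assumption chosen for illustration; the page does not specify how Cortex weights or normalizes its scores.

```python
# Values copied from the two candidate panels above.
candidates = {
    "metformin": {"reversal_rank": 16, "ppi_hubs": 6, "pathway_hits": 3},
    "theophylline": {"reversal_rank": 1, "ppi_hubs": 2, "pathway_hits": 3},
}
N_COMPOUNDS = 128  # size of the ranked list ("rank x/128")


def composite_scores(cands):
    """Unweighted mean of three normalized scores (assumed formula)."""
    max_hubs = max(c["ppi_hubs"] for c in cands.values())
    max_hits = max(c["pathway_hits"] for c in cands.values())
    scores = {}
    for name, c in cands.items():
        parts = [
            1 - (c["reversal_rank"] - 1) / (N_COMPOUNDS - 1),  # rank 1 maps to 1.0
            c["ppi_hubs"] / max_hubs,
            c["pathway_hits"] / max_hits,
        ]
        scores[name] = sum(parts) / len(parts)
    return scores


scores = composite_scores(candidates)
signal_leader = min(candidates, key=lambda n: candidates[n]["reversal_rank"])
composite_leader = max(scores, key=scores.get)

if signal_leader != composite_leader:
    print(f"Leaders disagree: {signal_leader} (reversal) vs {composite_leader} (composite)")
```

Under this assumed formula, the compound with the best reversal rank loses the composite because it covers fewer PPI hubs, which is the kind of disagreement worth surfacing explicitly.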
Every artefact has a paper trail you can verify
When a reviewer asks 'where did this number come from?', the answer is a graph, a hash, and a rerun.
Every step logs its code, inputs, outputs, and runtime environment. Every artefact carries a content hash and a run UUID. Lineage shows you how any figure was produced; the audit log proves it hasn't changed since. Byte-identical reruns six months later are demonstrable, not promised.
- recommendation_summary.json · T4S2/output · 4/19/26, 9:02:51 PM · Synced · Service · 7443274f
- final_recommendation.md · T4S2/output · 4/19/26, 9:02:51 PM · Synced · Service · 1eae4274
- summary_dashboard.png · T4S2/figures · 4/19/26, 9:02:51 PM · Synced · Service · 408b5e0d
- summary_dashboard.pdf · T4S2/figures · 4/19/26, 9:02:51 PM · Synced · Service · e55ae0b4
- sar_structural_path.pdf · T4S2/figures · 4/19/26, 9:02:51 PM · Synced · Service · 377a2c47
- molecule_grid_top_candidates.png · T4S2/figures · 4/19/26, 9:02:51 PM · Synced · Service · 28331cb7
- scaffold_ranking.csv · T4S2/output · 4/19/26, 9:02:51 PM · Synced · Service · 492baa68
- comparison_table.csv · T4S2/output · 4/19/26, 9:02:51 PM · Synced · Service · 47498e57
- integrated_recommendation_report.py · T4S2/scripts · 4/19/26, 9:02:51 PM · Synced · Service · f098653c
- Timestamp: 4/19/26, 9:02:51 PM
- Action: Synced
- Artifact: molecule_grid_top_candidates.png
- Actor: Service
- Size: 91 KB
- Step: T4S2
- Run: 6382f44e-bcb3-464b-8c0b-ebee964415eb
- Synced: ✓ Yes
- file id: 019da6e8-d969-7eb4-9657-aabb1388d9…
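One way to picture the content-hash and audit-log idea is the sketch below. It assumes SHA-256 as the hash and an in-memory list as the log; the entry fields mirror the labels in the record above, but the function names and the file contents are hypothetical and do not describe Cortex's storage format.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def content_hash(data: bytes) -> str:
    """SHA-256 of the raw bytes (the hash algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def record_artifact(name: str, data: bytes, step: str, run_id: str, log: list) -> dict:
    """Append an audit entry describing the artifact as it was written."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": "Synced",
        "artifact": name,
        "step": step,
        "run": run_id,
        "size": len(data),
        "sha256": content_hash(data),
    }
    log.append(entry)
    return entry


def verify(name: str, data: bytes, log: list) -> bool:
    """True if the bytes match the most recent hash logged for this artifact."""
    for entry in reversed(log):
        if entry["artifact"] == name:
            return entry["sha256"] == content_hash(data)
    return False  # never logged, so nothing to verify against


audit_log = []
run_id = str(uuid.uuid4())
original = b"rank,scaffold\n1,example\n"  # placeholder file contents

record_artifact("scaffold_ranking.csv", original, "T4S2", run_id, audit_log)
unchanged = verify("scaffold_ranking.csv", original, audit_log)
tampered = verify("scaffold_ranking.csv", original + b"x", audit_log)
print(unchanged, tampered)
```

Because the hash depends only on the bytes, a rerun that reproduces the same file reproduces the same hash, and any later modification fails verification.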
Runs where your data lives
Cortex deploys in your cloud or on-premise. Your data never leaves your tenancy. All code, all intermediate outputs, and all logs are stored in infrastructure you control, with your IAM, your encryption keys, and your audit trail.
Agents call LLMs through a privacy-mode gateway that redacts identifiers and prevents model-training retention. Full documentation is available for your security review.
BYOC deployment, SOC 2 compliant, no model training on your data.
- Your cloud (BYOC): AWS · GCP · Azure
- On-premise: Air-gapped supported
- LLM gateway: Privacy mode · PII redaction
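For a sense of what identifier redaction in front of an LLM call involves, here is a deliberately simple sketch. The two patterns, the `SUBJ-` identifier format, and the `redact` function are hypothetical; a production gateway would need a far broader, validated set of detectors than two regular expressions.

```python
import re

# Illustrative patterns only.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SUBJECT_ID": re.compile(r"\bSUBJ-\d{4,}\b"),  # hypothetical ID format
}


def redact(text: str):
    """Replace each match with a bracketed label; return the text and match counts."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


prompt = "Summarize labs for SUBJ-00123; send to pi@example.org."
clean, counts = redact(prompt)
print(clean)
```

Returning the match counts alongside the cleaned text lets the caller log what was removed without logging the identifiers themselves.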
Analysis breadth, not just depth
Twelve specialist agents span the modalities pharma teams actually work with. Each reasons about method fit for your data, applies cross-method concordance, and documents the decisions behind the result.
Bulk RNA-seq
Picks a differential-expression model that fits sample size and dispersion, reconciles methods for concordance, and flags batch effects.
Spatial transcriptomics
Tissue-registered expression and neighborhood analysis across major platforms, with statistics that respect spatial autocorrelation.
Pathway & TF activity
Infers pathway and transcription-factor activity with uncertainty bounds, reconciling hallmark, Reactome, and TF views of the same response.
Chromatin accessibility
Calls peaks, scans motifs, and tests differential accessibility while controlling for open-chromatin confounding.
DNA methylation
Detects differentially methylated regions and estimates epigenetic age, handling probe-level and bisulfite-conversion biases.
Variant analysis
Calls germline and somatic variants, annotates functional consequences, and prioritizes clinically actionable calls.
Proteomics
Quantifies label-free and TMT data with FDR control, tests differential abundance, and dissects PTM-level changes.
Metabolomics
Aligns LC-MS features, resolves identifications against reference databases, and maps hits to pathway context.
Microbiome
Profiles 16S and shotgun reads, handles compositionality, and reports diversity, taxonomy, and functional pathways.
Cheminformatics
QSAR, similarity and scaffold analysis, and ADMET inference, anchored to internal libraries and ChEMBL.
Multi-omics integration
Factor models and joint embeddings that separate shared biology from modality-specific signal.
Survival & biomarker modeling
Fits Cox and random-survival-forest models alongside Kaplan–Meier estimates, with calibrated, cohort-validated predictions.
See Cortex analyze your data
Bring a dataset. We'll show you what the pipeline, the provenance, and the defended result look like.