Cortex Analytics

Analysis that reasons about the science, then defends the result

Cortex reasons about your data, chooses methods that fit its biological and statistical structure, and assembles an executable plan. It runs in your environment, flags assumption violations as they emerge, and hands back every decision, parameter, and intermediate output that produced the result.

Plan → Run

Natural-language plans compile to executable DAGs

You describe the question. Cortex reasons about which methods fit the data, then shows you exactly what it proposes to run before a single byte is processed.

Tell Cortex what you want to investigate. It inspects the data, reviews current best practice for your modality, and proposes an analysis plan as a versioned DAG: tools, parameters, checkpoints, and dependencies, each chosen because it fits your study design. Edit any node, swap methods, inject your own code. Approve the plan and Cortex runs it in your compute environment, streaming intermediate outputs and logs as each step completes.
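As a rough illustration of what a plan expressed as a versioned DAG can look like, here is a minimal sketch using only the Python standard library. The node names are borrowed from the example run on this page, but the dependency edges, the `execution_order` and `plan_version` helpers, and the choice of a SHA-256 digest as the version identifier are all assumptions made for illustration, not Cortex's actual plan format.

```python
import hashlib
import json
from graphlib import CycleError, TopologicalSorter

# Hypothetical plan: each node maps to the set of nodes it depends on.
plan = {
    "differential_expression": set(),
    "pathway_enrichment": {"differential_expression"},
    "ppi_network": {"differential_expression"},
    "disease_signature": {"differential_expression"},
    "connectivity_scoring": {"disease_signature"},
    "drug_repurposing": {"pathway_enrichment", "ppi_network", "connectivity_scoring"},
}


def execution_order(plan):
    """Return one valid run order, or raise ValueError if the plan has a cycle."""
    try:
        return list(TopologicalSorter(plan).static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"plan is not a DAG: {exc.args[1]}") from exc


def plan_version(plan):
    """Derive a stable identifier from the plan's structure (assumed scheme)."""
    canonical = json.dumps({k: sorted(v) for k, v in plan.items()}, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]


order = execution_order(plan)
version = plan_version(plan)
```

Editing a node's dependencies changes the digest, so two people looking at the same identifier are looking at the same plan structure. A real plan would also need to fold tool names and parameters into the canonical form.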

Parallel Execution

Parallelism fans out as dependencies unlock

Cortex agents fan out across independent steps and methods. A run that would take six hours sequentially completes in well under one, without dropping the diagnostic checks, normalization choices, or sensitivity passes a reviewer would expect. The speedup comes from parallelism, not from thinner reasoning.

Each step writes back a structured synthesis as it finishes: overview, method, key numbers, caveats, citations. Reviewing a run reads like reading a paper, not tailing logs.

Parallelism cuts wall-clock time. Each step writes back its own interpretation, so reviewing a run reads like reading a paper.
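One way to picture steps starting as their dependencies unlock is the ready-set pattern below. It is a sketch under stated assumptions: the `run_parallel` function, the thread pool, and the four-step toy plan are hypothetical, and the page does not say how Cortex schedules work internally.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_parallel(plan, run_step, max_workers=4):
    """Run each step as soon as all of its dependencies have finished.

    Returns the step names in the order they completed.
    """
    sorter = TopologicalSorter(plan)
    sorter.prepare()
    finished, in_flight = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every step whose dependencies are all done.
            for step in sorter.get_ready():
                in_flight[pool.submit(run_step, step)] = step
            # Block until at least one running step completes.
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                step = in_flight.pop(future)
                future.result()    # re-raise any failure from the step
                sorter.done(step)  # unlocks the steps that depend on it
                finished.append(step)
    return finished


# Toy plan (hypothetical): node -> set of nodes it depends on.
toy_plan = {
    "de": set(),
    "enrichment": {"de"},
    "ppi": {"de"},
    "report": {"enrichment", "ppi"},
}
completed = run_parallel(toy_plan, run_step=lambda step: step.upper())
```

Here `enrichment` and `ppi` become ready together once `de` finishes, while `report` waits for both. The completion order of the two middle steps is not deterministic, which is why each step has to carry its own record of what it did.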

parallel run: 4 done · 2 running · 3 queued
  • 1. Differential Expression Analysis: done
  • 2. Gene Symbol Intersection: done
  • 3. Disease Progression Trend: running
  • 4. Pathway Enrichment (ORA + GSEA): done
  • 5. PPI Network + Hub Genes: done
  • 6. NASH Disease Signature: running
  • 7. Connectivity Scoring (822 cmpds): queued
  • 8. Collapse to Best-per-Compound: queued
  • 9. Integrated Drug Repurposing: queued
PPI Network Construction, Hub Gene and Module Detection
network-agent
synthesis
T2S2: PPI Network Construction, Co-expression Module Detection, and Druggable Hub Gene Identification in NASH vs Healthy
Overview

Three sequential analyses were performed on NASH vs healthy differential expression results from T1S1: (1) STRING protein-protein interaction network construction and hub gene identification, (2) WGCNA-style weighted co-expression network analysis on VST-normalized counts, and (3) druggability annotation of hub genes via DGIdb. All analyses used the same 57-sample GSE126848 cohort (NASH=16, NAFLD=15, healthy=14, obese=12).


1. PPI Network Construction
Method
  • Input: 1,295 unique gene symbols from the 1,306 significant DE genes (NASH vs healthy, FDR < 0.05)
  • STRING v12 API queried in batches of 400 genes; combined score threshold ≥ 700 (high confidence)
  • Undirected weighted NetworkX graph constructed from returned interactions
  • Self-loops removed; duplicate undirected edges deduplicated
  • Largest connected component (LCC) extracted for centrality analysis
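The construction steps listed above can be sketched in a few lines. The run described here used NetworkX on STRING API responses. The version below is a dependency-free stand-in that uses plain dictionaries, and its function names and toy interaction list are invented for illustration (placeholder node names, not STRING output).

```python
from collections import defaultdict

BATCH_SIZE = 400  # genes per query, as stated in the method above
MIN_SCORE = 700   # combined score threshold, as stated in the method above


def batches(genes, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` gene symbols."""
    for start in range(0, len(genes), size):
        yield genes[start:start + size]


def build_graph(interactions, min_score=MIN_SCORE):
    """Return {node: {neighbour: score}} for an undirected weighted graph.

    Drops edges below the threshold and self-loops, and collapses
    duplicate undirected edges (A-B and B-A), keeping the highest score.
    """
    adj = defaultdict(dict)
    for a, b, score in interactions:
        if score < min_score or a == b:
            continue
        best = max(score, adj[a].get(b, 0))
        adj[a][b] = best
        adj[b][a] = best
    return dict(adj)


def largest_connected_component(adj):
    """Return the node set of the largest connected component."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for neighbour in adj[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    stack.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy interactions, invented for illustration.
toy = [
    ("G1", "G2", 900), ("G2", "G1", 850),  # duplicate undirected edge
    ("G2", "G3", 720),
    ("G3", "G3", 999),                     # self-loop, dropped
    ("G4", "G5", 950),
    ("G1", "G4", 400),                     # below threshold, dropped
]
graph = build_graph(toy)
lcc = largest_connected_component(graph)
batch_sizes = [len(b) for b in batches(list(range(1295)))]
```

With 1,295 input symbols and batches of 400, the query splits into four requests, the last carrying 95 genes. Centrality is then computed only on the nodes in `lcc`.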
Convergent Evidence

Independent lines of evidence, ranked honestly

Cortex combines three orthogonal scores: connectivity (signature reversal), PPI hub coverage, and pathway enrichment. Each score probes a different aspect of the biology and uses different math, so a candidate that wins across all three is robust to single-method bias, not a happy accident.

When the signal-leader and the composite-leader disagree, Cortex surfaces the disagreement, explains what each score is measuring, and shows you the trade-off. You decide what matters for your program, with the reasoning on the page, not buried in a supplementary figure.

A compound that wins the composite isn't always the one with the loudest single signal, and the difference is what makes the ranking defensible.
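To make the signal-leader versus composite-leader distinction concrete, here is a minimal sketch. The ranks, hub counts, and pathway hit counts are the ones shown for the top two candidates on this page. The scoring formula (each score scaled to 0..1 and averaged with equal weights) is an assumption for illustration only: the page does not state how Cortex computes its composite, so these values will not reproduce the composite numbers shown in the panel.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    reversal_rank: int  # 1 = strongest signature reversal
    n_ranked: int       # number of compounds in the reversal ranking
    ppi_hubs: int
    pathway_hits: int


def composite(c, max_hubs, max_hits):
    """Equal-weight mean of three scores scaled to 0..1 (assumed formula)."""
    reversal = 1 - (c.reversal_rank - 1) / (c.n_ranked - 1)
    hubs = c.ppi_hubs / max_hubs
    pathways = c.pathway_hits / max_hits
    return (reversal + hubs + pathways) / 3


candidates = [
    Candidate("METFORMIN", reversal_rank=16, n_ranked=128, ppi_hubs=6, pathway_hits=3),
    Candidate("THEOPHYLLINE", reversal_rank=1, n_ranked=128, ppi_hubs=2, pathway_hits=3),
]
max_hubs = max(c.ppi_hubs for c in candidates)
max_hits = max(c.pathway_hits for c in candidates)
scores = {c.name: composite(c, max_hubs, max_hits) for c in candidates}

signal_leader = min(candidates, key=lambda c: c.reversal_rank).name
composite_leader = max(scores, key=scores.get)
disagreement = signal_leader != composite_leader  # the case worth surfacing
```

Even under this simplified formula the two leaders differ: the best reversal rank does not make up for covering a third as many hubs. That is the kind of disagreement worth surfacing rather than averaging away.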

convergent ranking: top 2 of 6 actionable candidates
#1 METFORMIN
composite 0.703

  • Reversal: rank 16/128 · −0.272
  • PPI hubs: 6 hubs · CDKN1A, CSF1R, FGF2, MMP9, SERPINE1, VCAM1
  • Pathways: 3 hits · TP53, AMPK, Hippo
#2 THEOPHYLLINE
composite 0.684

  • Reversal: rank 1/128 · −0.313 ← strongest single
  • PPI hubs: 2 hubs · CDKN1A, SERPINE1
  • Pathways: 3 hits · GPCR, p53, caffeine

Theophylline wins raw reversal. Metformin wins composite: it converges across more orthogonal scores.

3 orthogonal scores: Connectivity · PPI hubs · Enrichment
Defensible by design

Every artefact has a paper trail you can verify

When a reviewer asks 'where did this number come from?', the answer is a graph, a hash, and a rerun.

Every step logs its code, inputs, outputs, and runtime environment. Every artefact carries a content hash and a run UUID. Lineage shows you how any figure was produced; the audit log proves it hasn't changed since. Byte-identical reruns six months later are demonstrable, not promised.
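The hash-and-rerun idea can be sketched with the standard library alone. Everything named here is an assumption for illustration: the record fields, the helper functions, and the hash-chained log are one common way to make tampering detectable, not a description of Cortex's storage format. The artefact bytes are placeholders.

```python
import hashlib
import json
import uuid

GENESIS = "0" * 64  # placeholder "previous hash" for the first log entry


def content_hash(data: bytes) -> str:
    """SHA-256 digest of an artefact's bytes."""
    return hashlib.sha256(data).hexdigest()


def record_artefact(run_id: str, name: str, data: bytes) -> dict:
    """Describe an artefact by run, name, and content hash."""
    return {"run_id": run_id, "name": name, "sha256": content_hash(data)}


def verify_rerun(record: dict, rerun_data: bytes) -> bool:
    """True only if the rerun produced byte-identical output."""
    return record["sha256"] == content_hash(rerun_data)


def _entry_hash(prev: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev + body).encode()).hexdigest()


def append_entry(log: list, record: dict) -> None:
    """Append a record, chaining it to the hash of the previous entry."""
    prev = log[-1]["entry_hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "entry_hash": _entry_hash(prev, record)})


def log_is_intact(log: list) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["entry_hash"] != _entry_hash(prev, entry["record"]):
            return False
        prev = entry["entry_hash"]
    return True


run_id = str(uuid.uuid4())
artefact = b"placeholder artefact bytes"
audit_log = []
record = record_artefact(run_id, "figure_1.png", artefact)
append_entry(audit_log, record)
append_entry(audit_log, record_artefact(run_id, "table_1.csv", b"more placeholder bytes"))
```

Calling `verify_rerun(record, new_bytes)` after a rerun answers the byte-identical question directly, and `log_is_intact(audit_log)` answers whether any recorded hash was altered afterwards.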

Your Environment

Runs where your data lives

Cortex deploys in your cloud or on-premises. Your data never leaves your tenancy. All code, intermediate outputs, and logs are stored in infrastructure you control, with your IAM, your encryption keys, and your audit trail.

Agents call LLMs through a privacy-mode gateway that redacts identifiers and prevents your data from being retained for model training. Full documentation is available for your security review.

BYOC deployment, SOC 2 compliant, no model training on your data.

deployment: SOC 2 Type II
  • Your cloud (BYOC)
    AWS · GCP · Azure
  • On-premises
    Air-gapped supported
  • LLM gateway
    Privacy mode · PII redaction

Analysis breadth, not just depth

Twelve specialist agents span the modalities pharma teams actually work with. Each reasons about method fit for your data, checks concordance across methods, and documents the decisions behind the result.

Transcriptomics & expression
bulk-transcriptomics

Bulk RNA-seq

Picks a differential-expression model that fits sample size and dispersion, reconciles methods for concordance, and flags batch effects.

DESeq2 · edgeR · limma-voom
spatial-omics

Spatial transcriptomics

Tissue-registered expression and neighborhood analysis across major platforms, with statistics that respect spatial autocorrelation.

Visium · Xenium · MERFISH
enrichment

Pathway & TF activity

Infers pathway and transcription-factor activity with uncertainty bounds, reconciling hallmark, Reactome, and TF views of the same response.

GSEA · decoupler · ORA
Epigenomics & genomics
chromatin

Chromatin accessibility

Calls peaks, scans motifs, and tests differential accessibility while controlling for open-chromatin confounding.

MACS2 · HOMER · ChromVAR
dna-methylation

DNA methylation

Detects differentially methylated regions and estimates epigenetic age, handling probe-level and bisulfite-conversion biases.

minfi · methylKit · DMRcate
genomic-variant

Variant analysis

Calls germline and somatic variants, annotates functional consequences, and prioritizes clinically actionable calls.

GATK · DeepVariant · VEP
Proteomics, metabolomics & microbiome
proteomics

Proteomics

Quantifies label-free and TMT data with FDR control, tests differential abundance, and dissects PTM-level changes.

MaxQuant · MSstats · limma
metabolomics

Metabolomics

Aligns LC-MS features, resolves identifications against reference databases, and maps hits to pathway context.

XCMS · MetaboAnalyst · mummichog
microbiome

Microbiome

Profiles 16S and shotgun reads, handles compositionality, and reports diversity, taxonomy, and functional pathways.

QIIME 2 · DADA2 · MetaPhlAn
Chemistry, integration & modeling
cheminformatics

Cheminformatics

QSAR, similarity and scaffold analysis, and ADMET inference, anchored to internal libraries and ChEMBL.

RDKit · ChEMBL · QSAR
multi-omics-integration

Multi-omics integration

Factor models and joint embeddings that separate shared biology from modality-specific signal.

MOFA+ · DIABLO · MuVI
statistical-modeling

Survival & biomarker modeling

Fits Cox, Kaplan–Meier, and random-survival-forest models with calibrated, cohort-validated predictions.

survival · scikit-survival · lifelines

See Cortex analyze your data

Bring a dataset. We'll show you what the pipeline, the provenance, and the defended result look like.