CORTEX

Biomarker Discovery

Find molecular signatures that distinguish patient groups from high-dimensional omics data. Cortex runs multi-method differential expression, pathway enrichment, and effect size estimation, then cross-validates findings to separate robust biomarker candidates from noise.

What You Get

Deliverables

Candidate biomarkers with multi-method consensus

Differentially expressed genes identified through multi-method analysis and ranked by statistical confidence. For the TCGA-BRCA ancestry comparison, Cortex identified 11,424 significant DEGs (FDR<0.05) from 1,084 patients. The top hits, LOC90784 (FDR 3.99e-45), CROCCL1 (FDR 6.5e-39), and CRYBB2 (FDR 1.6e-37), represent the most statistically robust ancestry-associated expression differences in breast cancer.

Gene        log2FC   FDR
LOC90784    -1.00    3.99e-45
CROCCL1     +1.03    6.5e-39
CRYBB2      +1.24    1.6e-37
FAM3A       +0.81    1.2e-35
HEXDC       +1.08    1.9e-35

Effect sizes with confidence intervals

Beyond statistical significance, Cortex computes effect sizes (Cohen's d) with confidence intervals for key markers. The immune checkpoint analysis revealed OX40/TNFRSF4 with a large effect size (d=+0.82, AA>EA), followed by PD-1, CTLA-4, and LAG-3, all elevated in African American patients. These effect sizes provide the practical significance needed to assess biomarker utility in clinical settings.

Cohen's d (bar chart): LAG-3 +0.54, CTLA-4 +0.58, PD-1 +0.65, OX40/TNFRSF4 +0.82

Cross-validation and reproducibility metrics

The platform tests whether discovered signatures reproduce across subgroups. For the ancestry comparison, prognostic signature overlap was remarkably low: only 1.2% (2 shared genes out of 167), demonstrating that ancestry-specific prognostic models are needed. The null survival effect after clinical adjustment (HR 1.046, p=0.837) further shows that molecular differences don't always translate to outcome differences, a finding that's as important as discovering significant biomarkers.

NES (bar chart): Impaired BRCA2/PALB2 -2.17, Protein secretion -1.93, IFNα response +1.97, Oxidative phosphorylation +1.98, Mitochondrial translation +2.08

DECISION ENABLED

Prioritize candidates for validation assays based on statistical rigor and biological plausibility. Focus resources on biomarkers with large effect sizes, biological pathway support, and demonstrated cross-validation performance.

Sample Output

Breast cancer ancestry analysis: DEGs, pathways, and immune markers

TCGA-BRCA: Ancestry Comparison (AA vs EA), 1,084 patients
DEGs (FDR<0.05): 11,424
Patients: 1,084
Signature overlap: 1.2%
Volcano Plot: AA vs EA (11,424 DEGs). Axes: log₂ fold change (x), -log₁₀(FDR) (y). Labeled genes: LOC90784, CROCCL1, CRYBB2, PRSS45, DDX6.
Top Differentially Expressed Genes (ranked by FDR)
Gene        log2FC   FDR
LOC90784    -1.00    3.99e-45
CROCCL1     +1.03    6.5e-39
CRYBB2      +1.24    1.6e-37
FAM3A       +0.81    1.2e-35
HEXDC       +1.08    1.9e-35
NACA2       +1.27    2.0e-35
PRSS45      +1.51    4.2e-34
DDX6        -0.56    1.3e-32
SNRNP70     +0.74    6.8e-32
CDK10       +0.87    1.2e-30
Pathway Enrichment: BasalMyo Subtype (GSEA)
Pathway                      NES
Impaired BRCA2/PALB2         -2.17
Protein secretion            -1.93
IFNα response                +1.97
Oxidative phosphorylation    +1.98
Mitochondrial translation    +2.08
Immune Checkpoint Markers (Cohen's d effect sizes)
Marker          Cohen's d   Direction
OX40/TNFRSF4    +0.82       AA > EA
PD-1            +0.65       AA > EA
CTLA-4          +0.58       AA > EA
LAG-3           +0.54       AA > EA

Key finding: Ancestry drives significant molecular differences (11,424 DEGs) but survival effect is null after clinical adjustment (HR 1.046, p=0.837). Prognostic signature overlap between ancestries: only 1.2% (2 shared genes out of 167). This demonstrates analytical rigor; not every molecular difference translates to clinical outcomes.

ATOPIC DERMATITIS BIOMARKERS (GSE157194)

15-Gene Minimal Therapeutic Core: dupilumab + cyclosporine convergence

Genes suppressed by BOTH dupilumab and cyclosporine despite different mechanisms (165 samples, 57 patients):

S100A12, S100A7A, CCL20, IL36A, SPRR2A, SPRR2B, SPRR2D, SPRR2F
Categories: Inflammatory, Chemokine, Interleukin, Barrier
ALOX15: Pharmacodynamic Biomarker (Bidirectional), treatment-discriminating
log2FC: Dupilumab -1.53, Cyclosporine +1.18

ALOX15 moves in opposite directions under dupilumab (-1.53) vs cyclosporine (+1.18). This bidirectional response makes it a pharmacodynamic biomarker for distinguishing treatment mechanism at the molecular level.
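The distinction between convergent and bidirectional genes can be sketched as a simple rule on the two log2 fold changes. This is a minimal illustration, not Cortex's implementation: the function name, the labels, and the |log2FC| >= 1.0 threshold are all assumptions, and GENE_X is a made-up placeholder. Only the ALOX15 values come from the text above.

```python
def classify_response(lfc_a, lfc_b, threshold=1.0):
    """Label a gene by the direction of its log2FC under two treatments.

    `threshold` is a hypothetical cutoff on |log2FC|, not a documented setting.
    """
    if lfc_a <= -threshold and lfc_b <= -threshold:
        return "convergent_down"   # suppressed by both treatments
    if lfc_a >= threshold and lfc_b >= threshold:
        return "convergent_up"     # induced by both treatments
    if abs(lfc_a) >= threshold and abs(lfc_b) >= threshold:
        return "bidirectional"     # strong response, opposite signs
    return "unclassified"

# (dupilumab log2FC, cyclosporine log2FC)
genes = {
    "ALOX15": (-1.53, +1.18),  # values reported above
    "GENE_X": (-2.00, -1.50),  # made-up example of a convergent gene
}
labels = {gene: classify_response(a, b) for gene, (a, b) in genes.items()}
```

Under this rule ALOX15 is labeled bidirectional, while a gene suppressed by both drugs would fall into the convergent group.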

13-Gene LASSO Biomarker Panel (predictive model)
Genes: 13
Correlation: r=0.428
Samples: 165
Patients: 57

How It Works

Methodology

STEP 1

Quality control and normalization

RNA-seq count data undergoes quality control filtering (removing low-count genes and outlier samples) and normalization. For the TCGA-BRCA dataset, 1,084 patients across multiple ancestry groups were processed with a standard variance-stabilizing transformation.
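As a rough illustration of this step, the sketch below filters low-count genes and then normalizes. It deliberately swaps in log2 counts-per-million as a simpler stand-in for a variance-stabilizing transformation; it is not DESeq2's VST. The function names, the cutoffs (`min_count`, `min_samples`), and the toy counts are assumptions.

```python
import math

def filter_low_count_genes(counts, min_count=10, min_samples=2):
    """Keep genes with at least `min_count` reads in at least `min_samples` samples."""
    return {
        gene: row
        for gene, row in counts.items()
        if sum(1 for c in row if c >= min_count) >= min_samples
    }

def log_cpm(counts):
    """Scale each sample to counts per million, then take log2(CPM + 1)."""
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [math.log2(row[j] / library_sizes[j] * 1e6 + 1) for j in range(n_samples)]
        for gene, row in counts.items()
    }

# Toy matrix: gene -> raw counts across three samples (made-up numbers).
raw = {
    "GENE_A": [500, 480, 510],
    "GENE_B": [0, 1, 0],        # dropped by the low-count filter
    "GENE_C": [500, 520, 490],
}
kept = filter_low_count_genes(raw)
normalized = log_cpm(kept)
```

Outlier-sample detection is omitted here; in practice it would run before normalization.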

STEP 2

DESeq2 differential expression

Differential expression is run with multiple methods and appropriate covariates (clinical stage, subtype, age). DESeq2 identifies genes with significant expression differences between groups while controlling for these confounders. The analysis produced 11,424 DEGs at FDR<0.05.
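The FDR<0.05 cutoff refers to p-values adjusted for multiple testing. A common choice is the Benjamini-Hochberg procedure; whether Cortex uses exactly this adjustment is an assumption. The sketch below shows the adjustment on made-up p-values and does not reproduce the DESeq2 model itself.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up raw p-values for five genes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(fdr) if q < 0.05]
```

Note that two of the raw p-values fall below 0.05 but no longer pass after adjustment, which is the point of controlling the false discovery rate across thousands of genes.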

STEP 3

Multi-method consensus scoring

Results from multiple analytical approaches are compared for directional agreement. Genes with consistent results across methods receive higher confidence scores. This guards against a known problem: different analytical methods can disagree substantially on the same data.
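One simple way to express directional agreement is the fraction of methods that call a gene significant with the same sign. The scoring rule below is a hypothetical sketch, not Cortex's actual formula; the gene names, per-method values, and the `alpha` cutoff are made up for illustration.

```python
def consensus_score(results, alpha=0.05):
    """Score one gene from a list of (log2fc, fdr) tuples, one per method.

    Score = fraction of all methods that are significant and share the
    majority direction. Hypothetical rule for illustration only.
    """
    signs = [1 if lfc > 0 else -1 for lfc, fdr in results if fdr < alpha and lfc != 0]
    if not signs:
        return 0.0
    ups = signs.count(1)
    downs = len(signs) - ups
    return max(ups, downs) / len(results)

# Made-up results from three methods per gene.
per_gene = {
    "GENE_X": [(1.2, 1e-5), (0.9, 0.001), (1.1, 0.01)],   # all agree
    "GENE_Y": [(1.2, 1e-5), (-0.4, 0.03), (0.2, 0.40)],   # methods conflict
}
scores = {gene: consensus_score(r) for gene, r in per_gene.items()}
```

GENE_X scores 1.0 because every method agrees, while GENE_Y scores about 0.33 because only one of three methods supports either direction.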

STEP 4

Cross-validation assessment

Discovered signatures are tested for reproducibility across held-out subsets and independent subgroups. The 1.2% signature overlap between ancestries was discovered through systematic cross-validation, revealing that population-specific biomarker panels are needed.
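The overlap figure can be reproduced with basic set arithmetic. The sketch below assumes the 167 genes are the union of the two ancestry-specific signatures, which the text does not state explicitly. The gene names and the 85/84 split between signatures are placeholders chosen only so the totals match.

```python
def signature_overlap(sig_a, sig_b):
    """Return (shared count, union count, shared / union) for two gene signatures."""
    a, b = set(sig_a), set(sig_b)
    shared = a & b
    union = a | b
    fraction = len(shared) / len(union) if union else 0.0
    return len(shared), len(union), fraction

# Placeholder signatures: sizes are invented so that union = 167 and shared = 2.
sig_a = {f"GENE_{i}" for i in range(85)}        # GENE_0 .. GENE_84
sig_b = {f"GENE_{i}" for i in range(83, 167)}   # GENE_83 .. GENE_166

n_shared, n_union, fraction = signature_overlap(sig_a, sig_b)
overlap_pct = round(fraction * 100, 1)
```

With 2 shared genes out of 167, the fraction is 2/167, which rounds to the 1.2% reported above.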

STEP 5

Effect size estimation

Cohen's d effect sizes with confidence intervals are computed for all significant findings. This separates statistically significant but biologically trivial differences from markers with meaningful effect sizes suitable for clinical translation.
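A minimal sketch of this calculation follows, using the pooled standard deviation and a common large-sample approximation for the standard error of d. Whether Cortex uses this particular interval (rather than, say, a bootstrap) is an assumption, and the expression values are made up.

```python
import math
from statistics import mean, variance

def cohens_d_with_ci(group_a, group_b, z=1.96):
    """Cohen's d with an approximate 95% confidence interval.

    Uses the pooled standard deviation and the large-sample standard error
    sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))).
    """
    n1, n2 = len(group_a), len(group_b)
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d, (d - z * se, d + z * se)

# Made-up normalized expression values for one marker in two groups.
group_a = [5.1, 5.9, 6.3, 5.5, 6.2]
group_b = [4.8, 5.2, 5.0, 5.6, 4.9]
d, (ci_low, ci_high) = cohens_d_with_ci(group_a, group_b)
```

With only five samples per group the interval is wide, which is why an effect size is best read together with its confidence interval rather than on its own.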

Who This Is For

Target personas

Biomarker scientist

Discover and prioritize biomarker candidates from high-dimensional omics data with full statistical rigor.

Diagnostics lead

Evaluate biomarker candidates by effect size, reproducibility, and population specificity for diagnostic panel development.

Clinical trial designer

Use ancestry-specific molecular differences to inform enrichment strategies and patient selection criteria.