NEXUS on SageMaker JumpStart: Large Tabular Model for Enterprise Predictive AI

NEXUS: A foundation model that finally speaks tabular

TL;DR

  • NEXUS is a Large Tabular Model (LTM) pre-trained on billions of prediction tasks and over 10 billion table rows to predict directly from tables — numbers, categories, dates and short text — with reproducible outputs.
  • Available via Amazon SageMaker JumpStart and AWS Marketplace, it runs in a single‑tenant, network‑isolated SageMaker instance (ml.p5en.48xlarge with 8× NVIDIA H200 GPUs) with data kept in your S3 buckets.
  • Use it to shorten feature engineering cycles and accelerate production for fraud detection, credit scoring, clinical matching, predictive maintenance and demand forecasting — but pilot, benchmark, and pair with governance before you flip the switch.

Why tabular data deserves a specialist

Enterprises live in tables: ERPs, CRMs, billing systems and sensor lakes. That structured data carries the most direct signals for business outcomes, yet turning it into reliable predictions usually means months of feature engineering, fragile ETL, and bespoke models. Language‑first foundation models changed how we handle text; Large Tabular Models aim to do the same for rows and columns.

Think of NEXUS as a specialist trained on decades of patient charts rather than a general practitioner. It’s built to interpret table structures reliably and repeatably, something LLMs struggle with because they generate output by sampling and were not designed for precise numeric and categorical reasoning.

What NEXUS is — plain language

NEXUS is a foundation model purpose-built for tabular data prediction. It’s pre-trained on billions of real-world prediction tasks and more than 10 billion rows, giving it broad experience across many business datasets so teams can skip much of the manual feature work that typically slows projects.

  • Model type: Large Tabular Model (LTM) optimized for structured data.
  • Scale: Pre-trained on billions of tasks and 10+ billion table rows.
  • How to access: Amazon SageMaker JumpStart and AWS Marketplace.
  • Runtime: Single‑tenant, network‑isolated ml.p5en.48xlarge instance (8× NVIDIA H200 GPUs) on SageMaker.
  • SDK: Fundamental Python SDK exposing NEXUSClassifier and NEXUSRegressor with a scikit‑learn–style API (fit/predict/predict_proba).
  • Data residency: Data and outputs remain in your AWS account (S3) and inference happens inside the managed SageMaker endpoint.
  • Operational features: Deterministic predictions, permutation invariance, cross‑schema reasoning, billion‑row scale, autonomous data cleaning, and support for continuous retraining via SageMaker Pipelines.

NEXUS is presented as a foundation model built for tabular data prediction — pre-trained to detect patterns in structured datasets, reducing weeks or months of feature engineering.

How it works — explained for data teams and execs

Four architectural traits matter for business teams:

  • Deterministic predictions: Given the same inputs you get the same outputs — crucial for audits and reproducible decisioning.
  • Permutation invariance: Column or row order won’t break reasoning, so you don’t need to redesign tables for each dataset.
  • Cross‑schema reasoning: The model can transfer patterns learned from one table shape to another, reducing custom rework across business units.
  • Autonomous data cleaning: Built‑in handling for missing values, common unit mismatches and simple normalizations helps teams move faster.

These choices make NEXUS less about reinventing feature pipelines and more about applying a powerful, pre-trained predictor quickly against your S3 datasets inside a managed, secure environment.
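
A pilot can check the first two traits empirically rather than taking them on trust. The following is a minimal, SDK-independent sketch: `predict_fn` stands in for whatever call returns predictions for a list of rows, and `name_based_predict` and `position_based_predict` are hypothetical toy models used only to exercise the checks. None of these names come from the Fundamental SDK.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]
PredictFn = Callable[[List[Row]], Sequence[float]]


def is_deterministic(predict_fn: PredictFn, rows: List[Row], repeats: int = 3) -> bool:
    """True if repeated calls on identical input return identical output."""
    baseline = list(predict_fn(rows))
    return all(list(predict_fn(rows)) == baseline for _ in range(repeats))


def is_column_order_invariant(predict_fn: PredictFn, rows: List[Row]) -> bool:
    """True if reversing the column order leaves every prediction unchanged."""
    baseline = list(predict_fn(rows))
    reordered = [{col: row[col] for col in reversed(list(row))} for row in rows]
    return list(predict_fn(reordered)) == baseline


def is_row_order_invariant(predict_fn: PredictFn, rows: List[Row]) -> bool:
    """True if reversing the rows simply reverses the predictions."""
    baseline = list(predict_fn(rows))
    return list(predict_fn(rows[::-1])) == baseline[::-1]


# Hypothetical stand-ins for a real predict call (assumptions, not SDK code).
def name_based_predict(rows: List[Row]) -> List[float]:
    # Looks columns up by name, so column order cannot matter.
    return [0.001 * float(r["amount"]) + (0.5 if r["country"] == "XX" else 0.0) for r in rows]


def position_based_predict(rows: List[Row]) -> List[float]:
    # Reads "the first column", so reordering columns changes the result.
    return [float(len(str(next(iter(r.values()))))) for r in rows]


sample = [{"amount": 120.0, "country": "XX"}, {"amount": 15.5, "country": "YY"}]
print(is_deterministic(name_based_predict, sample))
print(is_column_order_invariant(name_based_predict, sample))
print(is_column_order_invariant(position_based_predict, sample))
```

The name-based toy model passes both checks, while the position-based one fails the column-order check. Running the same three functions against a real endpoint wrapper would show whether the advertised properties hold on your own data.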

Deployment on SageMaker JumpStart and AWS Marketplace

Deployment is designed to preserve enterprise controls. NEXUS runs in a single‑tenant, network‑isolated ml.p5en.48xlarge instance (8× NVIDIA H200) on SageMaker. Your data stays in your S3 buckets, and model operations are executed inside your AWS environment via managed endpoints. For many organizations, that combination — a specialized tabular model plus cloud‑native governance — is the on‑ramp they’ve been waiting for.

Deployment on SageMaker preserves customer data inside their AWS environment via a network‑isolated, single‑tenant instance, and the Fundamental SDK exposes scikit‑learn–style workflows for familiar developer ergonomics.

Developer ergonomics — quick example

The Fundamental SDK exposes a scikit‑learn–style API. A minimal example (pseudocode) looks like:

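# Pseudocode: parameter names are illustrative; check the SDK reference before use.
from fundamental import NEXUSClassifier
# Pick a model variant and fit on training data stored in S3.
clf = NEXUSClassifier(model_variant='base')
clf.fit(s3_input_path='s3://your-bucket/train.csv')
# Class predictions and class probabilities for held-out data.
preds = clf.predict(s3_input_path='s3://your-bucket/test.csv')
probs = clf.predict_proba(s3_input_path='s3://your-bucket/test.csv')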

This familiar surface helps data scientists run fit/predict cycles without rebuilding entire pipelines — while keeping data inside S3 and compute inside a managed SageMaker endpoint.

Practical use cases — what teams can try first

  • Fraud detection: Reduce feature engineering cycles and produce repeatable scoring for investigations and regulatory reporting.
  • Credit risk / underwriting: Deterministic outputs help with audit trails and consistent decisioning.
  • Clinical trial matching: Precise matching across many structured fields speeds enrollment and compliance.
  • Predictive maintenance: Scale to billions of sensor rows to predict equipment failures and schedule interventions.
  • Demand forecasting & pricing: Cross‑schema reasoning helps transfer learnings across product lines and markets.

Mini vignette: a payments team with a complex fraud stack can run NEXUS as a shadow model for 4–6 weeks, compare false positive rates and chargeback reduction, then operationalize if the deterministic outputs and audit logs meet compliance needs.

Pilot & evaluation checklist

Run a focused pilot before full migration. A 4–6 week pilot structure works well:

  1. Select 1–2 high‑value use cases with clear KPIs (e.g., reduce false positives, improve precision@k, reduce chargebacks).
  2. Prepare a sanitized dataset in S3 and split it into a historical set for backtesting and a held‑back stream for live shadow testing.
  3. Run parallel A/B tests: NEXUS vs your best existing model. Measure accuracy, calibration, latency, and business metrics.
  4. Test explainability: collect feature impact, per‑prediction logs, and run interpretability audits on edge cases.
  5. Assess cost: log GPU hours, endpoint utilization, and compute required per 1M predictions.
  6. Document governance artifacts: model card, versioning, input/output lineage and retraining triggers.
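
The parallel test in step 3 can start with a simple disagreement report between the production model and the shadow model. This is a minimal sketch under stated assumptions: decisions are binary flags keyed by a record ID, and the `ShadowReport` and `compare_shadow` names are hypothetical helpers, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ShadowReport:
    n_compared: int
    disagreement_rate: float
    disagreeing_ids: List[str]


def compare_shadow(production: Dict[str, int], shadow: Dict[str, int]) -> ShadowReport:
    """Compare decisions keyed by record ID; only IDs scored by both models count."""
    shared = sorted(production.keys() & shadow.keys())
    differing = [rid for rid in shared if production[rid] != shadow[rid]]
    rate = len(differing) / len(shared) if shared else 0.0
    return ShadowReport(len(shared), rate, differing)


# Illustrative data: 1 = flagged, 0 = not flagged.
production_flags = {"t1": 1, "t2": 0, "t3": 0, "t4": 1}
shadow_flags = {"t1": 1, "t2": 1, "t3": 0, "t5": 0}
report = compare_shadow(production_flags, shadow_flags)
print(report.n_compared, report.disagreeing_ids)
```

The disagreeing IDs are usually the most useful output, because they are the records to send for manual review before trusting either model's aggregate metrics.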

Key metrics to capture during the pilot:

  • ROC AUC, precision@k, recall, false positive rate
  • Calibration and confidence reliability
  • End‑to‑end latency and throughput
  • Cost per 1,000 / 1M predictions (inference + storage + orchestration)
  • Retraining time and engineering hours saved vs custom pipelines
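
For the classification metrics above, a dependency-free sketch such as the following can be applied identically to NEXUS and to the incumbent model so the comparison is like for like. It assumes binary labels (0/1) and scores where higher means more likely positive. The function names are illustrative, and a library such as scikit-learn would normally be used in practice.

```python
from typing import Sequence


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FP / (FP + TN): share of true negatives that were wrongly flagged."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


def precision_at_k(y_true: Sequence[int], scores: Sequence[float], k: int) -> float:
    """Share of true positives among the k highest-scored records."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared error of predicted probabilities; lower means better calibrated."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
flags = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 is an arbitrary example threshold
print(false_positive_rate(labels, flags), precision_at_k(labels, scores, 2))
print(roc_auc(labels, scores), brier_score(labels, scores))
```

The pairwise AUC here is quadratic in the number of records, which is fine for a pilot sample but not for full production volumes.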

Governance, explainability and risks

Foundation models change the tradeoffs: you gain speed but need explicit guardrails. Recommended artifacts and tooling:

  • Model card & versioning: Document training scope, known limitations and intended uses.
  • Lineage & provenance: Capture S3 input locations, pre‑processing steps, SDK version and endpoint logs.
  • Explainability: Use local surrogate explanations, SHAP/feature attributions, and example‑based explanations for high‑impact decisions.
  • Monitoring: Use SageMaker Model Monitor for drift, latency and data quality alerts; integrate fairness and bias checks periodically.
  • Audit trails: Maintain deterministic prediction logs and model hashes for regulatory review.
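
To make the drift item concrete, here is a minimal sketch of one widely used drift statistic, the Population Stability Index (PSI), computed for a single numeric feature. It is offered as an illustration of the idea only. It is not a description of how SageMaker Model Monitor computes drift, and the bin count and floor value are assumptions.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        n_bins: int = 10, floor: float = 1e-6) -> float:
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # constant baseline: everything lands in one bin

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # Floor empty bins so the logarithm below is always defined.
        return [max(c / len(values), floor) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]        # training-time feature values
shifted = [0.5 + i / 200 for i in range(100)]   # live values concentrated in the upper half
print(psi(baseline, baseline), psi(baseline, shifted))
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but those thresholds are conventions to be tuned per feature, not guarantees.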

Deterministic predictions matter where consistency is required: credit scoring, fraud flags and clinical matching demand reproducible outputs and an auditable trail.
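
One way to turn that requirement into an artifact is a per-prediction audit record that hashes the input row and stores it alongside the model and SDK versions. This is a sketch under stated assumptions: the field names, the version strings and the S3 URI are hypothetical, and the record format is not something the SDK prescribes.

```python
import hashlib
import json
from typing import Any, Dict


def canonical_hash(payload: Dict[str, Any]) -> str:
    """SHA-256 of a JSON encoding with sorted keys, so key order does not change the hash."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def audit_record(row: Dict[str, Any], prediction: float, model_version: str,
                 sdk_version: str, input_s3_uri: str) -> Dict[str, Any]:
    """Build a reproducible log entry: same row and versions give the same record."""
    return {
        "input_hash": canonical_hash(row),
        "prediction": prediction,
        "model_version": model_version,
        "sdk_version": sdk_version,
        "input_s3_uri": input_s3_uri,
    }


# All values below are placeholders for illustration.
record = audit_record(
    row={"amount": 120.0, "country": "XX"},
    prediction=0.62,
    model_version="model-v1",
    sdk_version="0.0.0",
    input_s3_uri="s3://your-bucket/test.csv",
)
print(json.dumps(record, indent=2))
```

Because the record contains no timestamp, replaying the same input against the same model version should reproduce it exactly, which is what makes a mismatch meaningful during an audit. A timestamp can be stored next to the record rather than inside it.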

Cost & TCO considerations

Running on ml.p5en.48xlarge (8× NVIDIA H200) is high‑compute and carries meaningful GPU costs for sustained workloads. Practical cost tactics:

  • Shadow and batch inference: Run in shadow mode for validation and use batched inference to amortize GPU usage.
  • Endpoint multiplexing: Host multiple fit/predict jobs on one endpoint when workloads are intermittent.
  • Right‑size validation: Use smaller instances for initial experiments and the full H200 stack only for production or large retraining runs.
  • Spot or scheduled retraining: Use spot instances or scheduled windows for non‑critical retraining jobs to reduce costs.
  • TCO checklist: Add engineering time saved from reduced feature engineering and faster time‑to‑value when comparing to bespoke stacks.

Estimate TCO by combining GPU hourly rates, storage and data transfer, plus opportunity cost saved in faster deployments. For many high‑value use cases, the productivity gain can offset compute costs — but validate with a pilot.
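
The arithmetic behind that estimate is simple enough to sketch. Every number below is a placeholder chosen for round figures, not an actual AWS price or a measured NEXUS throughput. Substitute the current hourly rate for your region and the throughput you measure in the pilot.

```python
def cost_per_million_predictions(hourly_rate_usd: float,
                                 predictions_per_hour: float,
                                 utilization: float = 1.0) -> float:
    """Compute cost in USD per 1M predictions at a given endpoint utilization."""
    if predictions_per_hour <= 0:
        raise ValueError("predictions_per_hour must be positive")
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    effective_throughput = predictions_per_hour * utilization
    return hourly_rate_usd / effective_throughput * 1_000_000


def monthly_tco(endpoint_hours: float, hourly_rate_usd: float,
                storage_usd: float, transfer_usd: float,
                engineering_savings_usd: float = 0.0) -> float:
    """Compute + storage + data transfer, net of estimated engineering time saved."""
    compute_usd = endpoint_hours * hourly_rate_usd
    return compute_usd + storage_usd + transfer_usd - engineering_savings_usd


# Placeholder inputs (assumptions, not real prices or benchmarks).
PLACEHOLDER_HOURLY_RATE_USD = 100.0
PLACEHOLDER_PREDICTIONS_PER_HOUR = 2_000_000

per_million = cost_per_million_predictions(
    PLACEHOLDER_HOURLY_RATE_USD, PLACEHOLDER_PREDICTIONS_PER_HOUR, utilization=0.5
)
total = monthly_tco(endpoint_hours=200, hourly_rate_usd=PLACEHOLDER_HOURLY_RATE_USD,
                    storage_usd=50.0, transfer_usd=25.0)
print(round(per_million, 2), total)
```

The utilization term is the one that tends to dominate: an endpoint that sits idle half the time doubles the effective cost per prediction, which is why batching and scheduling matter more than small differences in hourly rate.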

When not to use NEXUS

NEXUS is not a silver bullet. Consider alternatives when:

  • You need causal inference or bespoke time‑series/forecasting architectures that embed domain causal graphs.
  • Your compliance posture requires an explainability level that current tooling cannot satisfy.
  • Prediction volumes are tiny and the overhead of H200 instances outweighs benefits — a simple XGBoost or light model may be cheaper and faster.
  • Your team has already invested heavily in customized, proprietary features grounded in domain‑specific physics or financial rules, where transfer learning offers limited benefit.

Questions you should answer before production

  • How transparent and explainable are NEXUS predictions for audits and regulators?

    Explainability depends on the governance and tooling you layer around NEXUS. Implement per‑prediction logging, surrogate explanations and model cards to satisfy audits, and validate any industry‑specific model variants on your own data.

  • What are the pricing and TCO implications of running ml.p5en.48xlarge instances?

    GPU costs for 8× H200 instances are significant for sustained workloads. Reduce cost with batched inference, endpoint multiplexing, spot retraining and right‑sizing experiments to estimate real TCO versus current stacks.

  • How should organizations validate and benchmark NEXUS against existing models?

    Run parallel A/B tests on historical and live data, measure business KPIs and calibration, track latency and cost per prediction, and include compliance checks before migrating to production.

  • What governance, bias detection and data lineage tooling is required?

    Expect to integrate SageMaker Model Monitor, model cards, lineage logging, fairness checks and periodic drift/bias audits. Vendor tooling plus internal MLOps processes will be necessary.

  • Are there limitations on table schemas or causal tasks?

    NEXUS excels at structured prediction and cross‑schema transfer, but causal inference and some time‑series causal tasks may still need specialist modeling or domain‑specific layers.

Next steps for teams

Pilot NEXUS on a single high‑value use case for 4–6 weeks: prepare an S3 dataset, run a shadow A/B test against your best model, capture the metrics listed above, and build the governance artifacts. If accuracy, reproducibility and TCO look favorable, scale to additional workflows while retaining strict monitoring and version control.

Fundamental’s NEXUS and AWS’ SageMaker integration represent a pragmatic path to accelerate AI for business — reducing undifferentiated engineering and giving teams a managed route to production. For enterprise leaders, the immediate decision is tactical: pick a measurable pilot, validate ROI, and harden governance before broader rollout.

Contributors to the SageMaker launch and guidance include Vivek Gangasani, Hazim Qudah and Jimmy Shah from AWS, reflecting the close partnership operationalizing tabular AI for enterprise customers.