Research

Hard problems

To replace Oracle and SAP, you have to build something that has never existed. Not incrementally better software. A fundamentally different kind of system. One that doesn't just record the state of the business, but comprehends it, predicts it, simulates it, and acts on it.

The first seven problems are infrastructure: the systems you have to build before intelligence is possible. The next eight are the intelligence problems themselves. Together they constitute the full CENTARI stack.

Score: 10/10 = solved in production. 5/10 = approach proven, hard work ahead. Lower = open research.
Problems 1 — 7

Infrastructure

The substrate. These are the systems you have to build before intelligence is possible. Mostly solved, but the hard edges remain.

[01]

Composable Data Platform

10/10

A hybrid transactional-analytical database that adapts its physical schema per tenant, supports real-time OLAP without ETL, and dual-writes every mutation as cell-level AI training events.

Why it's hard
* Physicalized adaptive schema. Every tenant gets real Postgres tables with real indexes, not an EAV store. Schema changes must propagate without downtime.
* HTAP without compromise. Transactional writes and analytical reads on the same live data, no ETL pipeline, no sync lag.
* Cell-level event generation. Every mutation becomes a training event for CENTARI without degrading OLTP performance.
This is the substrate that makes cell-level AI training possible without impacting OLTP performance. No other enterprise system has this architecture.
[02]

Agentic HTAP

8/10

ZebraDB.ai. A database that agents can clone, build on, test against, and deploy to production through the same typed API that humans use.

Why it's hard
* Agent-safe DDL. Agents need to propose schema changes that are validated, sandboxed, and reversible before touching production.
* Dual-write CENTARI Data Log. Every mutation becomes an EAV-formatted training event in real time.
* MCP server interface. SQL, RLS/FLS, clone/branch/merge, all exposed as typed tool calls.
The database is the foundation of every enterprise system. Making it agent-native means agents can build and operate applications end-to-end.
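To make the typed-tool-call idea concrete, here is a minimal sketch of how database operations like clone/branch might be exposed to agents as validated tool calls. The tool names, parameters, and registry here are illustrative assumptions, not ZebraDB's actual API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    """A typed tool an agent can call; inputs are validated before execution."""
    name: str
    required_params: set[str]
    handler: Callable[..., dict]

def call_tool(registry: dict[str, Tool], name: str, params: dict) -> dict:
    """Dispatch a tool call, rejecting it if required parameters are missing."""
    tool = registry[name]
    missing = tool.required_params - params.keys()
    if missing:
        raise ValueError(f"missing params: {sorted(missing)}")
    return tool.handler(**params)

# Hypothetical tool: branch a database so an agent can build and test safely.
def clone_branch(source: str, branch: str) -> dict:
    return {"status": "created", "branch": f"{source}/{branch}"}

registry = {
    "clone_branch": Tool("clone_branch", {"source", "branch"}, clone_branch),
}

result = call_tool(registry, "clone_branch", {"source": "prod", "branch": "agent-42"})
# result == {"status": "created", "branch": "prod/agent-42"}
```

The point of the typed boundary is that an agent can never issue a malformed or unauthorized operation: validation happens before the handler runs, on the same interface humans use.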
[03]

Hi-TX Durable Execution

8/10

A workflow engine where every execution is durable, retriable, and observable, and every step emits an OCEL 2.0 event that feeds the foundation models.

Why it's hard
* Durable execution at enterprise scale. Thousands of concurrent workflows with automatic state persistence and exactly-once semantics.
* OCEL 2.0 event emission. Every workflow step generates a structured event. The orchestration layer is simultaneously an event stream generator.
* Human-agent composition. The same workflow handles human approvals and agent actions through the same typed interface.
The workflow engine is the nervous system. Without durable execution, you can't close the loop between intelligence and action.
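The step-to-event mapping can be sketched as follows. The shape loosely follows OCEL 2.0's event/object model (an event typed by activity, related to multiple objects via qualifiers); the field names are simplified for illustration and are not the exact OCEL serialization:

```python
import json
from datetime import datetime, timezone

def emit_ocel_event(event_type: str, objects: list[tuple[str, str]], attrs: dict) -> str:
    """Serialize one workflow step as an OCEL-2.0-style event.

    `objects` pairs an object id with a relationship qualifier, mirroring
    OCEL 2.0's event-to-object relationships."""
    event = {
        "type": event_type,
        "time": datetime.now(timezone.utc).isoformat(),
        "attributes": attrs,
        "relationships": [
            {"objectId": oid, "qualifier": q} for oid, q in objects
        ],
    }
    return json.dumps(event)

# A purchase-order approval step relating the order and the approver.
raw = emit_ocel_event(
    "approve_po",
    [("po-1001", "order"), ("user-7", "approver")],
    {"amount": 1250.0},
)
event = json.loads(raw)
```

Because every step emits an event like this, the workflow engine doubles as the process-mining corpus: the foundation models train on the same stream the auditors read.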
[04]

Meta-Ledgers

8/10

Everything is a ledger. Every state change, every transaction, every decision, recorded in append-only ledgers that serve as both audit trail and training corpus.

Why it's hard
* Append-only at scale. Ledgers grow fast. Multi-tenant ledger storage with efficient querying across billions of entries.
* Causal context. Every ledger entry must capture why the change happened, which workflow, which trigger, which upstream event.
* Regulatory compliance. Financial ledgers must satisfy SOX, GAAP, and industry-specific audit requirements natively.
Ledgers are the bridge between auditability and intelligence. The CENTARI Data Log, the compliance layer, and the training pipeline all flow from the same append-only substrate.
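A minimal sketch of the append-only idea, assuming each entry carries causal context and a hash chaining it to its predecessor (the field names are illustrative, not the actual Meta-Ledger format):

```python
import hashlib
import json

class Ledger:
    """Append-only ledger: each entry records what changed and why, and a
    hash chains it to the previous entry, so tampering is detectable."""
    def __init__(self):
        self.entries = []

    def append(self, change: dict, cause: dict) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        body = {"change": change, "cause": cause, "prev": prev}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        entry = {**body, "hash": digest}
        self.entries.append(entry)
        return entry

ledger = Ledger()
ledger.append(
    {"table": "invoices", "id": 42, "field": "status", "value": "paid"},
    {"workflow": "collections", "trigger": "payment_received"},
)
```

The same structure serves both masters: the hash chain satisfies the auditor, and the `cause` field gives the training pipeline the "why" behind every state change.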
[05]

ZSL: System of Schema

7/10

Terraform for enterprise applications. A single declaration fans out to data, logic, and applications simultaneously. The configbase IS the codebase.

Why it's hard
* Full type system with static analysis. Parser, type-checker, and compiler catch errors before deployment.
* Plan/Apply semantics. Like Terraform: preview what will change before deploying. Deterministic plans with diff output.
* E2E deployment. A single zsl apply deploys schemas to ZebraDB, workflows to ZFlow, and permissions to the auth layer.
ZSL is what makes the platform self-extending. New applications are declarations, not engineering projects. This is the Type 2 system realized.
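The plan/apply mechanic reduces to a deterministic diff between the declared (desired) state and the deployed (current) state. A toy sketch, using plain dicts in place of real ZSL declarations:

```python
def plan(current: dict, desired: dict) -> list[tuple[str, str]]:
    """Terraform-style plan: a deterministic diff of desired vs. current
    declarations, previewed before anything is applied."""
    actions = []
    for name in sorted(desired.keys() - current.keys()):
        actions.append(("create", name))   # declared but not yet deployed
    for name in sorted(current.keys() - desired.keys()):
        actions.append(("destroy", name))  # deployed but no longer declared
    for name in sorted(current.keys() & desired.keys()):
        if current[name] != desired[name]:
            actions.append(("update", name))
    return actions

current = {"invoices": {"fields": ["id", "amount"]}}
desired = {"invoices": {"fields": ["id", "amount", "due_date"]},
           "payments": {"fields": ["id", "invoice_id"]}}
actions = plan(current, desired)
# actions == [("create", "payments"), ("update", "invoices")]
```

Determinism is the property that matters: the same pair of states always yields the same plan, so a reviewed plan is a faithful preview of what apply will do.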
[06]

CI/CD of the Configbase

7/10

The Agentic Application SDLC. The configbase has a filesystem (ZSL files), a compiler (static analysis), a test suite (generated UTs), a deployment pipeline (topological migration), and version control (branching/merging).

Why it's hard
* Topological migration ordering. Schema changes must be applied in dependency order across tables, workflows, and permissions.
* Generated test suites. Every ZSL change automatically generates unit tests that validate the migration path.
* Branch/merge semantics for enterprise config. Two teams editing different parts of the same application need to merge without breaking each other.
This is what makes continuous deployment safe for enterprise applications. Every change is validated, tested, and reversible before it touches production.
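Topological migration ordering is a standard dependency-sort problem. A sketch using Python's stdlib, with a hypothetical dependency graph spanning tables, workflows, and permissions:

```python
from graphlib import TopologicalSorter

# Each object lists what it depends on. A migration must create tables
# before the workflows that use them, and workflows before the
# permissions scoped to them. Names here are illustrative.
deps = {
    "invoices_table": set(),
    "payments_table": {"invoices_table"},        # FK to invoices
    "collections_workflow": {"payments_table"},
    "collections_permissions": {"collections_workflow"},
}

order = list(TopologicalSorter(deps).static_order())
# Every object appears after everything it depends on.
```

A cycle in the graph (mutually dependent declarations) raises an error before anything is applied, which is exactly the failure mode you want: reject at plan time, not mid-migration.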
[07]

Context Christmas Tree

7/10

Semantic cache invalidation. The context tree grounds foundation model predictions in the actual state of the business, not stale snapshots.

Why it's hard
* Semantic cache invalidation is a state-of-the-art research problem. When a price changes, which predictions are stale? The dependency graph is dynamic and deep.
* Real-time context assembly. Every prediction needs fresh context from ZebraDB, ZFlow state, and upstream events, assembled in milliseconds.
* Cross-entity context. A demand forecast for Product A depends on Supplier B's lead times, Warehouse C's capacity, and Customer D's order patterns.
The tree is what grounds CENTARI's predictions in the actual state of the business. Without it, the models are reasoning over stale data.
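The core invalidation step can be sketched as a walk over the dependency graph: when a fact changes, mark every prediction that transitively depends on it as stale. The node names below are hypothetical:

```python
from collections import deque

def stale_after(change: str, downstream: dict[str, set[str]]) -> set[str]:
    """Walk the dependency graph from a changed fact and collect every
    cached prediction that transitively depends on it."""
    stale, queue = set(), deque([change])
    while queue:
        node = queue.popleft()
        for dep in downstream.get(node, ()):
            if dep not in stale:
                stale.add(dep)
                queue.append(dep)
    return stale

# Price change -> margin forecast -> demand forecast; lead time untouched.
downstream = {
    "price:sku-1": {"forecast:margin"},
    "forecast:margin": {"forecast:demand"},
    "leadtime:supplier-b": {"forecast:stockout"},
}
stale = stale_after("price:sku-1", downstream)
# stale == {"forecast:margin", "forecast:demand"}
```

The traversal is the easy part. The hard part the text describes is maintaining `downstream` itself: the edges are implied by formulas, workflows, and model inputs, and they change as the schema changes.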
Problems 8 — 15

Intelligence

The frontier. Foundation models, simulation engines, training pipelines, and human-agent parity. The problems that turn infrastructure into intelligence.

[08]

Cell-Level Tokenization

7/10

The CENTARI Data Log. Every mutation to a Z-Table is simultaneously written as an append-only cell-change event with causal context, bridging the system of record and the foundation models.

Why it's hard
* Dual-write without degrading OLTP performance. The hot path can't slow down.
* Causal context capture. Every cell change needs to know why it happened: which workflow, which trigger, which upstream event.
* Schema evolution. When ZSL changes the schema, the Data Log format has to adapt without breaking the training pipeline.
* Volume. A mid-market customer doing 10K transactions/day generates millions of cell-change events/month.
This is the architectural innovation that makes CENTARI practical. Without it, you need batch ETL to move data from the system of record to the training pipeline. The Data Log makes every operational transaction a training event in real time, with zero ETL.
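A minimal sketch of what one cell-change event might carry, assuming the fields described above (old value, new value, and causal context); the exact Data Log record format is not public, so this shape is illustrative:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class CellChangeEvent:
    """One mutation to one cell, with the causal context the training
    pipeline needs: which workflow and trigger produced it."""
    table: str
    row_id: int
    column: str
    old_value: object
    new_value: object
    workflow: str
    trigger: str

event = CellChangeEvent(
    table="invoices", row_id=42, column="status",
    old_value="open", new_value="paid",
    workflow="collections", trigger="payment_received",
)
record = asdict(event)  # ready to append to the Data Log
```

Emitting this record on the write path, rather than reconstructing it later from snapshots, is what eliminates the ETL step: the training corpus is a byproduct of normal operation.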
[09]

Foundation Models for Enterprise Data

5/10

The Large Tabular Model. A relational foundation model trained on cell-change events with graph-aware attention, producing predictions about the state of the enterprise.

Why it's hard
* Schema transfer. The model has to generalize across tenants with different schemas. It needs to learn relational structure, not field names.
* Graph-aware attention. Attention masks derived from schema topology (PK/FK relationships, formula dependencies). Novel architecture with no off-the-shelf implementation.
* Cross-tenant transfer. Every new customer should benefit from what the model learned on previous customers. Proving this works at scale is open research.
* Cold start. A new customer with no historical data still needs predictions.
The LTM is what turns DOSS from a system of record into a system of intelligence. Collections risk, stockout risk, vendor reliability, demand signals, all out of the box.
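The "attention masks derived from schema topology" idea can be illustrated with a toy mask builder: a cell may attend to itself and to cells linked by a PK/FK edge. This is a simplification of whatever the actual architecture does, using hypothetical column names:

```python
def attention_mask(columns: list[str],
                   fk_edges: set[tuple[str, str]]) -> list[list[bool]]:
    """Build a symmetric attention mask from schema topology: position
    (i, j) is True iff column i may attend to column j."""
    idx = {c: i for i, c in enumerate(columns)}
    n = len(columns)
    mask = [[i == j for j in range(n)] for i in range(n)]  # self-attention
    for a, b in fk_edges:
        mask[idx[a]][idx[b]] = mask[idx[b]][idx[a]] = True
    return mask

cols = ["invoice.id", "payment.invoice_id", "payment.amount"]
mask = attention_mask(cols, {("invoice.id", "payment.invoice_id")})
# The FK pair attend to each other; payment.amount attends only to itself.
```

Because the mask is derived from the schema rather than from field names, the same mechanism applies to any tenant's schema, which is what schema transfer requires.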
[10]

Enterprise World Modeling

4/10

The Large Enterprise System Model. A dynamics model that answers 'what happens next?' and 'what happens if?' by simulating the future state of the enterprise.

Why it's hard
* No benchmark exists. There is no standard evaluation framework for enterprise dynamics prediction or process simulation.
* Event sequence complexity. Enterprise processes involve branching, parallel paths, exception handling, and cross-entity interactions.
* Neuro-symbolic fusion. The model must be regularized by business constraints: accounting identities, inventory conservation, process ordering rules.
* Long-horizon rollout. Simulating 30/60/90 days of operations compounds errors through interconnected processes.
Oracle can simulate within your four walls. CENTARI can simulate across your entire supply chain. 'If we change our reorder point to 500 units, what happens to fill rate?' No ERP on earth can answer that today.
[11]

Neuro-Symbolic Fusion

5/10

DDM as physics. Encoding business logic as machine-readable symbolic constraints that regularize neural training and guarantee constraint satisfaction at inference.

Why it's hard
* Differentiable constraint enforcement. Constraints must provide gradient signals during training AND hard enforcement at inference.
* Constraint diversity. Business rules range from simple (field X must be positive) to arbitrarily complex multi-condition approval chains.
* Customer-specific constraints. Every tenant has different business rules. The constraint layer is dynamically composed per tenant.
This is what makes CENTARI predictions trustworthy. Without symbolic constraints, predictions can violate business rules. With DDM as physics, predictions are guaranteed to be feasible: the auditability layer enterprise buyers require.
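The dual role of a constraint (soft penalty during training, hard check at inference) can be sketched with one of the examples from the text, inventory conservation. The function names are illustrative:

```python
def conservation_violation(opening: float, received: float,
                           shipped: float, closing: float) -> float:
    """Inventory conservation: opening + received - shipped == closing.
    The absolute residual can serve as a training penalty (it is zero
    exactly when the identity holds); at inference the same residual
    backs a hard feasibility check."""
    return abs(opening + received - shipped - closing)

def is_feasible(prediction: dict, tolerance: float = 1e-6) -> bool:
    """Hard enforcement: reject any prediction that breaks the identity."""
    return conservation_violation(**prediction) <= tolerance

pred = {"opening": 100.0, "received": 40.0, "shipped": 30.0, "closing": 110.0}
# 100 + 40 - 30 == 110, so the residual is 0.0 and the prediction is feasible.
```

In a real training loop the residual would be computed on tensors so gradients flow through it; the point of the sketch is only that one symbolic rule yields both the regularizer and the guarantee.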
[12]

Cross-Tenant Transfer

4/10

The schema diversity moat. Foundation models that improve with every new customer because each tenant's ZSL schema contributes relational patterns the model has never seen.

Why it's hard
* Schema alignment. Different tenants model the same real-world concepts differently. The model must recognize structural similarity despite surface differences.
* Privacy boundaries. Tenant A's data can't leak into Tenant B's predictions. Federated and DP approaches are active research, none mature.
* Scaling law uncertainty. No Chinchilla-style compute-optimal curves exist for relational foundation models.
This is the data network effect. Every customer makes the model better for every other customer. It takes years of production deployments across diverse industries to accumulate, and a startup training on synthetic data can't replicate it.
[13]

Human-Agent Parity

6/10

The Centaur Runtime. AI agents operate through the exact same interfaces, constraints, permissions, and audit trails as human users. Same path, same rules, same accountability.

Why it's hard
* Agent harness design. Typed function call interfaces auto-generated from ZSL, constrained by the same permission model that applies to humans.
* Explanation and reversibility. Agent actions need rationale (grounded in predictions and simulations) and must be reversible through standard workflows.
* Trust calibration. The business defines trust boundaries: what agents do independently, what requires human approval, what is off-limits.
* Self-extending. When new ZSL declarations create new entity types, the agent automatically gains the ability to operate on them.
The entire value of CENTARI collapses if predictions can't become actions. Palantir can predict but can't act (wrong layer). Oracle can act but can't predict (wrong architecture). DOSS does both through the same typed runtime.
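"Same path, same rules" boils down to one authorization check that never branches on whether the actor is a person or an agent. A minimal sketch with hypothetical roles and policies:

```python
def authorize(actor: dict, action: str, entity: str,
              policy: dict[tuple[str, str], set[str]]) -> bool:
    """One permission check for humans and agents alike: the policy is
    keyed by (action, entity) and lists the roles allowed. The check
    never inspects what kind of actor is asking."""
    return actor["role"] in policy.get((action, entity), set())

policy = {
    ("approve", "purchase_order"): {"manager"},
    ("read", "purchase_order"): {"manager", "clerk", "agent"},
}

human = {"kind": "human", "role": "clerk"}
agent = {"kind": "agent", "role": "agent"}
# Both can read; neither can approve. `kind` never enters the decision.
```

Because the agent passes through the same check, the same audit trail, and the same reversibility machinery as a human user, granting an agent a capability is a policy change, not an engineering project.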
[14]

Self-Building Applications

7/10

Given a ZSL declaration, the platform generates the complete application: data model, UI, dashboards, APIs, workflows, validation, agent harnesses. A new module is a file, not a 6-month project.

Why it's hard
* UI generation quality. The gap between 'technically correct CRUD form' and 'application humans want to use' is enormous.
* Workflow composition. New workflows must compose with existing ones. Composition rules must be derivable from ZSL relationships.
* Incremental evolution. Schema changes must update generated applications without breaking existing customizations.
This is what makes DOSS scalable. If every new module requires 6 months of custom engineering, you can't build 'multiple companies worth of products into one platform' fast enough. ZSL-to-application generation is the multiplier.
[15]

Network Intelligence

3/10

EDI-NET. A data network that creates cross-entity, cross-company intelligence. Every trading partner adds nodes and edges to the CENTARI graph, enabling supply chain simulation across company boundaries.

Why it's hard
* Network bootstrapping. The chicken-and-egg problem: retailers won't join without suppliers, suppliers won't join without retailers.
* Data normalization. EDI is a format from the 1970s. Different trading partners use different standards, qualifiers, and interpretations.
* Cross-entity privacy. Network intelligence must provide cross-network insights without leaking individual transaction data.
* Value capture. Free connectivity gets partners in the door. Real value requires enough network density to be meaningful.
EDI-NET is the moat that grows with the network. It's also what makes CENTARI's supply chain simulations cross-company rather than single-company. A qualitatively different product.
Closing

The compound bet

Every hard problem on this list is individually daunting. Most companies would pick one and spend a decade on it. Oracle spent forty years on #1 and #2 and never got to #8-15. Palantir started at #9 and #10 and never built #1-7.

The compound bet is that solving them together, on the same substrate, with the same declarative source of truth, creates something that is qualitatively different from solving them independently. The system of record feeds the foundation models. The foundation models inform the orchestration layer. The orchestration layer acts through the system of action. The system of schema makes the whole thing self-extending. The applications make it usable by humans and agents alike.

That's not an incremental improvement to enterprise software. It's a new kind of thing. And if it works, it replaces Oracle and SAP not by being better at their game, but by playing a different game entirely.