Three systems.
One thesis: deterministic gates first.

Active 2024–present
Business outcomes — TNG Shopper, 2024–present

Traction

Commercial proof for the autonomous content pipeline. 11 paying enterprise retailers across 5 countries (currently live: Spain, Portugal, Israel; historical: United States, Mexico). Raised, signed, and shipped within the first 18 months.

$500K

Raised

From top-tier angel investors. Founders-First investor base.

$118K / $223K

Collected ARR / Booked ARR

Cash-realized revenue and signed contracts from enterprise retail clients.

$1.53M

Contracted ARR pipeline

From existing live and signed enterprise clients. Conditional roll-out as deployments expand to all client locations.

Recognition

Named to top-100 retail-tech startups of 2025. Independent industry recognition for the autonomous content pipeline + multi-agent architecture.

Revenue model: clients begin with proof-of-concept on one location, expand to all locations once SEO/AEO compliance and indexing performance are validated. The Contracted ARR figure represents the full-rollout commitment from already-signed enterprise customers.

Case Study 01

Autonomous multi-agent DAG pipeline

Active 2024–present 11 enterprise tenants 5 countries

A 7-node Directed Acyclic Graph generating localized product content for 11 enterprise clients across 5 countries. Human-supervised autonomous execution. Deterministic validation fires first; the LLM activates only if the gate passes.

Problem

Enterprise clients required localized, SEO-compliant product pages across dozens of cities. Manual creation is infeasible at ~10.5M product-location combinations per cycle.

Approach

A fail-closed 7-node DAG where every generative node is constrained by a dedicated RAG policy index (Google SPAM policies + E-E-A-T standards). 80% of compute is deterministic; only 20% is probabilistic. Every node boundary is audited by a JIT integrity layer.

Result

260K+ pages

400K+ impressions across 3 client properties, 68.9% average pass rate per boundary, $0.0006/PDP end-to-end. 234 managed websites across 5 countries.

Why this exists

Multi-location retailers need unique, localized product pages for every item in every city they serve. At enterprise scale, this means millions to billions of pages — each requiring localized copy, structured data, and policy-compliant content. Manual creation doesn’t get close.

100K SKUs × 500 locations = 50,000,000 pages
A mid-size retailer. Hypermarkets need billions.
Scale

Validated with SEO leadership at retailers operating 10–10,000+ store locations. The content gap is six orders of magnitude beyond manual capacity.
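The arithmetic behind the gap is simple to check. The manual-throughput figure (10 pages per writer-day) is an assumption for illustration only:

```python
import math

skus, locations = 100_000, 500
pages = skus * locations                    # 50,000,000 pages per cycle
writer_pages_per_day = 10                   # assumed manual throughput
gap = pages / writer_pages_per_day          # ~5,000,000 writer-days of work
orders_of_magnitude = int(math.log10(gap))  # six orders beyond one writer's day
```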

Zero friction

No CMS integration. No developer hours. No API setup from the client. Operates as an external, headless discovery layer that activates with a single switch.

Compliance-first

Every node boundary is audited by a deterministic quality gate. Content below threshold is rejected before reaching production — fail-closed by design.

Verified evidence — Google Search Console

Aggregate results across 3 active client properties — verified via Google Search Console.

Enterprise Client A — Major European DIY Chain (Spain)
Google Search Console page indexing chart for Client A: 99.9K pages indexed in ~60 days
Page indexing
99.9K pages indexed

Hockey-stick from 0 → 99.9K in ~60 days. Autonomous pipeline output.

Google Search Console performance chart for Client A: 54.4K impressions and 614 clicks
Search performance
54.4K impressions · 614 clicks

1.1% CTR, avg position 10.9. Zero manual content creation.

Enterprise Client B — Largest Supermarket Chain (Portugal)
Google Search Console page indexing chart for Client B: 130K pages indexed
Page indexing
130K pages indexed

Steep growth curve from 0 → 130K. Largest property by page volume.

Google Search Console performance chart for Client B: 245K impressions and 2.19K clicks
Search performance
245K impressions · 2.19K clicks

0.9% CTR, avg position 8.7. Highest impression volume.

Enterprise Client C — Leading Discount Retailer (Spain)
Google Search Console page indexing chart for Client C: 30.2K pages indexed
Page indexing
30.2K pages indexed

Rapid ramp from 0 → 30.2K. Newest property in pipeline.

Google Search Console performance chart for Client C: 101K impressions and 2.33K clicks
Search performance
101K impressions · 2.33K clicks

2.3% CTR, avg position 8.7. Highest-converting property.

Live demo

End-to-end pipeline experience — the self-serve interface for autonomous content generation with human-supervised quality gates

Engineering trade-offs

Quality vs. cost

O-R-A-V (Observe→Reason→Act→Validate) deterministic validation engine. Node 6 runs zero-LLM rule-based checks; Node 7 (DEMAS JIT) runs SLM-as-judge evaluation inline within content generation.

Decision: deterministic + SLM validation, 68.9% average pass rate per boundary.
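The two-stage split can be sketched as follows: deterministic rule checks run first and only survivors reach the judge. This is a hedged sketch, not the production engine; `rule_checks` and `slm_judge` are illustrative stand-ins for Node 6 and Node 7:

```python
def rule_checks(text: str) -> bool:
    """Node 6 analogue: deterministic, zero-LLM boundary audit."""
    return bool(text.strip()) and len(text) < 5_000 and "{{" not in text

def validate(text: str, slm_judge) -> bool:
    """Cheap deterministic gate first; SLM-as-judge only on survivors."""
    if not rule_checks(text):
        return False
    return slm_judge(text) >= 0.8   # assumed acceptance threshold
```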

Safety vs. throughput

Fail-closed design means ~31% of content is rejected. Intentional — no partial or low-quality content ever propagates downstream to production surfaces.

Decision: fail-closed at every boundary, zero tolerance.

Self-improvement

RLAIF data flywheel. DPO preference pairs generated from Node 6/7 evaluation signals for continuous model alignment.

Decision: closed-loop flywheel, autonomous preference pairs.
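One way to turn boundary pass/fail signals into DPO training data is to pair a passing and a failing generation for the same prompt. A minimal sketch, assuming simple record fields (`prompt`, `output`, `passed`), not the flywheel's actual schema:

```python
def to_dpo_pairs(records):
    """Build chosen/rejected preference pairs from evaluation signals."""
    passed, failed = {}, {}
    for r in records:
        bucket = passed if r["passed"] else failed
        bucket.setdefault(r["prompt"], r["output"])   # first example per prompt
    return [
        {"prompt": p, "chosen": passed[p], "rejected": failed[p]}
        for p in passed.keys() & failed.keys()        # prompts with both outcomes
    ]
```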

Scale vs. locality

5 countries require different linguistic, cultural, and regulatory context. Per-locale context injection increases prompt engineering complexity but eliminates fine-tuning per market.

Decision: context-first architecture, zero-shot locale adaptation.
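Context-first locale adaptation amounts to injecting a per-market block into every prompt instead of fine-tuning per market. The locale entries below are illustrative assumptions, not the production registry:

```python
LOCALE_CONTEXT = {
    "es-ES": "Write in European Spanish. Prices in EUR, metric units.",
    "pt-PT": "Write in European Portuguese. Prices in EUR, metric units.",
    "he-IL": "Write in Hebrew, right-to-left. Prices in ILS.",
}

def build_prompt(product: str, locale: str) -> str:
    ctx = LOCALE_CONTEXT[locale]   # fail loudly on unsupported markets
    return f"{ctx}\n\nDescribe the product: {product}"
```

The trade-off named above is visible here: the prompt grows per locale, but the same zero-shot model serves every market.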

Python Google ADK Vertex AI Langfuse

Running deterministic gates before probabilistic agents is non-negotiable. 80% of every cycle is rule-based; only 20% is inference. That ratio is the difference between an autonomous system and an unsupervised one.

The pipeline’s writer node runs Gemma 4 26B-A4B MoE on a single A100. Standing that up took 30+ deployment iterations and 20 named failure modes. Case Study 02 documents every one.

Case Study 02

Gemma 4 MoE — Vertex AI deployment forensics

March–April 2025 16+ hours · 30+ versions 20 failure modes

Deployed Google’s Gemma 4 26B-A4B-it Mixture-of-Experts model on Vertex AI with vLLM. Documented 20 distinct failure modes across 30+ deployment iterations in a public forensic runbook.

Problem

Gemma 4 MoE was newly released with no production deployment guides. Required custom vLLM builds, GCSFUSE container configs, and careful chat template engineering.

Approach

Systematic forensic debugging across 2 deployment cycles (16+ hours, 30+ versions). Every crash, OOM, and dependency conflict documented with root-cause analysis.

Result

20 failure modes

Forensically documented across 30+ iterations — evidence of the engineering diligence behind stable, cost-effective MoE inference. Currently running a stable base version while advancing toward the full PRD: quality, cost efficiency, intelligence, and reliability.

Deployment timeline

v1 – v10
Container bootstrap failures

GCSFUSE mount crashes, OOM on model loading, vLLM version incompatibilities. Root cause: custom Google vLLM build required specific CUDA/PyTorch matrix.

v11 – v20
Chat template & tokenizer issues

Jinja template rendering errors, special token misalignment, multi-turn conversation state corruption. Required custom chat_template.jinja authoring.

v21 – v30
Stable base inference

Working configuration locked: vLLM 0.17.2rc1.dev133 + PagedAttention + custom entrypoint.sh. Base inference stable; advancing toward full PRD.

Cycle 2
LoRA enablement — blocked

Attempted multi-LoRA serving. Identified an unresolved dependency triangle between the custom vLLM build, PEFT, and the model architecture. Documented as a community call-to-action.

vLLM PagedAttention Vertex AI GCSFUSE

Forensic documentation outperforms hagiography. 20 named failure modes with reproduction steps are more useful to the next engineer — and more honest about what shipping a frontier MoE actually costs — than a clean “we did it” narrative.

Both systems run autonomous AI agents in production. Both need a governance kernel: cost guards, policy-as-code, deterministic state. Case Study 03 is that kernel.

Case Study 03

Antigravity-OS — AI agent governance kernel

Active 2025–present 8 core modules Apache-2.0

An open-source governance kernel for AI agents that enforces cost budgets, policy-as-code constraints, deterministic state tracking, and self-healing CI. The operational backbone for autonomous AI development environments.

Problem

AI coding agents operate without cost guardrails, state persistence, or policy enforcement. Runaway token costs and context-window rot are standard failure modes.

Approach

A governance kernel with cost enforcement (per-session and cumulative budgets), policy-as-code rules, and deterministic state tracking. MCP integrations provide tool access for managed agents.
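The per-session budget enforcement can be sketched as a fail-closed cost guard that raises before the call that would overrun the budget. Class names and the per-token price are assumptions, not the kernel's actual API:

```python
class BudgetExceeded(RuntimeError):
    pass

class CostGuard:
    """Per-session budget enforcement; fail closed before overspend."""

    def __init__(self, session_budget_usd: float):
        self.budget = session_budget_usd
        self.spent = 0.0

    def charge(self, tokens: int, usd_per_1k: float = 0.002) -> None:
        # Raise *before* the agent call that would exceed the budget,
        # so a runaway loop stops at the boundary, not after the bill.
        cost = tokens / 1000 * usd_per_1k
        if self.spent + cost > self.budget:
            raise BudgetExceeded(f"{self.spent + cost:.4f} USD > {self.budget} USD")
        self.spent += cost
```

A cumulative budget works the same way with the counter persisted across sessions (e.g. in Firestore, per the stack below).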

Result

8 core modules

Provider-agnostic governance kernel. 6 plugin domains, 7 governance frameworks. Open-source on GitHub.

Python MCP Firestore GitHub Actions

Cost guards and policy-as-code are the seatbelts of agent infrastructure. They tax the happy path. They prevent runaway spend on the only path that matters.

The same shape recurs across the three: a deterministic spine wrapping a probabilistic engine. A pipeline whose generative node fires only after seven gates. A model whose stable inference path was extracted from twenty failures. A governance kernel whose budgets and policies precede every agent call.

The thesis is the design.