A camera-first onboarding tool that turned a retailer’s stockroom into a published storefront —
barcode, then image, then video recognition over a 60M+ SKU canonical catalog.
Daniel Manzela · Active Jan 2020 — Apr 2024
Co-founded with Dr. Eli Osherovich, formerly Senior Applied Scientist at Amazon
(Alexa Shopping / Amazon Go), now leading AI & infrastructure at Google.
Bootstrapped four years with no external capital.
~50
Products / hour onboarding target
3
Computer Vision generations shipped
60M+
Unique SKUs canonicalized
Field test · in-store first scan · May 2022
Market Context
The Onboarding Bottleneck
Physical retailers need an online presence but lack the technical skills.
Manual product upload is the single largest barrier to eCommerce adoption.
storefront
Skill gap
Brick-and-mortar retailers operate without in-house digital teams.
Photographing, captioning, categorizing, and listing products is a workflow they have neither
time nor expertise to perform.
inventory_2
Manual upload cost
~3 minutes per product × 50 products = 2.5 hours per store
before a single photograph is taken. The rate-limiting step between a physical
retailer and a working online storefront.
alt_route
No frictionless path
No existing solution generated a complete product catalog from the camera alone.
Every alternative still required typing, tagging, and image curation by hand.
Product Versions
Three Generations of Vision
The product evolved through three distinct computer-vision generations,
each removing a layer of friction from the seller’s upload workflow.
VERSION 1.0
Barcode → auto-upload
Barcode + Database
A PWA-side barcode scanner powered by Google ML Kit
(with zxing-cpp-emscripten as fallback).
Each scan is enriched against the Global Product Database and pushed to a vendor's WooCommerce store
via the WC Marketplace REST API. The target: onboard a store plus its first 50 products in under one hour.
qr_code_scanner
Scanner: Google ML Kit, on-device · zxing-cpp-emscripten fallback
database
Auto-fill: Global Product Database lookup per UPC / EAN
inventory
Inventory: bulk upload + per-product overrides
timer
Target: store onboard + 50 products in < 1 hour
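The V1 flow above — scan, enrich from the Global Product Database, publish to WooCommerce — can be sketched as a mapping function. A minimal sketch, assuming an illustrative `GpdRecord` shape and lookup; the output fields (`name`, `sku`, `regular_price`, `categories`, `images`) follow the standard WooCommerce REST v3 product schema.

```typescript
// Sketch of the V1 pipeline: a scanned barcode is enriched from the
// Global Product Database (GPD) and turned into a WooCommerce REST
// product payload, ready to POST via the WC Marketplace REST API.
// The GpdRecord shape is an assumption for illustration.

interface GpdRecord {
  barcode: string;          // UPC / EAN as scanned
  name: string;
  brand?: string;
  description?: string;
  category?: string;
  imageUrl?: string;
}

// Vendor-side values layered on the canonical record at upload time.
interface VendorOverrides {
  price: string;            // WC expects prices as strings
  stockQuantity?: number;
}

interface WooProductPayload {
  name: string;
  type: "simple";
  sku: string;
  regular_price: string;
  description: string;
  categories: { name: string }[];
  images: { src: string }[];
}

function mapGpdToWooProduct(rec: GpdRecord, v: VendorOverrides): WooProductPayload {
  return {
    name: rec.brand ? `${rec.brand} ${rec.name}` : rec.name,
    type: "simple",
    sku: rec.barcode,               // the barcode doubles as the vendor SKU
    regular_price: v.price,
    description: rec.description ?? "",
    categories: rec.category ? [{ name: rec.category }] : [],
    images: rec.imageUrl ? [{ src: rec.imageUrl }] : [],
  };
}
```

With the record pre-filled, the vendor only confirms price and stock before publish, which is where the sub-hour onboarding target came from.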
Platform foundations shipped alongside V1 — sign-up, orders, marketing, settings,
WooCommerce integration — and were inherited by every later version.
See Architecture.
VERSION 2.0
Image recognition
Image + OCR
For products without scannable barcodes, a single photo became the input.
Vision-AI OCR extracted on-package text, generated descriptions, tags, categories, and attributes,
and a Computer Vision foundation-model API cleaned up product photography by removing backgrounds.
VERSION 3.0
Video recognition
Video + RAG
The seller pans the camera across a shelf; frame-by-frame recognition retrieves matching SKUs
from a dedicated RAG model over the Global Product Database.
Attribute extraction expanded with synonym support, and the dataset was scraped at scale per country, region, and language.
videocam
Input: video stream — frame-by-frame product recognition
hub
Retrieval: dedicated RAG over Global Product Database
manage_search
Attributes: enhanced extractor with synonym support
language
Coverage: per country / region / language scraping
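The frame-by-frame retrieval idea can be sketched with a toy nearest-neighbor match: each frame embedding is compared against precomputed catalog embeddings by cosine similarity, and hits are deduplicated across frames so one shelf pan yields one result per SKU. This is a minimal sketch of the retrieval pattern, not the production RAG model; the threshold and embedding shapes are assumptions.

```typescript
// Toy frame-to-catalog matcher: cosine similarity over embeddings,
// with cross-frame deduplication so a continuous shelf pan does not
// produce duplicate SKU hits.

type Vec = number[];

function cosine(a: Vec, b: Vec): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface CatalogEntry { sku: string; embedding: Vec; }

function matchFrames(
  frames: Vec[],
  catalog: CatalogEntry[],
  threshold = 0.9,        // illustrative cutoff, not a tuned value
): string[] {
  const seen = new Set<string>();
  for (const frame of frames) {
    let best: CatalogEntry | null = null;
    let bestScore = threshold;       // only accept matches above the cutoff
    for (const entry of catalog) {
      const s = cosine(frame, entry.embedding);
      if (s >= bestScore) { bestScore = s; best = entry; }
    }
    if (best) seen.add(best.sku);    // Set dedupes repeat hits across frames
  }
  return [...seen];
}
```

The production version retrieved candidates from an indexed store rather than a linear scan, but the constraint is the same one described above: every match must resolve to a record already in the catalog.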
Data Infrastructure
The Global Product Database
We did not ask the model to invent products. We asked it to find them.
Every Computer Vision generation was constrained by a record already in the catalog —
barcode, photo, and video became three different ways into the same retrieval layer.
Vision & Mission
One catalog, every retailer
A single canonical product record per SKU, enriched once and reused across every vendor on the platform.
A small bodega and a regional chain receive the same structured data the moment they scan a product.
Ingestion stack
From scrapers to multilingual RAG
Shipped Web scraping — UPC / BarcodeLookup / image cropping
Shipped Bright Data — managed proxy & structured ingestion
Planned Multilingual RAG — per-region language coverage
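Canonicalizing one record per SKU means the same physical product scanned as a 12-digit UPC-A and as a 13-digit EAN-13 must collapse to one key. A minimal sketch of that normalization, assuming EAN-13 as the canonical key format; the check-digit math is the standard GS1 algorithm, and the zero-prefix rule reflects that UPC-A is a subset of EAN-13.

```typescript
// Normalize a scanned UPC-A / EAN-13 code to a canonical EAN-13 key,
// validating the GS1 check digit, so ingestion from different sources
// dedupes to a single catalog record.

function ean13CheckDigit(first12: string): number {
  let sum = 0;
  for (let i = 0; i < 12; i++) {
    const d = first12.charCodeAt(i) - 48;
    sum += i % 2 === 0 ? d : d * 3;   // 1-indexed odd positions ×1, even ×3
  }
  return (10 - (sum % 10)) % 10;
}

// Returns the canonical 13-digit key, or null if the code is invalid.
function canonicalEan13(code: string): string | null {
  if (!/^\d{12,13}$/.test(code)) return null;
  const ean = code.length === 12 ? "0" + code : code; // UPC-A ⊂ EAN-13
  const check = ean13CheckDigit(ean.slice(0, 12));
  return check === ean.charCodeAt(12) - 48 ? ean : null;
}
```

Rejecting bad check digits at ingestion keeps scraper noise out of the canonical layer, which matters when the same catalog feeds every vendor.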
System Design
Technical Architecture
A Progressive Web App over a WooCommerce-vendor backend,
with a two-database split and a vision pipeline that swapped its front-end three times
while the retrieval layer stayed stable.
AWS compute · MongoDB (two DBs) · React PWA · WC Marketplace REST API · WCMp Vendors API · Google ML Kit · Mixpanel · GA4 · CI / CD
devices
PWA topology
Single React-based PWA installable across iOS, Android, and desktop. Camera, push, and offline
capabilities without an app-store dependency — updates rolled out the moment a seller refreshed.
RTL + Hebrew — Google Auto-Translate for the launch market
Two-database split
The architectural backbone. The catalog and the storefront were never the same thing — one
is shared across vendors, the other is private to each vendor.
Global Product Database — shared, canonical, retrieval-indexed (60M+ SKUs)
Store Product Database — vendor-scoped inventory, pricing, overrides
Integration — WC Marketplace REST API · WCMp Vendors auth
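The split above implies a merge step: the listing a shopper sees is the shared canonical record with the vendor's scoped values layered on top. A minimal sketch of that merge; the field names are illustrative, but the ownership boundary (GPD owns content, the store DB owns price, stock, and overrides) is the one described above.

```typescript
// Two-database split: one shared canonical record per SKU (GPD) plus a
// thin vendor-scoped record carrying only what differs per store. The
// published listing is the canonical record with overrides applied.

interface CanonicalProduct {        // Global Product Database (shared)
  ean: string;
  name: string;
  description: string;
  imageUrl: string;
}

interface StoreProduct {            // Store Product Database (per vendor)
  vendorId: string;
  ean: string;
  price: number;
  stock: number;
  nameOverride?: string;            // vendors can rename, but rarely do
}

function buildListing(canonical: CanonicalProduct, store: StoreProduct) {
  return {
    ...canonical,
    name: store.nameOverride ?? canonical.name,  // override wins when set
    price: store.price,
    stock: store.stock,
    vendorId: store.vendorId,
  };
}
```

Because enrichment lives only in the canonical record, fixing a photo or description once propagates to every vendor selling that SKU.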
visibility
Vision stack
The product replaced its CV front-end three times. The retrieval layer behind it stayed stable —
which is why a four-year stack didn’t accumulate as much technical debt as it could have.
V1 — ML Kit on-device barcode scanning + GPD lookup
V2 — single-photo OCR + foundation-model background cleanup + GPD lookup
V3 — frame-by-frame video recognition + RAG retrieval over the GPD
Raw field documentation from physical-store testing —
the seller, the camera, the product, and the upload happening in real time.
Captured Session · Barcode Scanner
Barcode scanning workflow
Raw testing session: scanning physical barcodes, retrieving product data from the GPD,
auto-filling and publishing to the storefront.
What you’ll see
A handheld camera centers a product barcode; the app reads it on-device,
retrieves the matching record from the Global Product Database, and pre-fills
the listing form ready for the vendor to confirm and publish to WooCommerce.
Captured Session · Computer Vision
Background removal & image normalization
Testing the Computer Vision pipeline: transforming raw smartphone photos into
commercial-grade product images with automatic background removal.
What you’ll see
A product photographed against a cluttered shop-floor background; the foundation-model
cleanup pipeline removes the background and normalizes the image to a uniform
commerce-ready presentation suitable for direct publish.
Field testing · May 2022
End-user usability session
Watching a real seller perform their first scan on the floor of their own store.
Field testing · May 2022
On-floor co-discovery
Co-discovery work alongside the retailer — observing where the camera workflow broke.
Metrics
Traction & KPIs
Bootstrapped to $10K MRR from SMB customers
over four years. Instrumented from day one against the HEART framework — see Pivot for
why the revenue line stopped scaling and what we did about it.
4 yrs
Bootstrapped, no external capital
~50
Products / hour onboarding target
$10K
MRR at plateau, from SMB customers (see Pivot)
HEART Framework
Five axes wired to the roadmap
Happiness — in-app survey + NPS after onboarding
Engagement — scans / session, products / week
Adoption — first scan, first published product, first sale
Retention — week-2 / week-4 / month-3 cohorts
Task Success — completed-upload rate per session
Cohort, NPS, and task-success figures held under prior NDA.
Instrumentation was production-grade — Mixpanel events with a HEART-aligned schema, GA4 for acquisition,
and a roadmap that took a backlog item only when a HEART signal moved.
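A HEART-aligned event schema, in practice, means every tracked event carries the axis it feeds so dashboards group by axis rather than raw event name. A minimal sketch of that shape; the event and property names here are illustrative, not the production Mixpanel schema.

```typescript
// HEART-aligned event envelope: the axis travels with every event,
// e.g. as a `heart_axis` property on the tracked payload, so the five
// axes can be read straight off the analytics tool.

type HeartAxis =
  | "happiness"
  | "engagement"
  | "adoption"
  | "retention"
  | "task_success";

interface HeartEvent {
  name: string;
  axis: HeartAxis;
  props: Record<string, string | number | boolean>;
}

function heartEvent(
  name: string,
  axis: HeartAxis,
  props: Record<string, string | number | boolean> = {},
): HeartEvent {
  // In production this payload would go to mixpanel.track(name, props).
  return { name, axis, props: { ...props, heart_axis: axis } };
}

// Example: two of the milestones named above.
const firstScan = heartEvent("first_scan", "adoption", { method: "barcode" });
const uploadDone = heartEvent("upload_completed", "task_success", { products: 12 });
```

Tagging at the event level is what made the "take a backlog item only when a HEART signal moved" rule enforceable: each roadmap item could be checked against a single axis query.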
Backlog at deprecation
What customer pain told us next
· Online payment integration
· Last-mile delivery integration
· Point-of-sale (POS) sync
· Weight-based products (deli, produce, bulk)
Each pain item entered the backlog with the user interview attached.
Roadmap priority was a function of frequency × revenue impact, not vendor preference.
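The frequency × revenue-impact rule can be written down as a tiny scoring function. Only the rule itself comes from the process described above; the item list and numbers are illustrative.

```typescript
// Backlog prioritization as stated: score = frequency × revenue impact.
// Frequency is how many user interviews raised the pain; impact is the
// estimated revenue effect of shipping the fix.

interface BacklogItem {
  title: string;
  interviewCount: number;        // interviews where the pain came up
  monthlyRevenueImpact: number;  // estimated $/month if shipped
}

const score = (i: BacklogItem) => i.interviewCount * i.monthlyRevenueImpact;

// Returns a new array sorted highest-priority first; input is untouched.
function prioritize(items: BacklogItem[]): BacklogItem[] {
  return [...items].sort((a, b) => score(b) - score(a));
}
```

The point of the formula is what it excludes: vendor preference is not a term, so a loudly requested feature with thin revenue impact sorts below a quietly recurring one.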
We shipped the pivot instead of these.
Strategic Decision
Why We Pivoted
Field data revealed an incentive problem the product could not engineer around —
so we changed the customer.
The friction we couldn’t code away
Retailers and their on-ground employees were not incentivized to perform the setup themselves.
Onboarding stalled. Production roll-out kept slipping past the moment the product was ready.
The discovery
Big-box retailers — operating eCom sites without local visibility — turned out to be the right buyer
for a frictionless, automated, large-scale eCom-per-location setup. The need was upstream of the seller.
The Covid-19 window
The pandemic was the right macro window for digital transformation in retail —
and it was precisely the window that exposed the human-on-the-ground friction at scale.
The outcome
Pivoted from SMB self-service to enterprise frictionless solutions. That move directly seeded the
Autonomous Content Pipeline —
and the open-source spinout —
same insight, different blast radius.
Two-stage wind-down. April 2024 — active Seller App development ended; the team rolled to enterprise.
The Computer Vision stack and the Global Product Database carried forward as feedstock for the next product generation.
Reflection
What I’d Do Differently
Three concrete things, named honestly. Not platitudes — the operational
detail senior recruiters and operators recognize.
01
Confused a sales / ops problem for a UX problem.
Seller incentive friction looked like “the onboarding flow is too hard”
for nine months longer than it should have. The fix was not another camera generation —
it was a field-ops person who could sit next to the seller and earn the first 50 SKUs.
I’d hire that person before shipping V2.
02
Made the right architectural bet for the wrong reason.
Building the GPD as a retrieval layer (not as model fine-tuning data) felt like
defensive engineering at the time. It turned out to be the only piece of the stack
that survived the pivot intact and is the load-bearing pattern of every product since.
I’d commit to retrieval-grounded CV from day one on the next zero-to-one.
03
Let the HEART framework decide too late.
We instrumented the funnel correctly from week one and then let revenue conversations
override what HEART signals were saying for a full quarter. When the retention cohort
and the MRR line disagree, the cohort is right earlier. I’d put a forcing function
on that — a monthly review where the cohort number has veto power over the roadmap.