CatalogAI is a governed AI content pipeline for multi-brand retailers with tens of thousands of listings to maintain. Drop in a supplier spec sheet or paste a product row — the system writes title, short description, long description, and keywords in Thai and English, locked to the brand voice your team has agreed on. Human QA stays in the loop. Nothing ships without a human approving it.
Somewhere between 2023 and 2025, ChatGPT went from a novelty to a tab that is always open on the catalog editor's second monitor. Nobody decided this. Nobody approved it. It happened because the alternative — writing a usable Thai product description from a supplier's garbled ten-word input — takes ten minutes per SKU, and the backlog is forty thousand.
A multi-brand retailer with four banners and 110,000 SKUs across grocery, electronics, home, and construction receives supplier data daily: spec PDFs in unstructured layouts, Excel rows with inconsistent column names, product photos with a cluttered warehouse background. The catalog team has to turn all of it into Thai and English listings, in four different brand voices, before the product can go live.
They know what the existing options cost. A mid-sized retailer spends roughly $500,000 to $1.5M per year on manual listing content — freelance copywriters, in-house editors, translation agencies, agency briefings that take three rounds. The agency delivers copy that doesn't quite match the brand voice. The in-house editor rewrites it. The category manager reviews. It goes live a week late.
Meanwhile, the same retailer's third-party sellers on the open marketplaces have free AI listing tools, native to the platform, producing 100 listings a day each. A random marketplace seller has better AI tooling than the retailer's own brand team.
Staff are already pasting supplier data into ChatGPT. There is no voice profile, no approval queue, no audit trail. The warm-and-trustworthy grocery tone, the spec-forward electronics authority, the aspirational lifestyle register — they blur together into generic AI prose. The voice you spent years building is drifting, and nobody can see where.
Thai has no word boundaries, 44 consonants, 32 vowels, 5 tones, and a luxury-vs-practical register gap that is far wider than the English equivalent. A model trained predominantly on English product copy writes Thai that is either robotic, ungrammatical, or subtly off-tone in ways a Thai copywriter catches instantly and an audit log never surfaces.
PIM replacements, content workbench platforms, multi-tenant DXP products — all of them come with integration consultants, data migration plans, governance committees, and a six-figure year-one licence. The catalog team wants to ship this week. The vendor wants to schedule a discovery workshop.
"Our editor can write a usable product description in about ten minutes. She writes forty a day. We have forty thousand SKUs on that banner alone. I'll let you do the arithmetic."
Conversation with a category lead at a Thai multi-brand retailer · February 2026
CatalogAI replaces two things at once — the manual content team and the shadow ChatGPT tabs — with a single governed pipeline. The brand voice is configured by the people who own it, not by whoever is prompting today. Every generation follows that voice. Every piece of output passes through a human reviewer before anything is published. The audit trail holds. There is a web page the content editor uses. There is an open API the dev team can curl. There is a QA queue the senior reviewer lives in. There is no voice drift, because individual operators cannot override the voice.
Each brand voice is a structured specification — tone rules, register, sensory vocabulary, forbidden phrasings, reference sentences — stored once and injected into every generation. A content editor selects the voice from a dropdown; they cannot edit it at generation time. If the brand team updates the voice, every future generation reflects the update. This is the difference between a prompt and a policy.
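As a concrete sketch of the policy-not-prompt idea, a stored voice spec might look like the following. The field names and example values are illustrative assumptions, not the shipped schema.

```typescript
// Hypothetical shape of a stored brand-voice spec. Field names and values
// are illustrative, not the shipped schema. The point: this object lives in
// the database, owned by marketing, and is injected into every generation;
// editors only ever reference it by id.
interface BrandVoiceSpec {
  id: string;                    // e.g. "grocery-warm-v3"
  register: "warm" | "technical" | "aspirational" | "practical";
  toneRules: string[];           // e.g. "address the shopper directly"
  sensoryVocabulary: string[];   // e.g. "crisp", "fragrant", "just-picked"
  forbiddenPhrasings: string[];  // e.g. "world-class", "best-in-class"
  referenceSentences: string[];  // gold-standard lines the model imitates
  updatedBy: string;             // marketing owner, never the editor
  version: number;               // bumped on every approved revision
}
```

Because the spec is data rather than a prompt, versioning and sign-off apply to it the way they apply to any other governed artefact.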
Thai and English are generated in parallel from the same structured product data, not translated one from the other. Thai follows Thai SEO patterns — brand + product + key spec, tokenisation-aware keyword density, colloquial search variants. English follows English patterns. Neither is a word-for-word translation, which is why the Thai reads like something a Thai copywriter wrote.
Default: Google Gemini (Vertex AI, asia-southeast1 region — Singapore). Swap to NIPA Cloud AI for Thai data residency — required when catalog data touches regulated pricing, exclusive-supplier contracts, or pre-launch merchandise your legal team classifies as confidential. Same interface, different backend. The operator doesn't see the swap.
This isn't a content agency. It's the generation-and-governance layer that lets your existing catalog team ship four brand voices at a hundred times their current speed, without giving up the voice.
Paste, upload, or drop. The system doesn't care which.
The content editor has the supplier's material open. It might be a spec-sheet PDF from the electronics supplier, a row of a CSV the grocery merchandiser sent over, or a paragraph of barely-formatted text a category manager typed into chat. CatalogAI accepts all three from the same drop zone. For PDFs, Gemini 2.5 Pro reads the document directly — no separate OCR, no pre-processing. The intake service returns a normalised SKU in the standard schema: name in Thai, name in English, brand, category, key specs as a JSON object.
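For illustration, the normalised SKU might look like this in TypeScript (field names here are assumptions, not the exact contract):

```typescript
// Illustrative sketch of the normalised SKU the intake service returns.
// Field names are assumptions; the real schema may differ.
interface NormalisedSku {
  nameTh: string;                    // product name in Thai
  nameEn: string;                    // product name in English
  brand: string;
  category: string;
  keySpecs: Record<string, string>;  // free-form spec key/value pairs
  lowConfidenceFields: string[];     // fields flagged for human check
}
```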
Four voices in the MVP. Each with its own tone spec. The editor picks one, not writes one.
A dropdown at the top of the page exposes the configured voices. For the pilot we ship four pre-configured archetypes — grocery (warm, food-expertise), electronics (technical, spec-forward), lifestyle (aspirational, elegant), construction (practical, builder-friendly) — as the baseline shape of the voice library. The full spec adds a brand-voice admin UI for marketing teams to author and refine their own voices; that arrives in the first post-pilot milestone. Each voice is a structured spec, not a prompt. The content editor has no way to reach in and change the system prompt.
Output streams in token by token. First Thai word in about a second. Full bilingual bundle in under fifteen seconds.
One button. Streaming output. The model generates Thai and English titles, short descriptions (50–80 words each), long descriptions (150–300 words each), and ten search keywords per language — in parallel, in the voice the editor selected, from the structured product object. Streaming is the perceived-speed win. A full generation takes eight to twelve seconds end-to-end, but the first Thai word appears on screen in under a second. The Vertex AI context-caching layer keeps the brand-voice system prompt and JSON-schema instructions warm, so repeat generations on the same voice pay model latency only on the parts that actually change.
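A minimal sketch of how a client could consume the stream, assuming the Server-Sent-Events endpoint listed later in the API table (/api/v0/generate/stream) accepts a JSON POST; payload field names are illustrative:

```typescript
// Sketch: stream one generation and hand tokens to the UI as they arrive.
// Assumes /api/v0/generate/stream accepts a JSON POST and emits SSE
// "data:" lines; payload field names are assumptions.
async function streamGeneration(
  sku: object,
  voiceId: string,
  onToken: (token: string) => void,
): Promise<void> {
  const res = await fetch("/api/v0/generate/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sku, voice: voiceId, languages: ["th", "en"] }),
  });
  if (!res.ok || !res.body) throw new Error(`stream failed: ${res.status}`);
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each SSE event arrives as "data: <chunk>\n\n"; a production reader
    // would also buffer partial lines across network chunks.
    for (const line of decoder.decode(value, { stream: true }).split("\n")) {
      if (line.startsWith("data: ")) onToken(line.slice("data: ".length));
    }
  }
}
```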
One SKU. Four voices. Four parallel streams. One button.
A secondary button on the output panel fans the same SKU out across all four configured voices in parallel. The browser fires four concurrent streaming requests; four columns fill in live. The editor sees — in a single view — what the same product reads like when written for grocery, electronics, lifestyle, and construction banners. The governance point lands at exactly this moment. Not "AI is fast." Not "AI knows your brand." The point is: voice drift is no longer a possible outcome, because the same model is holding four voices at the same time and not confusing them. No server-side queue. No worker container. Browsers fan out the requests, PHP-FPM serves them, Gemini returns them in parallel.
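The fan-out itself is a few lines of browser code. A sketch, reusing the streamGeneration helper above and assuming hypothetical voice ids:

```typescript
// Sketch of the client-side fan-out: four concurrent streams, no server
// queue. Voice ids are hypothetical; streamGeneration is the helper above.
const VOICES = ["grocery", "electronics", "lifestyle", "construction"];

async function fanOutAcrossVoices(
  sku: object,
  render: (voiceId: string, token: string) => void,
): Promise<void> {
  // Promise.all fires all four requests concurrently; each UI column
  // fills in live as its own stream returns tokens.
  await Promise.all(
    VOICES.map((voiceId) =>
      streamGeneration(sku, voiceId, (token) => render(voiceId, token)),
    ),
  );
}
```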
Every generation lands in the QA queue. A human approves, edits, or rejects. Nothing else publishes.
The MVP ships the queue as a side-by-side reviewer interface: supplier input on the left, AI output on the right, approve / edit / reject buttons along the bottom. Per-field actions are possible — approve the Thai short description but edit the English title, for example. Every action is logged with reviewer ID, timestamp, field, old value, new value. Edits feed the feedback loop. The system records what the reviewer changed and surfaces recurring edit patterns weekly, so the voice spec can be refined in one place rather than via ad-hoc prompt-tweaking. This is why the voice converges toward your brand rather than drifting away from it.
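A sketch of one logged review action, with field names assumed for illustration (the shipped log schema may differ):

```typescript
// Illustrative shape of one logged review action, matching the fields
// named above. Field names are assumptions, not the shipped schema.
interface ReviewAction {
  skuId: string;
  reviewerId: string;
  timestamp: string;          // ISO 8601
  field: string;              // e.g. "titleEn", "shortDescTh"
  action: "approve" | "edit" | "reject";
  oldValue: string | null;
  newValue: string | null;    // null for approve/reject actions
}
```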
Copy JSON. Download the file. Or hand off to publishing once your PIM connector is live.
Once a SKU is approved, the output is available as a structured JSON blob — titles, descriptions, keywords, attributes, voice used, model used, processing time. The pilot exports via copy-to-clipboard, download, and CSV for bulk batches. Publishing directly to a PIM (SAP Hybris, Akeneo, inRiver, custom) is in the post-pilot roadmap and is the single most common integration request; it is scoped but not in the MVP. The open API (/api/v0/generate, no auth required during the pilot) is the bridge. Any dev on the retailer side can curl it the same afternoon they see the demo.
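The same-afternoon test translates to a single call. A sketch against the pilot endpoint, with payload and response fields assumed:

```typescript
// Sketch of the synchronous pilot API. No auth header during the pilot;
// OAuth2 and scoped keys arrive in production. Payload and response
// field names are assumptions for illustration.
async function generateListing(sku: object, voiceId: string): Promise<unknown> {
  const res = await fetch("/api/v0/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sku, voice: voiceId }),
  });
  if (!res.ok) throw new Error(`generate failed: ${res.status}`);
  return res.json(); // titles, descriptions, keywords, voice used, timings
}
```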
Each brand voice is a structured specification — tone rules, register, sensory vocabulary, forbidden phrasings, reference sentences — stored once and injected into every generation via context caching. Editors select a voice; they cannot rewrite it. Marketing owns voice; editors ship listings. The governance boundary is the product.
Thai and English are generated side-by-side from the same structured product object, in their own SEO patterns. Thai titles follow Thai search behaviour (tokenisation-aware, colloquial variants); English titles follow English conventions. Neither is a translation of the other. Bilingual output takes the same time as monolingual.
A single button fans one SKU out across all configured voices in parallel. Four columns stream simultaneously — grocery, electronics, lifestyle, construction — so the tone differences are visible on a single screen. This is the moment that collapses the buyer's "how do you actually lock a voice?" question.
Paste text, upload a CSV row, or drop a PDF spec sheet — all feed the same intake pipeline. Gemini 2.5 Pro reads PDFs natively (no separate OCR library, no Tesseract), returns structured JSON matching the standard schema, flags low-confidence fields for human check. One intake endpoint, three input formats.
Every generation lands in a review queue before it publishes. Reviewers approve, edit, or reject per field — approve the Thai title, edit the English long description, reject the keywords. Edits are logged with diffs, feeding the feedback loop. Nothing ships without a human approving it.
Default: Gemini 2.5 Pro on Vertex AI (asia-southeast1 — Singapore). Swap to NIPA Cloud AI for Thai data residency when required. Same UI, different backend. The swap is a configuration toggle; no operator retraining. Residency architecture is identical to DocPlus.
Every generation, every edit, every approval is logged with SKU ID, voice used, model used, operator ID, timestamp, before/after values, and processing time. Sessions export as CSV for audit review. The shadow-ChatGPT workflow has none of this; CatalogAI's audit structure is designed for standard enterprise compliance reporting.
The same endpoints that power the UI are available as a flat REST API — /intake, /generate, /generate/stream, /voices. No auth header during the pilot, CORS open, synchronous JSON in and out (streaming endpoint via Server-Sent Events).
A multi-brand retailer running 110,000 SKUs across four banners spends somewhere between $500,000 and $1.5M a year on manual listing content — freelancers, in-house editors, translation agencies, revision rounds. CatalogAI compresses the production side of that cost by 70–90% while adding governance (voice locks, audit trail) that the manual workflow never had.
| Metric | Before CatalogAI | After CatalogAI (10-week pilot) |
|---|---|---|
| Time per listing (first-draft copy, TH + EN) | 8–12 minutes | under 1 minute |
| Cost per listing (fully loaded) | $0.50 – $5.00 | under $0.05 |
| Brand-voice consistency (reviewer rating, 1–5) | 2.5–3.5 | 4.5+ |
| Review time per SKU (reviewer pass) | 5–8 minutes | 2–3 minutes |
| Throughput per content editor (SKUs / day) | 30–40 | 150–200 |
| Audit trail completeness on AI-generated copy | 0% | 100% |
| Shadow-ChatGPT usage (self-reported, post-rollout) | present | eliminated |
Figures derived from industry-published benchmarks on manual product content costs (Semantico, 2024; Hypotenuse AI, 2024) and internal scoping against real multi-brand retailer catalogs in Thailand. Bilingual-output accuracy numbers are calibrated against our founder's prior enterprise delivery on Thai and Myanmar-language AI content systems. Individual pilot results vary by catalog mix, existing workflow, and reviewer team size.
In operational terms, a mid-sized retail catalog team recovers the equivalent of three to five full-time copy-editor salaries per year — not by firing them, but by redirecting them from typing-from-supplier-PDFs to QA-and-voice-guardianship, which is the job their title already describes. In compliance terms, you close the shadow-AI audit gap before the first regulator or auditor asks about it. In commercial terms, you go from hundreds of new SKUs per week to thousands, which is the velocity marketplace sellers have had since 2024.
| AI providers | Google Gemini 2.5 Pro on Vertex AI (default, asia-southeast1); NIPA Cloud AI (Thai data residency) |
|---|---|
| Supported input formats | Plain text paste, CSV row, PDF spec sheet (multimodal — read natively by Gemini, no separate OCR) |
| Output per SKU | TH + EN title, short description (50–80 words), long description (150–300 words), 10 TH + 10 EN keywords, normalised attributes, voice used, processing time |
| Languages | Thai (primary), English. Vietnamese scoped for post-pilot. Myanmar on roadmap. |
| Generation latency | Under 15 seconds per full TH + EN bundle, single voice. Streaming: first token in ~1s. |
| Four-voice parallel view | Client-side fan-out (browser fires 4 concurrent streaming requests). No server queue. |
| Brand voice model | Structured spec (tone, register, vocabulary, forbidden phrasings, reference sentences) injected via Vertex AI context caching |
| Concurrency | 10+ simultaneous users at pilot scale; horizontally scalable via PHP-FPM workers and Vertex AI quota |
| Session model | Session-scoped during pilot demos; persistent per-org catalogs in production deployment |
| Deployment | Web (hosted) or on-premises via Docker Compose (web container + PostgreSQL container — no Redis, no worker, no queue) |
| Stack | Laravel 13 + Livewire 4 (+ Volt) + Preline UI 4 + Tailwind CSS 3 + Alpine.js + PostgreSQL 18 |
| Infrastructure | Any modern VPS; minimum 2 vCPU / 4GB RAM for single-instance pilot |
| API | Open REST during pilot (/api/v0/intake, /generate, /generate/stream, /voices). OAuth2 + scoped keys added in production. |
| Integrations (roadmap) | SAP Hybris, Akeneo, inRiver, custom PIM via REST; marketplace open platforms for direct publishing |
CatalogAI is built to clear a Thai-retail procurement review on two fronts simultaneously — IT security and brand governance — because those are the two teams that block this purchase.
NIPA Cloud AI mode keeps all generation processing and persisted data within Thai infrastructure, satisfying Thai PDPA and the residency preferences of large retail groups with financial-services subsidiaries or regulated merchandise categories. Vertex AI mode runs in asia-southeast1 (Singapore) — low-latency for Thai users, but outside Thai borders.
The voice spec is the artefact the brand team signs off on. Once signed, the editor cannot drift it at generation time. Voice updates go through the marketing team; every future generation reflects the update. This is the audit-ready version of what shadow ChatGPT cannot give you.
Every intake, generation, edit, approval, and rejection is logged with SKU ID, voice used, model and version, operator ID, timestamp, before/after values, and processing time. Sessions and catalogs export as CSV on request. Retention configurable from 0 days to 7 years.
When residency mode is set to NIPA, pre-launch SKUs (exclusive merchandise, unreleased pricing, category-exclusive deals) do not leave Thai borders at any point in the pipeline. The Vertex AI path is available for non-sensitive catalogs where latency trumps residency.
TLS 1.3 in transit; AES-256 at rest for persisted catalogs and edit history; encrypted file-level storage on NIPA Cloud AI infrastructure when that mode is selected.
Audit-trail structure is designed for standard enterprise compliance reporting. Formal certification is on the roadmap for production deployments; pilot deployments can be configured to operate within existing client ISO / SOX envelopes.
CatalogAI deploys as a pilot with a written KPI and an honest ending. Either the measured outcome clears the KPI — throughput, voice-consistency rating, cost-per-listing — and we scale, or it doesn't and we part ways with the pilot report in your hands. No open-ended contracts.
Operational walkthrough with the catalog team, category managers, and brand-voice owners. Shadow one SKU through the current workflow end-to-end. Author the voice specs for one to two brands with the marketing team — this is the deliverable that matters most, because it is what the pilot is graded against. Fixed pilot price confirmed.
CatalogAI deployed to one banner — one brand, one category team, real supplier data. The first 500 SKUs run through the full pipeline with side-by-side QA review. The voice specs are calibrated weekly based on reviewer edits. By week 6, the system is running 2,000 SKUs through the pipeline per week at steady state. Weekly reviews are working-software demos, not slide decks.
Measurement against the written KPIs: throughput per editor, cost per listing, voice-consistency rating by marketing, reviewer approval rate. Honest gap analysis. Recommendation: scale to remaining banners, continue pilot with adjustments, or walk away. All three endings are valid.
If the pilot cleared the KPI: roll out to additional banners, author voice specs for each, deploy PIM connector for direct publishing, add photo-pipeline scope if required, enable Vietnamese if expansion is in play. If it didn't: we hand over the pilot findings and part ways.
Fixed-price pilot. Walk-away clause. Roadmap influence for design partners. No surprises.
Ten weeks, fixed price. Deploy to one banner, one or two voice specs, up to 2,000 SKUs.
Hosted multi-tenant. Priced by monthly SKU volume. For teams past pilot.
Annual licence, unlimited SKUs, on-premises deployment available.
In Vertex AI mode: structured product data and generated outputs are sent to Google's Vertex AI endpoint in asia-southeast1 (Singapore). Google's Vertex AI terms apply — data is not used for model training under the enterprise API. Suitable for non-sensitive public-catalog content.
In NIPA Cloud AI mode: all generation happens on NIPA Cloud AI infrastructure within Thailand. Catalog data never leaves Thai borders. Suitable for pre-launch merchandise, exclusive-supplier contracts, regulated pricing.
In either mode: audit logs and edit history are stored in the CatalogAI database (hosted or on-premises per your deployment choice).
The demo takes ninety seconds. No account, no commitment, no "our sales team will be in touch." If the brand-voice lock works on a supplier SKU you bring, you'll see it in the first thirty seconds. If it doesn't, the tool tells you so honestly, through reviewer edit diffs and low-confidence flags.
CatalogAI didn't start from scratch. The bilingual output quality, the Thai-first generation pattern, and the voice-lock engineering are built on our founder's prior enterprise delivery across Southeast Asia (2017–2026) — including AI content systems, Myanmar and Thai-language chatbot pipelines, and supplier-onboarding workflows processed at scale through production enterprise platforms. Three million identity and content records have been processed through systems our founder has previously shipped.
That history is why CatalogAI treats Thai as the primary language rather than a localisation target, why the voice-lock model is structured rather than prompt-templated, and why the accuracy figures on this page reflect real-world pilot expectations rather than marketing benchmarks.
Read the full track record →