Loading video player...
1/ Scaling Agentic AI—From Prototypes to Swarm Assistants with Idan Zalzberg, CTO of Agoda, moderated by Koravich Sangkaew, AI Engineer of SCBX. How a global OTA is moving from internal tools to multi-agent systems that plan, price, and book end-to-end. 2/ Idan has spent 12 years at Agoda, now as CTO. Agoda sees itself as a tech company that happens to do travel. AI long predates GenAI there, but the team jumped on GenAI to accelerate both internal productivity and product innovation. 3/ Today Agoda runs 200+ in-house GenAI use cases. Their roadmap: Empower (internal productivity across dev, legal, ops), Enhance (better experiences without changing core flows, e.g., support copilots), and Evolve (rethink the OTA itself). 4/ Evolution in practice: single prompts → retrieval for context → tool/function calling → agent loops with feedback → specialized multi-agent “swarms” that hand off tasks. One agent can’t hold all tools and context; specialization wins. 5/ North star: an AI concierge that covers the trip end-to-end—from inspiration to returning home. Products along this line already exist, but the journey is ongoing and long-term. 6/ Biggest blocker Idan sees: evaluation. Knowing an agent truly does its job remains the hardest problem. There’s progress, but it isn’t solved. 7/ On buying AI: start from the problem, not the tech. Latency budgets depend on the workflow. Autocomplete that waits 10s is useless; a background agent can take an hour. ROI drives cost tolerance; 99.9% quality can be worth 10× the price vs 90%. 8/ PoC expectations: high uptime (think 99.99). Consistency matters more than a perfect demo. If it works only 40–60% of the time, the operational overhead kills value. Prioritize dependable quality. 9/ Ship fast without analysis-paralysis: pick one OKR and guardrails as SLAs. Example: ≥90% positive feedback, ≤2% hallucinations, and ≤10-second response time. Use offline evals to simulate, then validate with real users in production. 10/ Build vs buy: Agoda almost never self-hosts LLMs. APIs change fast and are highly competitive; using them keeps you on the frontier. Self-hosting adds cost, complexity, and drag unless it’s a narrow, non-LLM niche. 11/ Latency vs quality: start with high quality to prove the ceiling, then optimize. Tactics include faster provider tiers, smaller models, and tighter context. A strong baseline lets you quantify the exact trade-offs. 12/ “Managers who code”: staying hands-on keeps leaders close to evolving tech, builds trust with engineers, and models good practices (tests, docs). The goal isn’t lines of code—it’s a successful team, and hands-on adds tools to get there. 13/ Prevent prompt/config drift across many services: give each agent a narrow, testable scope (e.g., property Q&A vs flight search). Invest in observability to replay multi-agent conversations and trace hand-offs end-to-end. 14/ A pragmatic 3-metric dashboard: adoption and repeat use; micro/macro feedback on usefulness and time saved; and task completion/impact (was code accepted, was the summary used, did the agent finish the job). 15/ On “data as a booby trap”: be science-driven, not blindly data-driven. Small or inconclusive datasets shouldn’t be tortured into answers. Use the scientific method, be honest about uncertainty, and apply judgment when the data can’t decide.