Abstract
Delivering AI-powered features in mobile apps is not just about calling an LLM API. It's about crafting fast, reliable, and engaging user experiences. In this talk, I'll share practical lessons from designing and scaling LLM-driven experiences on mobile, where thoughtful frontend architecture and UX design made the difference between a demo and a product.
We'll explore how to architect for speed and interactivity, when to use on-device LLMs versus backend inference, and how to design interfaces that gracefully handle the latency, ambiguity, and errors inherent to generative AI. Along the way, we'll compare patterns used in apps like Perplexity and ChatGPT, and examine the trade-offs that shape responsive, cost-effective, and scalable frontend systems.
Whether you're experimenting with LLMs in your product or already running them in production, you'll leave with concrete patterns to make your AI-powered experiences feel fast, native, and user-first.