Abstract
This talk presents Pinterest's journey in deploying AI at massive scale, from using Reinforcement Learning to create images to building multimodal AI and Agent systems that automatically organize billions of content items. We'll explore how we evolved from fixing algorithmic bias in diffusion models to creating millions of shopping collections that drive discovery for hundreds of millions of users globally.
The presentation covers two breakthrough systems: our RL-enhanced diffusion models that reduced gender and racial bias while improving human preference alignment by 80.3%, and PinLanding, our content-first architecture that achieved 4X coverage improvement over traditional search approaches.
The RL-enhanced diffusion model work was presented at the 18th European Conference on Computer Vision (ECCV 2024) in Milan, Italy. Both innovations resulted in US patent applications.
Main Takeaways:
- Reinforcement Learning transforms Stable Diffusion performance
- Content-first architecture scales better than behavior-based approaches
- Production-ready multimodal AI architecture patterns with Agents
Interview:
What is the focus of your work these days?
The majority of my day is spent on:
- Building and deploying large-scale AI systems that leverage reinforcement learning and multimodal architectures for content understanding and generation
- Developing production-ready implementations of cutting-edge research (Stable Diffusion, CLIP, Vision-Language Models) that scale to billions of content items
- Leading engineering teams that bridge research innovations with practical deployment challenges in high-traffic content platforms
- Researching novel applications of RL for improving generative models and AI agents for automated content organization across diverse industry verticals
And what was the motivation behind your talk?
The motivation stems from the massive opportunity that reinforcement learning and multimodal AI represent for any industry managing large content collections. While most companies are still experimenting with basic LLM applications, we've moved beyond that to solve fundamental challenges in content generation and organization at unprecedented scale.
Our experience demonstrates that:
- RL can dramatically improve any generative model: Our Stable Diffusion improvements aren't Pinterest-specific - they're applicable to any company using image generation for marketing, product design, or content creation
- Multimodal AI is ready for production: Our content-first architecture patterns work for any large catalog - e-commerce product databases, media libraries, document repositories, or digital asset management systems
- Scale reveals new opportunities: Moving from millions to billions of items reveals architectural insights that smaller-scale experiments miss
I want to share this because the techniques we've developed solve universal problems:
- E-commerce platforms struggling to organize massive product catalogs
- Media companies with overwhelming content libraries
- Marketing teams needing better image generation capabilities
- Any platform where users, including those arriving via search traffic, need to discover relevant content from vast collections
Who is your session for?
- ML Engineers implementing reinforcement learning for generative models or building multimodal AI pipelines in production
- Data Scientists working with large-scale content catalogs, image generation, or content organization challenges
- Engineering Leaders at e-commerce, media, or content platforms evaluating AI technology choices for catalog management and content discovery
- Research Engineers bridging cutting-edge research (Stable Diffusion, CLIP, VLMs) with production deployment
- Product Managers in retail, media, or content-heavy platforms seeking to understand the business impact of advanced AI architectures
- Software Architects designing systems to handle massive content collections and user-facing AI features
Speaker

Faye Zhang
Staff Software Engineer @Pinterest, Tech Lead on GenAI Search Traffic Projects, Speaker, Expert in AI/ML with a Strong Background in Large Distributed Systems
Faye is a Staff Software Engineer at Pinterest, where she leads AI-driven search traffic initiatives and launched the company's first successful GenAI production experiment, driving significant user engagement growth. With a Computer Science degree from Georgia Tech and ongoing AI graduate studies at Stanford, she combines deep technical expertise in distributed systems with cutting-edge AI research. Her work spans both industry and academia, including contributions to university genomic science research. She regularly shares insights on AI innovation at technical conferences in San Francisco and Paris, focusing on scalable AI solutions that transform user experiences.