From Content to Agents: Scaling LLM Post-Training Through Real-World Applications and Simulation

Abstract

This talk presents a comprehensive journey through modern AI post-training techniques, from Pinterest's production-scale content discovery systems to enterprise agent training with Veris AI's simulation environments. We'll explore how reinforcement learning and supervised fine-tuning bridge the critical gap between base model capabilities and real-world performance across two distinct but complementary domains.

We begin by exploring industry advances in RL-enhanced diffusion models and their impact on bias reduction and human preference alignment. We then dive into Pinterest's implementation of these techniques at scale for content generation via PinLanding, a multimodal content-first architecture that turns billions of content items into shopping collections.
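
To make the content-first idea concrete, below is a minimal, hypothetical sketch (not the actual PinLanding pipeline): embed catalog items and candidate collection themes in a shared multimodal space with an off-the-shelf CLIP model, then group each item under its best-matching theme. The model name, themes, and catalog items are illustrative assumptions.

```python
# Illustrative sketch of content-first collection building with a shared
# multimodal embedding space (CLIP). Not Pinterest's implementation; the model,
# themes, and items below are assumptions for demonstration only.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

collection_themes = ["minimalist home office", "summer wedding guest outfits"]
catalog_items = ["oak standing desk", "linen midi dress", "ergonomic mesh chair"]

def embed_texts(texts):
    """Encode texts with CLIP's text tower and unit-normalize for cosine similarity."""
    inputs = processor(text=texts, return_tensors="pt", padding=True)
    with torch.no_grad():
        emb = model.get_text_features(**inputs)
    return emb / emb.norm(dim=-1, keepdim=True)

theme_emb = embed_texts(collection_themes)
item_emb = embed_texts(catalog_items)

# Assign each catalog item to its closest collection theme.
scores = item_emb @ theme_emb.T          # cosine similarities, shape (items, themes)
best = scores.argmax(dim=-1).tolist()
for item, idx in zip(catalog_items, best):
    print(f"{item} -> {collection_themes[idx]}")
```

At billions-of-items scale the same matching would typically run as approximate nearest-neighbor search over precomputed image and text embeddings rather than a brute-force matrix product.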

From there, we turn to the broader challenge of agent training, presenting a general-purpose simulation sandbox approach that generates high-fidelity training data for task-based agents. This system bridges theory and practice, showing how to transform LLM knowledge into agent experience through controlled environments that mirror real enterprise workflows.
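
As a concrete illustration of the sandbox idea, here is a small, hypothetical sketch of the data-generation loop: an agent policy acts against a mocked enterprise tool, its tool-call trajectory is recorded, and only successful episodes are kept as supervised fine-tuning examples. The mock ticketing environment, `toy_policy` (a stand-in for an LLM agent), and the success check are assumptions, not the system described in the talk.

```python
# Hypothetical sketch: collect agent trajectories from a simulated enterprise tool
# and keep successful episodes as SFT data. All names here are illustrative.
import json
import random

class MockTicketSystem:
    """Toy stand-in for an enterprise API the agent must learn to operate."""
    def __init__(self):
        self.tickets = {1: {"status": "open", "assignee": None}}

    def call(self, tool, args):
        if tool == "assign_ticket":
            self.tickets[args["id"]]["assignee"] = args["user"]
            return {"ok": True}
        if tool == "close_ticket":
            self.tickets[args["id"]]["status"] = "closed"
            return {"ok": True}
        return {"ok": False, "error": "unknown tool"}

def toy_policy(task, history):
    """Placeholder for an LLM agent choosing the next tool call."""
    options = [("assign_ticket", {"id": 1, "user": "oncall"}),
               ("close_ticket", {"id": 1})]
    return random.choice(options)

def run_episode(task, max_steps=4):
    env, history = MockTicketSystem(), []
    for _ in range(max_steps):
        tool, args = toy_policy(task, history)
        result = env.call(tool, args)
        history.append({"tool": tool, "args": args, "result": result})
        if env.tickets[1]["status"] == "closed":  # task-specific success check
            return history, True
    return history, False

# Keep only successful trajectories as supervised fine-tuning examples.
sft_examples = []
for _ in range(20):
    trajectory, success = run_episode("resolve ticket #1")
    if success:
        sft_examples.append({"task": "resolve ticket #1", "trajectory": trajectory})
print(json.dumps(sft_examples[:1], indent=2))
```

The same loop extends naturally to RL: rather than filtering to successful episodes, each trajectory can be scored with a task reward and used for policy optimization.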

Both systems demonstrate how post-training techniques (RL, SFT, and curriculum learning) solve the "demo-to-production" gap that plagues AI deployments, whether for content generation or autonomous task execution.
 

Interview:

What is the focus of your work these days?

The majority of my time is spent on:

  • Building and deploying large-scale AI systems that leverage reinforcement learning and multimodal architectures for content understanding and generation
  • Developing production-ready implementations of cutting-edge research (Stable Diffusion, CLIP, Vision-Language Models) that scale to billions of content items
  • Leading engineering teams that bridge research innovations with practical deployment challenges in high-traffic content platforms
  • Researching novel applications of RL for improving generative models and AI agents for automated content organization across diverse industry verticals
     

And what was the motivation behind your talk?

The motivation stems from the massive opportunity that reinforcement learning and multimodal AI represent for any industry managing large content collections. While most companies are still experimenting with basic LLM applications, we've moved beyond that to solve fundamental challenges in content generation and organization at unprecedented scale.

Our experience demonstrates that:

  • RL can dramatically improve any generative model: Our Stable Diffusion improvements aren't Pinterest-specific - they're applicable to any company using image generation for marketing, product design, or content creation
  • Multimodal AI is ready for production: Our content-first architecture patterns work for any large catalog - e-commerce product databases, media libraries, document repositories, or digital asset management systems
  • Scale reveals new opportunities: Moving from millions to billions of items reveals architectural insights that smaller-scale experiments miss

I want to share this because the techniques we've developed solve universal problems:

  • E-commerce platforms struggling to organize massive product catalogs
  • Media companies with overwhelming content libraries
  • Marketing teams needing better image generation capabilities
  • Any platform that relies on search traffic to help users discover relevant content from vast collections

Who is your session for?

ML Engineers implementing reinforcement learning for generative models or building multimodal AI pipelines in production

Data Scientists working with large-scale content catalogs, image generation, or content organization challenges

Engineering Leaders at e-commerce, media, or content platforms evaluating AI technology choices for catalog management and content discovery

Research Engineers bridging cutting-edge research (Stable Diffusion, CLIP, VLMs) with production deployment

Product Managers in retail, media, or content-heavy platforms seeking to understand the business impact of advanced AI architectures

Software Architects designing systems to handle massive content collections and user-facing AI features


Speaker

Faye Zhang

Staff Software Engineer @Pinterest, Tech Lead on GenAI Search Traffic Projects, Speaker, Expert in AI/ML with a Strong Background in Large Distributed Systems

Faye Zhang is a staff AI engineer and tech lead at Pinterest, where she leads multimodal AI work for search traffic discovery, driving significant user growth globally. She combines expertise in large-scale distributed systems with ongoing NLP and AI agent research at Stanford. She also volunteers at the intersection of AI and genomic science, working on mRNA sequence analysis, with work published in multiple scientific journals. A recognized thought leader, Faye regularly shares insights at conferences in San Francisco and Paris.


From the same track

Session

Dynamic Moments: Weaving LLMs into Deep Personalization at DoorDash

Tuesday Nov 18 / 10:35AM PST

In this talk, we’ll walk through how DoorDash is redefining personalization by tightly integrating cutting-edge large language models (LLMs) with deep learning architectures such as Two-Tower Embeddings (TTE) and Multi-Task Multi-Label (MTML) models.

Sudeep Das

Head of Machine Learning and Artificial Intelligence, New Business Verticals @DoorDash, Previously Machine Learning Lead @Netflix, 15+ Years in Machine Learning

Pradeep Muthukrishnan

Head of Growth for New Business Verticals @DoorDash, Previously Founder & CEO @TrustedFor, 15+ Years in Machine Learning

Session

Automating the Web With MCP: Infra That Doesn’t Break

Tuesday Nov 18 / 02:45PM PST

AI agents are only as strong as the infrastructure beneath them. In this talk, we’ll walk through the architecture behind Browserbase’s model context protocol (MCP), built to support stateful browser automation at scale.

Paul Klein

Founder @Browserbase, previously Director of Self-Service & Engineering Manager @Mux, Co-Founder & CTO @Stream Club, Technical Lead @Twilio Inc.

Session

Engineering at AI Speed: Lessons from the First Agentically Accelerated Software Project

Tuesday Nov 18 / 01:35PM PST

Claude Code is the first developer tool built specifically to maximize AI development velocity.

Adam Wolff

Engineer and Individual Contributor to Claude Code @Anthropic, Previously @Robinhood, @Facebook

Session

Deep Research for Enterprise: Unlocking Actionable Intelligence from Complex Enterprise Data with Agentic AI

Tuesday Nov 18 / 11:45AM PST

Deep Research as a consumer product redefined the AI space, delivering real impact by searching through hundreds of websites, reasoning deeply over the content, and generating a comprehensive report.

Vinaya Polamreddi

Staff ML Engineer; Agentic AI @Glean; Previously @Apple, @Meta, and @Stanford

Session

Improving Meta Generative Ad Text using Reinforcement Learning

Tuesday Nov 18 / 05:05PM PST

Reinforcement Learning with Performance Feedback (RLPF) unlocks a new way of turning generic GenAI models into customized models fine-tuned for specific tasks. This approach is especially powerful when combined with in-house data and performance metrics.

Alex Nikulkov

Research Scientist (RL lead for Monetization GenAI) @Meta