Improving Meta Generative Ad Text using Reinforcement Learning

Abstract

Reinforcement Learning with Performance Feedback (RLPF) unlocks a new way of turning generic GenAI models into customized models fine-tuned for specific tasks. This approach is especially powerful when combined with in-house data and performance metrics. In this talk we highlight the application of RLPF to the ad text generation system on the Facebook platform.

The presentation covers the core technical components required for production RLHF systems: preference data collection methodologies, reward model training approaches, and policy optimization techniques that maintain stability in production environments. We'll explore the theoretical foundations underlying these systems and how they translate into practical engineering solutions.

Speaker

Alex Nikulkov

Research Scientist (RL lead for Monetization GenAI) @Meta

From the same track

Session

Dynamic Moments: Weaving LLMs into Deep Personalization at DoorDash

Tuesday Nov 18 / 03:55PM PST

In this talk, we’ll walk through how DoorDash is redefining personalization by tightly integrating cutting-edge large language models (LLMs) with deep learning architectures such as Two-Tower Embeddings (TTE) and Multi-Task Multi-Label (MTML) models.

Sudeep Das

Head of Machine Learning and Artificial Intelligence, New Business Verticals @DoorDash, Previously Machine Learning Lead @Netflix, 15+ Years in Machine Learning

Pradeep Muthukrishnan

Head of Growth for New Business Verticals @DoorDash, Previously Founder & CEO @TrustedFor, 15+ Years in Machine Learning

Session AI Agents

From Content to Agents: Scaling LLM Post-Training Through Real-World Applications and Simulation

Tuesday Nov 18 / 02:45PM PST

This talk presents a comprehensive journey through modern AI post-training techniques, from Pinterest's production-scale content discovery systems to enterprise agent training through Veris AI’s simulation.

Faye Zhang

Staff Software Engineer @Pinterest, Tech Lead on GenAI Search Traffic Projects, Speaker, Expert in AI/ML with a Strong Background in Large Distributed Systems

Andi Partovi

Co-Founder @Veris AI, Making AI Agents World-Ready

Session AI/ML

Automating the Web With MCP: Infra That Doesn’t Break

Tuesday Nov 18 / 05:05PM PST

AI agents are only as strong as the infrastructure beneath them. In this talk, we’ll walk through the architecture behind Browserbase’s Model Context Protocol (MCP) server, built to support stateful browser automation at scale.

Paul Klein

Founder @Browserbase, previously Director of Self-Service & Engineering Manager @Mux, Co-Founder & CTO @Stream Club, Technical Lead @Twilio Inc.

Session AI Agents

Engineering at AI Speed: Lessons from the First Agentically Accelerated Software Project

Tuesday Nov 18 / 10:35AM PST

Claude Code is the first developer tool built specifically to maximize AI development velocity.

Adam Wolff

Engineer and Individual Contributor to Claude Code @Anthropic, Previously @Robinhood, @Facebook

Session

Engineering AI for Creativity and Curiosity on Mobile

Tuesday Nov 18 / 11:45AM PST

This talk shares practical lessons from building production-grade AI for creativity and curiosity on mobile devices.

Bhavuk Jain

Tech Lead @Google
