A Framework for Building Micro Metrics for LLM System Evaluation

LLM accuracy is a challenging topic and is far more multidimensional than a single accuracy score. In this talk we’ll dive into how to measure LLM-related metrics, working through examples, case studies, and techniques that go beyond a single accuracy score. We’ll discuss how to create, track, and revise micro LLM metrics that give granular direction for improving LLMs.


Speaker

Denys Linkov

Head of ML @Voiceflow, LinkedIn Learning Instructor

Denys leads Enterprise AI at Voiceflow and is an ML startup advisor and LinkedIn Learning course instructor. He's worked with 50+ enterprises on their conversational AI journey, and his Gen AI courses have helped 150,000+ learners build key skills. He's worked across the AI product stack: building key ML systems hands-on, managing product delivery teams, and working directly with customers on best practices.

From the same track

Session

LLM Powered Search Recommendations and Growth Strategy

In this session on using Large Language Models (LLMs) to enhance search recommendation systems, we will take a technical deep dive into the key aspects of developing, fine-tuning, and deploying these models.

Faye Zhang

Senior Software Engineer

Session

Navigating LLM Deployment: Tips, Tricks, and Techniques

Self-hosted language models are going to power the next generation of applications in critical industries like financial services, healthcare, and defense.

Meryem Arik

Co-Founder @TitanML

Session

GenAI for Productivity

At Wealthsimple, we use Generative AI internally to improve operational efficiency and streamline monotonous tasks. Our GenAI stack is a blend of tools we developed in-house and third-party solutions.

Mandy Gu

Senior Software Development Manager @Wealthsimple

Session

10 Reasons Your Multi-Agent Workflows Fail and What You Can Do About It

Multi-agent systems – a setup where multiple agents (generative AI models with access to tools) collaborate to solve complex tasks – are an emerging paradigm for building applications.

Victor Dibia

Principal Research Software Engineer @Microsoft Research