Navigating LLM Deployment: Tips, Tricks, and Techniques

Self-hosted language models are going to power the next generation of applications in critical industries like financial services, healthcare, and defense. Self-hosting LLMs, as opposed to using API-based models, comes with its own host of challenges: as well as solving business problems, engineers need to wrestle with the intricacies of model inference, deployment, and infrastructure. In this talk we will discuss best practices in model optimisation, serving, and monitoring, with practical tips and real case studies.


Speaker

Meryem Arik

Co-Founder @TitanML

Meryem is a recovering physicist and the co-founder of TitanML, an NLP development platform that focuses on the deployability of LLMs, allowing businesses to build smaller and cheaper deployments of language models with ease. The TitanML platform automates much of the difficult MLOps and inference optimisation science so that businesses can build and deploy state-of-the-art language models. She has been recognised as a technology leader in the Forbes 30 Under 30 list.

From the same track

Session

LLM Powered Search Recommendations and Growth Strategy

In this exploration of employing Large Language Models (LLMs) to enhance search recommendation systems, we will take a technical deep dive into developing, fine-tuning, and deploying these models.

Faye Zhang

Senior Software Engineer

Session

GenAI for Productivity

At Wealthsimple, we leverage Generative AI internally to improve operational efficiency and streamline monotonous tasks. Our GenAI stack is a blend of tools we developed in-house and third-party solutions.

Mandy Gu

Senior Software Development Manager @Wealthsimple

Session

10 Reasons Your Multi-Agent Workflows Fail and What You Can Do About It

Multi-agent systems – a setup where multiple agents (generative AI models with access to tools) collaborate to solve complex tasks – are an emerging paradigm for building applications.

Victor Dibia

Principal Research Software Engineer @Microsoft Research

Session

A Framework for Building Micro Metrics for LLM System Evaluation

LLM accuracy is a challenging topic to address and is much more multidimensional than a simple accuracy score. In this talk we’ll dive deeper into how to measure LLM-related metrics, going through examples, case studies, and techniques that go beyond a single accuracy score.

Denys Linkov

Head of ML @Voiceflow, LinkedIn Learning Instructor