Advanced machine learning (ML) models, particularly large language models (LLMs), require scaling beyond a single machine. As open-source LLMs become more prevalent on platforms and model hubs like HuggingFace (HF), ML practitioners and GenAI developers are increasingly inclined to fine-tune these models with their private data to suit their specific needs.
However, several concerns arise: which compute infrastructure should be used for distributed fine-tuning and training? How can ML workloads be effectively scaled for data ingestion, training/tuning, or inference? How can large models be accommodated within a cluster? And how can CPUs and GPUs be optimally utilized?
Fortunately, an opinionated stack is emerging among ML practitioners, leveraging open-source libraries.
This session focuses on the integration of HuggingFace and Ray AI Runtime (AIR), enabling scaling of model training and data loading. We’ll delve into implementation details, explore the
Transformer APIs, and demonstrate how Ray AIR facilitates an end-to-end ML workflow, encompassing data ingestion, training/tuning, or inference.
By exploring the integration between HF and Ray AIR, we’ll discuss how Ray’s orchestration capabilities fulfill computation and memory requirements. Also, we’ll showcase how existing HF Transformer APIs, DeepSpeed, and Accelerate code can seamlessly integrate with Ray AIR’s Trainers and demonstrate its capabilities within this emerging component stack. Finally, we’ll demonstrate how to fine-tune an open-source LLM model with HF Transformer APIs and Ray AIR Trainers.
Speaker
Jules Damji
Lead Developer Advocate @Anyscale, MLflow Contributor, and Co-Author of "Learning Spark"
Jules S. Damji is a lead developer advocate at Anyscale Inc, an MLflow contributor, and co-author of Learning Spark, 2nd Edition. He is a hands-on developer with over 25 years of experience and has worked at leading companies, such as Sun Microsystems, Netscape, @Home, Opsware/LoudCloud, VeriSign, ProQuest, Hortonworks, and Databricks, building large-scale distributed systems. He holds a B.Sc and M.Sc in computer science (from Oregon State University and Cal State, Chico respectively), and an MA in political advocacy and communication (from Johns Hopkins University).