Maximizing Deep Learning Performance: Hardware and Software Innovations for Optimizing AI Workloads

As deep learning continues to drive advancements across industries, efficiently navigating the landscape of specialized AI hardware has a major impact on the cost and speed of operations. In addition, unlocking the full potential of this hardware through the right software stack can be a daunting task.

This talk explores these advancements, focusing on enhanced AI capabilities in processors, specialized cores in GPUs, and optimized architectures in accelerators. Additionally, we will discuss software advancements that unlock the full potential of this hardware, such as optimized instruction sets, high-speed interconnects, and scalable infrastructures. By examining how these technologies and software enhancements cater to tasks like pretraining, fine-tuning, and inference, attendees will gain insights into selecting the most suitable hardware and software combinations for their AI workloads.
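To give a flavor of the kind of software-level enablement the abstract alludes to, the sketch below is an illustrative example (not material from the talk) of how a framework switch can route work onto AI-specialized CPU instructions. It assumes PyTorch on a bfloat16-capable Xeon processor with AMX and uses torch.autocast on the CPU backend.

import torch

# Illustrative only: a framework-level toggle that lets the oneDNN backend
# dispatch matrix multiplications to bfloat16/AMX kernels when the CPU supports them.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
).eval()

x = torch.randn(32, 1024)

with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.dtype)  # torch.bfloat16 when the low-precision path is taken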


Speaker

Bibek Bhattarai

AI Technical Lead @Intel, Computer Scientist Invested in Hardware-Software Optimization, Building Scalable Data Analytics, Mining, and Learning Systems

Bibek is an AI Technical Lead at Intel, where he collaborates with customers to optimize the performance of their AI workloads across various deployment platforms, including cloud, on-premises, and hybrid environments. These workloads involve pretraining, fine-tuning, and deploying state-of-the-art deep learning models on cutting-edge AI-specialized hardware in the form of CPUs, GPUs, and AI accelerators.

Bibek holds a Doctorate in Computer Science and Engineering from George Washington University, where his research focused on large-scale graph computing, mining, and learning technologies. He is keenly interested in HW/SW optimization of various workloads, including graph computing, deep learning, and parallel computing.


From the same track

Session

Evaluating and Deploying State-of-the-Art Hardware to Meet the Challenges of Modern Workloads

Wednesday Nov 20 / 01:35PM PST

At GEICO we are on a journey to entirely modernize our infrastructure. We are building an open-source, cloud-agnostic hybrid stack that runs across public and on-premises private cloud infrastructure without exposing vendor-specific stacks to our application developers.


Rebecca Weekly

VP of Infrastructure @GEICO

Session

High-Resolution Platform Observability

Wednesday Nov 20 / 02:45PM PST

Many observability tools fail to provide us with the relevant insights for understanding hardware health and utilization.


Brian Martin

Co-founder and Software Engineer @IOP Systems, Focused on High-Performance Software and Systems, Previously @Twitter

Session

Optimizing Custom Workloads with RISC-V

Wednesday Nov 20 / 11:45AM PST

This talk will explore how RISC-V architecture can accelerate custom workloads, focusing on AI/ML applications. We’ll start by examining the RISC-V ecosystem and its increasing relevance in the software development landscape.


Ludovic Henry

Member of Technical Staff @Rivos, Performance-Minded Engineer, Hardware & Software, Previously @Xamarin, @Microsoft, @Datadog

Session

Unleashing Llama's Potential: CPU-Based Fine-Tuning

Wednesday Nov 20 / 03:55PM PST

Details coming soon.


Anil Rajput

AMD Fellow, Software System Design Eng.; Java Committee Chair @SPEC; Architected Industry-Standard Benchmarks and Authored Best Practices Guides for Platform Engineering and Cloud


Dr. Rema Hariharan

Principal Engineer @AMD, Seasoned Performance Engineer With a Base in Quantitative Sciences and a Penchant for Root-Causing