Cost-Conscious Cloud: Designing Systems that Don't Break the Bank

Abstract

Details coming soon.

From the same track

Session

How Netflix Shapes our Fleet for Efficiency and Reliability

Netflix runs on a complex multi-layer cloud architecture made up of thousands of services, caches, and databases. As hardware options, workload patterns, cost dynamics, and Netflix products evolve, the cost-optimal hardware and configuration for running our services are constantly changing.

Joseph Lynch

Principal Software Engineer @Netflix Building Highly-Reliable and High-Leverage Infrastructure Across Stateless and Stateful Services

Argha C

Staff Software Engineer @Netflix Building Highly Available, High Throughput Systems

Session

Realtime and Batch Processing of GPU Workloads

SS&C Technologies runs 47 trillion dollars of assets on our global private cloud. We have the primitives for infrastructure as well as platforms as a service like Kubernetes, Kafka, NiFi, Databases, etc.

Joseph Stein

Principal Architect of Research & Development @SS&C Technologies, Previous Apache Kafka Committer and PMC Member

Session

From ms to µs: OSS Valkey Architecture Patterns for Modern AI

As AI applications demand faster and more intelligent data access, traditional caching strategies are hitting performance and reliability limits.

Dumanshu Goyal

Software Engineer @Airbnb - Leading Online Data Priorities, Previously @Google and @AWS

Session

One Platform to Serve Them All: Autoscaling Multi-Model LLM Serving

AI teams are moving away from hosted LLMs toward self-hosted inference as fine-tuning drives model performance. The catch is scale: hundreds of variants create long-tail traffic, cold starts, and duplicated stacks.

Meryem Arik

Co-Founder and CEO @Doubleword (Previously TitanML), Recognized as a Technology Leader in Forbes 30 Under 30, Recovering Physicist

