Caching Beyond RAM: The Case for NVMe | QCon San Francisco 2018


What You’ll Learn

Explore the possibility of using new storage devices to reduce DRAM dependency for cache workloads.
Understand the state of the art available today for distributed cache systems.
Hear about use cases that optimize for different cache workloads.

Abstract

Caching architectures at every layer of the stack embody an implicit tradeoff between performance and cost. These tradeoffs, however, are constantly shifting: new inflection points can emerge alongside advances in storage technology, changes in workload patterns, or fluctuations in hardware supply and demand.

In this talk, we will explore the design ramifications of the increasing cost of RAM on caching systems. While RAM has always been expensive, DRAM prices rose by over 50% in 2017, and high densities of RAM involve multi-socket NUMA machines, bloating power and overall costs. Concurrently, alternative storage technologies such as Flash and Optane continue to improve. They have specialized hardware interfaces, consistent performance, high density, and relatively low costs. While there is increasing economic incentive to explore offloading caching from RAM onto NVMe or NVM devices, the implications for performance are still not widely understood.

Question:

What is the focus of your work today?

Answer:

I evaluate hardware and software improvements for distributed cache systems.

Question:

What’s the motivation for this talk?

Answer:

I am exploring the possibility of using new storage devices to reduce DRAM dependency for cache workloads.

Caching systems have historically been limited to just RAM. They're written to a lot, they're read from a lot, and so latency matters an awful lot (given what people are using them on).

Recently the cost of RAM has ballooned. Databases have also gotten a lot faster, making use of LSM trees, B+ trees, and novel Flash devices.

I was thinking hard about how cache systems can leapfrog or stay ahead of database systems again. That's why I sat down and did a thorough analysis of NVMe-based systems. Intel was gracious enough to help along with testing their Optane Persistent Memory as well.

Distributed caches are at some point individual pieces of hardware. We need to come back and re-evaluate what’s possible on the cache nodes. The focus of this talk is to discuss some of what I found.

Question:

How do you plan to discuss this with the audience?

Answer:

The use cases are probably the most interesting way of talking about it. In my blog post on NVMe caches, I went through a couple of examples with diagrams. I'll do more of that with the actual slides, with more concrete examples of what can be used this way. There is still a limit to which workloads can run on NVMe devices. It's mostly figuring out how people can quickly identify, 'I can actually benefit from this new system, how can I quickly put it to use?' or, 'there's a thing we never imagined we could do before but now it's possible.' Netflix has been taking advantage and rolling out petabytes of this stuff to launch new machine learning platforms.

Question:

What have you found through your testing that is a bad use case?

Answer:

Any really small objects. In an average case, people may have 50 percent of cache memory used by objects that are fairly large and 50 percent that are fairly small. But in some use cases it's all very, very small: a couple hundred bytes or less. In those circumstances, device-backed storage doesn't really help them that much.
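One way to see why very small objects benefit little is a back-of-the-envelope calculation. The sketch below is not from the talk. It assumes a hypothetical cache design in which each item's key and a fixed-size index entry stay in RAM while only the value moves to the device, and the 40-byte key and 60-byte index entry are made-up illustrative sizes. Under those assumptions, the share of RAM freed per item shrinks quickly as values get smaller.

```python
def ram_saved_fraction(value_bytes: int,
                       key_bytes: int = 40,             # hypothetical key size
                       index_overhead_bytes: int = 60   # hypothetical per-item index entry
                       ) -> float:
    """Fraction of one item's RAM footprint freed when only its value
    is moved to a storage device (key and index entry remain in RAM)."""
    all_in_ram = key_bytes + index_overhead_bytes + value_bytes
    still_in_ram = key_bytes + index_overhead_bytes
    return 1 - still_in_ram / all_in_ram


if __name__ == "__main__":
    # Compare small and large values under the assumed overheads.
    for size in (100, 200, 1024, 8192):
        print(f"{size:>5} B value: {ram_saved_fraction(size):.0%} of RAM freed")
```

With these assumed overheads, a 100-byte value frees only about half of the item's RAM footprint, while a multi-kilobyte value frees nearly all of it. Real systems will have different overheads, so the numbers are only indicative of the trend.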

Question:

Who is the target persona that you envision?

Answer:

I'm looking for the tail latency folks. At this point, I'm looking for project leads and application designers. Every time somebody comes up with a new idea, a new project, or is reevaluating the cost of an old project, they have to look at what technologies are available. They have to ask, 'what can I physically do, what am I limited to via cost?' I want those folks to be aware that these things are new options available to them to use while designing new applications or reevaluating old ones.

The talk is for performance-minded folks, or anyone who needs frequent access to a lot of data, for example ML facts.

Question:

The cost efficiencies we're talking about, are they only realized at incredibly large scale or are you seeing them realized at smaller scale?

Answer:

I put out a blog post showing how somebody running a $20 virtual machine can save themselves $40 a month by avoiding renting more RAM. I'm scaling all the way down. Also, there was another company recently that rolled it out with around 10 hosts and they 10x'd the amount of cache they had for the same amount of money and halved their back-end load. And this is a fairly popular website, just not Facebook or Netflix scale.

I’d really like engineers at different scales to walk away knowing it’s possible to exploit cache systems for entirely new problems, and that there are new opportunities to reduce the cost of cache (or increase cache and reduce backend cost).

Question:

What is the technology problem that keeps you up at night?

Answer:

The industry slowdown keeps me up at night in two ways. RAM isn't getting much denser much faster. I think last year or the year before was the first time ever that performance per watt cost on CPUs didn't actually improve. That is terrifying because everybody's financial projections are expecting this to happen every 18 to 24 months.

On the other hand, it's great because I'm a performance person, so at some point, the cost overruns will hit the budget, and I'll have better job security. Kidding, mostly!

There are people just disregarding cache systems lately. They're trying to fix their issues by applying the same data structures and storage engines they use for databases, and I'm thinking, 'you have your database, why are you trying to architect it the same way as your cache?' You still want your cache to beat your database by 10x.

These are two of the things that keep me up at night.

Speaker: Alan Kasindorf

OSS Memcached Project Maintainer, previously Memcache / Mcrouter @Facebook & Dir of Edge Engineering @Fastly

Website scalability, distributed caching systems, and performance addict. Enjoys contributing to and learning from OSS.

Find Alan Kasindorf at

Speaker page

@dormando

Track: Modern Operating Systems

Location: Bayview AB

Duration: 11:50am - 12:40pm

Day of week: Monday

Level: Advanced

Persona: Backend Developer, Developer, DevOps Engineer
