Presentation: Quantifying Risk

Track: Ethics, Regulation, Risk, and Compliance

Location: Pacific LMNO

Time: 2:55pm - 3:45pm

Day of week: Monday

This presentation is now available to view on InfoQ.com

What You’ll Learn

  1. Hear why it is important and helpful to have a quantitative evaluation of risk.
  2. Hear how Netflix is using the FAIR methodology to quantitatively evaluate risk.
  3. Learn some tips and tricks from Netflix's experience.

Abstract

The FAIR methodology is an emerging standard for measuring information risks. But it can be intimidating to get started with a risk quantification program, as people may be reluctant to go beyond Low/Medium/High categories to real numbers. At Netflix, we have introduced risk quantification in our highest impact areas, and are gradually expanding it across the enterprise. I'll share my experience and approach to defining appropriate loss scenarios, and getting real numbers from colleagues.

Question: 

What is the work you're doing today?

Answer: 

I work as a Detection Engineering lead at Netflix and we have a fairly unique philosophy around how to do detection and response: we call it a SOCless organization. We don't have a large number of people sitting there watching alerts. Instead what happens is each application owner is responsible for their own security, and we help them. Our response team is mostly an escalation point when there's an incident. And my role in detection is two things: one is to address the highest risks in the organization and to build comprehensive detection around those. The second is to build capabilities that can scale out so that all the individual teams can enable their own monitoring.

For the former goal, it was very important for me to understand what the highest risks are and what loss scenarios we need to address with detection. That's where this risk quantification effort came in: an effort to understand and break down our risks in such a way that I would know the actual scenarios and could then address them with various detection efforts. As I study the detection, I often uncover additional controls that would make sense, so part of the work is engaging with those teams to get them to adopt new controls while we also build detection for the things that can't be completely locked down. That's my team's role.

Question: 

What are your goals for the talk?

Answer: 

My goal is to communicate the experiences that I had applying these risk quantification approaches. There's a methodology, and an organization behind it, called FAIR. It's an emerging standard in the sense that it's being pushed as a standard and a lot of people are finding it useful, but it has not gained wide adoption yet. I think people are still afraid of quantifying risk in this way. A lot of information security teams tend to just use low/medium/high or some one-to-five scale; that's categorical rather than quantitative. But there are a lot of good reasons why you want to get more quantitative and more numerical. I want to explain why you should do this, and address some of the fears about why it may not work, and the difficulties of applying these approaches. Having done it myself, I can alleviate some of those fears. I'll explain why the quantification is important and walk through the tips and tricks that I learned while applying these approaches.

Question: 

Can you summarize FAIR?

Answer: 

Sure. The basic approach is that you want to come up with two numbers. One is the frequency with which some loss could occur, that is, how many times per year you would expect this loss. Hopefully it's less than once per year, in which case it becomes a fraction: if it's every 10 years, then the frequency is 0.1.

The second number is actually more of a distribution of impacts, and the quantification there is in actual money terms, in our case dollars. You set a low-to-high interval, and then typically it's modeled with a lognormal distribution.

By combining these two you can calculate an expected loss per year, and that lets you prioritize all of your losses and say which ones are the most important. Something that's not frequent but very high loss can be compared to something that's more frequent but low loss, and you can rank them.
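As a rough illustration of the calculation described here, the sketch below runs a small Monte Carlo simulation in Python. The frequency and the low/high dollar interval are made-up numbers, not Netflix figures, and treating the interval as a 90% confidence interval on a lognormal (with Poisson-distributed event counts) is a common convention in FAIR-style analyses rather than anything specified in this talk:

```python
import numpy as np

rng = np.random.default_rng(42)

# Loss event frequency: expected events per year (0.1 = once every 10 years).
frequency = 0.1

# Loss magnitude: a low/high interval in dollars, treated here as a 90%
# confidence interval on a lognormal distribution (illustrative numbers).
low, high = 100_000, 5_000_000
mu = (np.log(low) + np.log(high)) / 2               # log-space midpoint
sigma = (np.log(high) - np.log(low)) / (2 * 1.645)  # 90% CI spans +/-1.645 sigma

# Monte Carlo: simulate many years. Each year, draw how many loss events
# occur, then draw a dollar amount for each event and total them.
years = 100_000
event_counts = rng.poisson(frequency, size=years)
annual_losses = np.array([rng.lognormal(mu, sigma, size=n).sum()
                          for n in event_counts])

print(f"Expected annual loss: ${annual_losses.mean():,.0f}")
```

Running this for each loss scenario gives the expected annual losses you can rank against each other.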

A second thing you can do is calculate a "loss exceedance curve", which is a picture of the chance that the loss will exceed any given amount. This is what insurance companies use, for example, to insure a building against hurricane damage. They'll say: based on historical data, here's the range of wind conditions that we might expect and the amount of damage each would do to the building, and from that we create a probability picture, such as expecting the loss to exceed a million dollars with 10 percent probability. At that point you have quantities you can use to buy insurance against that loss. You can say, OK, my losses definitely won't exceed X, because if they do the insurance company will pay, but I have to pay them based on the level of risk.
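Continuing the sketch above (this reuses the annual_losses array from the previous example), a loss exceedance curve falls out of the same simulation: for each dollar threshold, take the fraction of simulated years whose total loss exceeds it. Reading off the curve answers exactly the insurance-style question, such as the probability that losses exceed a million dollars in a year:

```python
# Probability that a year's total loss exceeds each threshold, i.e. the
# complementary CDF of the simulated annual losses from the sketch above.
thresholds = np.logspace(4, 8, 9)   # $10K up to $100M
exceedance = [(annual_losses > t).mean() for t in thresholds]

for t, p in zip(thresholds, exceedance):
    print(f"P(annual loss > ${t:>13,.0f}) = {p:6.2%}")
```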

This is where the industry wants to get to with IT losses: they would like to be able to quantify them so they could insure against large losses. But it's going to have to be a bottom-up approach, and the community has to adopt practices that are already common in other industries. The FAIR methodology is actually a bit simpler than what other industries use, but it's a good way to introduce this approach of quantifying risks and putting numbers to them.

I think it also has great benefit for information security teams in explaining to their management the value they're providing. I give the example of a department that comes to the CEO and says, hey, if you give me a million dollars, I have an investment I can make that will return five million dollars. Then the information security team comes and says, if you give me that million dollars, I can turn this red risk into a yellow risk. Which one is the CEO going to put the money toward? But if you can say, I'll reduce this 10 million dollar risk to a 2 million dollar risk, that's an 8 million dollar reduction for the same million, a higher rate of return than the other department's. That argument is much easier to make, and until we get to that point it's very hard to make those business cases.

Question: 

Where are some of the places you've applied this at Netflix?

Answer: 

I won't be able to get into too much detail about the areas that we applied it in or what the actual risks are. Essentially we've been applying it to our highest risks: what are the most sensitive data stores that we have, and what are the threats against those data stores.

I'll be presenting the process in a generalized way that people will be able to follow and apply within their own environment. Those risks can be different depending on what your business is and what kind of sensitive data you tend to store. 

One thing that everyone stores, for example, is private information about their employees, and that's certainly an asset you want to protect. But how does that balance against the types of user data that you're collecting? It comes down to how sensitive the user data is, and how much of it there is. Getting some quantities around these things is very important; then you can allocate your time appropriately to maximally reduce your risks.

Question: 

What do you want people to leave the talk with?

Answer: 

I want them to leave the talk with less fear around putting numbers to things, and with a sense that they could go back and apply these approaches, that they have some practical steps they can follow to actually execute a quantitative risk analysis. The idea of quantifying risk has been around for a while, but adoption is difficult because people aren't used to doing it. I want to show that it's not a terribly complicated thing and that they have most of the skills already.

Speaker: Markus De Shon

Sr. Security Engineer, Detection Engineering Lead @Netflix

Markus has worked in security since 2000 at SecureWorks, CERT, Google and Netflix, mostly on problems in Detection Engineering. He has a passion for developing a comprehensive framework to guide the engineering of detection and response systems, an effort that he has written about and continues to work on today.
