You are viewing content from a past/completed QCon

Presentation: Fault Tolerance at Speed

Track: Bare Knuckle Performance

Location: Pacific DEKJ

Duration: 1:40pm - 2:30pm

Day of week: Monday

This presentation is now available to view on InfoQ.com

Abstract

Distributed systems that provide fault tolerance often sacrifice performance. That sacrifice tends to appear late in a project, when a systems engineering approach has not been taken. Performance is an inherent aspect of distributed design and should be considered holistically in the systems engineering process. A well-designed distributed system can be both fault tolerant and fast.

In this session, we will discuss the techniques used in implementing Aeron Cluster and the lessons learned along the way. The focus will be on how Raft can be implemented on Aeron, how to minimize network round-trip overhead, and how a single process compares with a fully distributed cluster. Come to this session if you are interested in how performance can be a first-class design concern and in the results that approach can deliver.

Speaker: Todd Montgomery

Ex-NASA Researcher and High Performance Distributed Systems Whisperer

Todd Montgomery is a networking hacker who has researched, designed, and built numerous protocols, messaging-oriented middleware systems, and real-time data systems. He has done research for NASA, contributed to the IETF and IEEE, and co-founded two startups. He currently works as an independent consultant and is active in several open-source projects.
