Presentation: Fault Tolerance at Speed

Track: Bare Knuckle Performance

Location: Pacific DEKJ

Duration: 1:40pm - 2:30pm

Day of week: Monday

Distributed systems that provide fault tolerance often sacrifice performance. That sacrifice tends to surface late in a project, when a systems engineering approach has not been taken from the start. Performance is an inherent aspect of distributed design and should be considered holistically in the systems engineering process. A well-designed distributed system can be both fault tolerant and fast.

In this session, we will discuss the techniques and lessons learned from implementing Aeron Cluster. The focus will be on how Raft can be implemented on Aeron, how to minimize network round-trip overhead, and how a single process compares to a fully distributed cluster. Come to this session if you are interested in how performance can be a first-class design concern and in the results that approach can deliver.

Speaker: Todd Montgomery

Ex-NASA Researcher and High Performance Distributed Systems Whisperer

Todd Montgomery is a networking hacker who has researched, designed, and built numerous protocols, messaging-oriented middleware systems, and real-time data systems, done research for NASA, contributed to the IETF and IEEE, and co-founded two startups. He currently works as an independent consultant and is active in several open source projects.

Monday, 11 November
