Track: Bare Knuckle Performance

Location: Pacific DEKJ

Killing latency and getting the most out of your hardware: find out how to improve the end-to-end developer experience, including design, development, testing, and deployment. If you want to learn about state-of-the-art performance approaches and solutions, this is the track for you. You can learn how to achieve low latency, how to make the best use of your underlying hardware, and how to solve problems on a global scale.

Track Host: Nitsan Wakart

Senior Software Engineer

The Perf Smurf™. An experienced performance engineer with decades of programming experience ranging from finance to commercial JVM implementations, Nitsan started writing software as a child and is unable to stop. A blogger, public speaker, open source contributor, instructor, JUG organizer and Java Champion, Nitsan is the lead developer on the JCTools project, the concurrency library of choice for Netty, DSE and many others.  When not plotting world domination, Nitsan enjoys piña coladas and getting caught in the rain.

10:35am - 11:25am

High Resolution Performance Telemetry at Scale

One of the most critical aspects of running large distributed systems is understanding and quantifying performance. Without telemetry, it is challenging to diagnose performance issues, plan for capacity needs, and tune for maximum efficiency. Even when we have telemetry, the resolution is often insufficient to capture the anomalies and bursty behaviors that are typical in microservice architectures.

In this talk, we explore the issues of resolution in performance monitoring, cover sources of performance telemetry including hardware performance counters and eBPF, and learn some tricks for getting high-resolution telemetry without high costs.
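As a hedged illustration of the resolution problem (this is not code from the talk), the sketch below keeps per-second latency snapshots in plain Java: each window records a coarsely bucketed histogram and the maximum, so a one-second burst stays visible where a per-minute average would smooth it away. The bucket bounds are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch only (not code from the talk): per-second snapshots
// keep bursts visible that a per-minute average would smooth away.
public class PerSecondLatency {
    // Hypothetical bucket bounds: <1ms, <10ms, <100ms, >=100ms
    static final long[] BOUNDS = {1_000_000L, 10_000_000L, 100_000_000L};
    final long[] counts = new long[BOUNDS.length + 1];
    long maxNanos;

    void record(long nanos) {
        int i = 0;
        while (i < BOUNDS.length && nanos >= BOUNDS[i]) i++;
        counts[i]++;
        maxNanos = Math.max(maxNanos, nanos);
    }

    void flush(long second) {
        System.out.printf("t=%ds buckets=%s max=%.2fms%n",
                second, Arrays.toString(counts), maxNanos / 1e6);
        Arrays.fill(counts, 0L);
        maxNanos = 0;
    }

    public static void main(String[] args) throws InterruptedException {
        PerSecondLatency histogram = new PerSecondLatency();
        long start = System.nanoTime();
        long window = 0;
        while (window < 3) {                       // run for roughly 3 seconds
            long t0 = System.nanoTime();
            Thread.sleep(1);                       // stand-in for real work
            histogram.record(System.nanoTime() - t0);
            long elapsed = (System.nanoTime() - start) / 1_000_000_000L;
            if (elapsed > window) {
                histogram.flush(window);
                window = elapsed;
            }
        }
    }
}
```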

Brian Martin, Software Developer @Twitter

11:50am - 12:40pm

Does Java Need Inline Types? What Project Valhalla Can Bring to Java

Inline/value types are the key part of the experimental Project Valhalla, which should bring new abilities to the Java language. It's a story not only about performance but also about safety, abstraction, expressiveness, maintainability, and more. In this session, however, we will focus on performance: which performance benefits inline types bring to Java and how we can exploit them.
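For a taste of what that could look like, here is a hypothetical sketch using the `inline class` syntax from one of the Valhalla prototypes (the syntax is experimental and has changed between builds, so this will not compile on a standard JDK). Because an inline type has no identity, the JVM may flatten an array of them into contiguous memory instead of an array of references.

```java
// Hypothetical sketch of Valhalla prototype syntax (experimental, not valid
// on a standard JDK). Inline/value types have no identity, so the JVM may
// flatten Point[] into a dense array of (x, y) pairs rather than references.
inline class Point {
    final double x;
    final double y;
    Point(double x, double y) { this.x = x; this.y = y; }
}

class PathLength {
    static double total(Point[] path) {
        double sum = 0.0;
        // With flattening, this loop walks contiguous memory: no pointer
        // chasing, better cache locality, no per-element heap allocation.
        for (int i = 1; i < path.length; i++) {
            double dx = path[i].x - path[i - 1].x;
            double dy = path[i].y - path[i - 1].y;
            sum += Math.sqrt(dx * dx + dy * dy);
        }
        return sum;
    }
}
```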

Sergey Kuksenko, Java Performance Engineer @Oracle

1:40pm - 2:30pm

Fault Tolerance at Speed

Distributed systems providing fault tolerance often sacrifice performance. The sacrifice often happens late, when a systems engineering approach is not taken. Performance is an inherent aspect of distributed design and should be considered holistically in the systems engineering process. A well-designed distributed system can be both fault-tolerant and fast.

In this session, we discuss the techniques and lessons learned from implementing the Aeron Cluster. The focus will be on how Raft can be implemented on Aeron, minimizing network round-trip overhead, and comparing a single process to a fully distributed cluster. Come to this session if you are interested in how performance can be a first-class design concern and in the results that can be delivered.
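To ground the discussion, here is a minimal Aeron publish/subscribe sketch in Java (plain Aeron over IPC with an embedded media driver, not Aeron Cluster or Raft). Treat it as an illustration of the messaging substrate the cluster is built on; the channel and stream ID are arbitrary.

```java
import io.aeron.Aeron;
import io.aeron.Publication;
import io.aeron.Subscription;
import io.aeron.driver.MediaDriver;
import org.agrona.concurrent.UnsafeBuffer;
import java.nio.ByteBuffer;

// Minimal Aeron pub/sub over IPC (illustration only; Aeron Cluster layers
// Raft consensus, a replicated log, and snapshotting on top of this).
public class AeronIpcExample {
    public static void main(String[] args) {
        try (MediaDriver driver = MediaDriver.launchEmbedded();
             Aeron aeron = Aeron.connect(new Aeron.Context()
                     .aeronDirectoryName(driver.aeronDirectoryName()));
             Subscription sub = aeron.addSubscription("aeron:ipc", 1001);
             Publication pub = aeron.addPublication("aeron:ipc", 1001)) {

            UnsafeBuffer buffer = new UnsafeBuffer(ByteBuffer.allocateDirect(256));
            int length = buffer.putStringAscii(0, "hello cluster");

            // offer() is non-blocking; negative results signal back pressure
            // or a not-yet-connected subscriber, so we spin until it succeeds.
            while (pub.offer(buffer, 0, length) < 0) {
                Thread.yield();
            }

            // Poll until the single fragment arrives.
            while (sub.poll((buf, offset, len, header) ->
                    System.out.println(buf.getStringAscii(offset)), 10) == 0) {
                Thread.yield();
            }
        }
    }
}
```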

Todd Montgomery, Ex-NASA Researcher and High Performance Distributed Systems Whisperer

2:55pm - 3:45pm

Bare Knuckle Performance Open Space

Session details to follow.

4:10pm - 5:00pm

JIT vs AOT Performance With GraalVM

In this session we are going to talk about various aspects of performance, such as peak throughput, startup, memory footprint and more, and how you can optimize your applications for them with GraalVM. GraalVM is a high-performance virtual machine, bringing new performance optimizations for individual languages and seamless interoperability for polyglot applications.

In particular, we’ll discuss JIT and AOT compilation and talk about their advantages and trade-offs. We’ll also go through some practical examples: comparing JIT compilation on HotSpot and GraalVM, JIT versus AOT, improving AOT with profile-guided optimizations, and more.
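As a rough, hedged sketch of that AOT workflow (command and flag names as documented for GraalVM's native-image; verify against your distribution, since profile-guided optimization may require Oracle GraalVM), the snippet below is a small Java workload with the build steps in comments.

```java
// A trivial workload to compare JIT (java Fib) against an AOT native image.
//
// Sketch of the AOT steps (check your GraalVM version and distribution):
//   javac Fib.java
//   native-image Fib                         # plain AOT image
//   native-image --pgo-instrument Fib        # instrumented image
//   ./fib 35                                 # run to collect a .iprof profile
//   native-image --pgo=default.iprof Fib     # rebuild using the profile
public class Fib {
    static long fib(int n) {
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 35;
        long start = System.nanoTime();
        long result = fib(n);
        System.out.printf("fib(%d)=%d in %.1f ms%n",
                n, result, (System.nanoTime() - start) / 1e6);
    }
}
```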

Alina Yurenko, Developer Advocate for GraalVM @Oracle

5:25pm - 6:15pm

Parsing JSON Really Quickly: Lessons Learned

Our disks and networks can load gigabytes of data per second; we feel strongly that our software should follow suit. Thus we wrote what might be the fastest JSON parser in the world, simdjson. It can parse typical JSON files at speeds of over 2 GB/s on a single commodity Intel core, with full validation; it is several times faster than conventional parsers.

How did we go so fast? We started with the insight that we should make full use of the SIMD instructions available on commodity processors. These instructions are everywhere, from the ARM chip in your smartphone all the way to server processors. SIMD instructions work on wide registers (e.g., spanning 32 bytes): they are faster because they process more data using fewer instructions.

To our knowledge, nobody had ever attempted to produce a full parser for something as complex as JSON by relying primarily on SIMD instructions, and many people were skeptical that it could be done fruitfully. We had to develop interesting new strategies that are generally applicable. In the end, we learned several lessons. Maybe one of the most important is a nearly obsessive focus on performance metrics: we constantly measure the impact of the choices we make.
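As a hedged illustration of the core idea in Java rather than simdjson's C++ (it uses the JDK's incubating Vector API, so it needs a recent JDK and `--add-modules jdk.incubator.vector`), the sketch below scans a byte buffer for quote characters one vector at a time, producing a bitmask per block, which is the same shape of computation a SIMD JSON scanner uses to locate structural characters.

```java
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorSpecies;
import java.nio.charset.StandardCharsets;

// Illustration only (Java Vector API, not simdjson's C++ kernels): find '"'
// characters a whole vector at a time and report them as per-block bitmasks.
// Run with: --add-modules jdk.incubator.vector
public class QuoteScan {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_PREFERRED;

    public static void main(String[] args) {
        byte[] input = "{\"name\":\"simdjson\",\"fast\":true}"
                .repeat(4).getBytes(StandardCharsets.UTF_8);

        int i = 0;
        for (; i < SPECIES.loopBound(input.length); i += SPECIES.length()) {
            ByteVector block = ByteVector.fromArray(SPECIES, input, i);
            long mask = block.eq((byte) '"').toLong();   // one bit per lane
            System.out.printf("offset %3d: quote mask = %s%n",
                    i, Long.toBinaryString(mask));
        }
        // Scalar tail for the bytes that do not fill a whole vector.
        for (; i < input.length; i++) {
            if (input[i] == '"') System.out.println("quote at offset " + i);
        }
    }
}
```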

Daniel Lemire, Professor and Department Chair @TELUQ - Université du Québec
