Venice is an open-source derived data platform developed by LinkedIn. It is used mainly for ML feature storage, which requires the ability to refresh data at very high throughput, and to look it up with low latency.
The lion's share of the project is in Java, although it also leverages RocksDB and ZSTD via JNI. On the Java side, every bit of performance that can be squeezed out is fair game.
After briefly presenting Venice, this talk dives deep into some of the tricks we have employed in our relentless pursuit of lower read latency and of 1M operations per second per node.
Speaker
Alex Dubrouski
Technical Lead of Server Performance Team @LinkedIn
Alex joined LinkedIn in 2020 and has since been working on optimizing server-side performance, primarily for Java-based applications. Alex also contributes performance patches to OSS projects like OpenJDK, Log4j and Venice.
Before LinkedIn, Alex spent 4 years working on performance and infrastructure at Pandora Media.
Speaker
Gaojie Liu
Senior Staff Software Engineer @LinkedIn, Open Source Contributor @Venice, a Massively Scalable Derived Data Platform
Gaojie joined LinkedIn in 2016 and has since been working on Venice, a massively scalable derived data platform. His work spans various aspects of Venice, such as new feature development, performance tuning and architecture evolution.
Prior to LinkedIn, Gaojie worked at Yahoo for about 5 years, where he mainly developed the Yahoo Search Gateway platform.