Conference: Nov 5–7, 2018
Workshops: Nov 8–9, 2018
Presentation: You Can Build a World-Class Search Engine in .NET
What You’ll Learn
- Hear that .NET is fit for large server applications, not just desktop ones.
- Find out what performance knobs to turn when writing a performance-oriented .NET application.
- Learn what tools to use to monitor the performance of a .NET application and how the GC works.
Abstract
Microsoft's online services, especially Bing, are some of the most important proving grounds for running .NET in large-scale, highly available systems. The platform that underlies Bing also runs significant online functionality for Cortana, Office, Xbox, Windows and more.
When deciding how to build core infrastructure for the next version of Microsoft's query serving platform, we had to make a number of hard choices. First and foremost? Whether to use .NET or stick with C++.
This talk will discuss the ramifications of choosing .NET, why it was the right choice for us, and how much we had to learn about writing high-performance, high-availability software on this platform. We'll also hear about some of the myths we busted along the way, and why understanding them will help you apply these principles in your own software.
Interview
I still work a lot on performance and detailed .NET stuff, but my focus switched towards teaching others and getting other people on board with high-performance .NET and the techniques that we use to debug and to improve performance.
I'm also working on making a lot of the technologies that we use here in Bing available outside Bing and hopefully someday outside of Microsoft. We built something pretty amazing, one of the largest .NET applications in the world, and the techniques that we used to get there deserve to be seen outside of our little team. We work closely with the CLR team on a number of projects to improve latency in JIT and a bunch of other things, so I keep busy, too many things. It's always exciting. Pushing the envelope, finding the limits of the CLR and then pushing beyond them somehow.
It's the number of assemblies loaded. It's one good measure, or the amount of memory it takes, the amount of load it takes, the number of instances. There might be .NET applications with more instances, but when you look at the whole package, talking to the CLR team as well, we're definitely up there with complexity and trying to push the boundaries of what's possible.
I'll probably talk about it in my talk but we load thousands of assemblies in one process. You can make an argument "Why are you doing such crazy things?" But there's reasons for it.
I want people to come away with an expanded vision of what's possible with managed code. A lot of people, when they think managed code they're thinking of smaller apps, LoB apps, and certainly you're seeing it. .NET is creeping into the operating system in more mainstream ways. From Windows Store you can download tons of things that are all .NET. But when you're thinking very large enterprise server applications a lot of people haven't considered .NET. And there's a lot of good reasons they should. There's been a bias traditionally against .NET, and to some extent Java, of performance. If you want real good performance you go C, C++, and stay as close to the metal as possible. And I want to change that perception.
We need more success stories of what is possible and I want to give people that success story in this talk, and describe some of the challenges we faced and the things we learned about the CLR and .NET, and get into a little bit of the details of garbage collection and JIT. The big picture is to overcome this naivety that I think a lot of people have about what is possible with .NET, and expand their minds and prove that it is possible to get great performance out of managed systems.
A couple of years ago I gave a talk that was hyper-focused on .NET. When I was putting this one together, I decided I didn’t want to repeat that. I want to talk about some GC issues but I mostly want to tell a story of the big picture of what we're up against, how we solve that. Here are some of the lessons we learned along the way. It is going to be a technical talk, but it is going to be focused more towards those people who are making decisions or at least have influence on their teams about what kinds of technologies they can choose and why they should use .NET. Why it's a good thing to go with .NET and go in with your eyes open. Nothing's free on any platform. There's costs, but there's different costs.
I usually compare against things like C++ because that's where we came from. Microsoft was all C, all C++, and the back-ends of our online tools were all C++. A lot of that has changed, not all of it of course. The big driving force in development in general today is agility, and balancing that with performance is one of the key problems. We have realized enormous agility gains by moving to a managed language. There's less time spent debugging, there's less time with live site issues because you don't have stupid crashes. All of that stuff goes away to a large extent and you can focus more on the performance. You have to do performance engineering anyway with C++. But now, with managed code, there's a lot less work spent on just keeping the process running. People are more familiar with managed code just from a developer standpoint. This app is a platform for other developers, so making that easy for other people was a big goal of ours, and .NET just shines at that kind of thing. The APIs are easier, it's easier to plug into the .NET Framework than other arbitrary frameworks. I can't speak about Java. I know Java has a lot of the same advantages. But when I compare with legacy languages, there's a big, big advantage.
Yes. You're always trying to get good enough performance. A lot of the myth around .NET was if you want good performance you can't use .NET. "Yes, it gives you this agility and easiness. And it's an easy thing to get into, but if you want high quality engineered code you have to say 'No, it's not easy, .NET is not easy.'" You have to get over that mindset. Performance is hard. Engineering is hard. The language doesn't matter because in the end .NET is really just machine code. That's what's running on your machine. It's always machine code. It's about the overhead of the services that the platform provides to you, and how to use those services and their behavior to your advantage, to program to them instead of fighting against them. I think a lot of people come into it, saying "I’ve got to fight against the GC, I’ve got to prevent GC, I’ve got to fight against it. I’ve got to fight against bounds checking and all these other things it’s adding on top." No, you don't have to fight that, you have to work with that.
I want them to walk away with two big things. First is this idea that I was just talking about, that they can do large complex projects with .NET, and that they can do more than get away with it. They can thrive in it. They can be better than they otherwise would have been because there's a lot of ways to make it achieve their goals as far as performance goes, and agility and all the other things they have to balance. Second, if they do decide to go with .NET or they are working in .NET already and they want to improve their performance, here are some big gotchas that you're going to have to know about, areas you have to learn in detail. And you have to master these areas. You can't just walk into any platform and expect to get everything for free. You want to be engineers? Well, here are the engineering problems that are specific to .NET and how you can handle them and overcome them.
Yes, that's a good way to say it. Working with the CLR, or in particular the GC, it's like having a clock, it's like the gears in a clock. It's going to keep running the way it wants to run. And if you put a stick in those gears you can screw it up. You might break your stick if you don’t understand how those gears run. You might want to insert your own little gear but you want to keep the whole thing running smoothly.
I like your analogy to judo, where you're using the strength of this platform as your own strength. You are programming to the system not against the system. A lot of programming with .NET and getting the most out of it is understanding exactly how the CLR works. How GC works in detail. You really have to understand the implementation details. For many years those things weren't really highly publicized or even understood outside of Microsoft or even outside of the CLR team. I feel a large part of my job is to say "Look, here are the resources to understand how this clockwork mechanism works and you don't want to get in its way if you want to get the best performance." It's not universally true. You can come up with scenarios where it doesn't work perfectly in your situation. But we've seen a lot of success by taking that attitude: "How does the CLR work? How do we use that to our advantage?"
Yeah. And part of that is the domain of the application. If I'm writing graphics code or game code or these other things that are running in tight loops where I do need to care about register optimization or loop unrolling or very nitty-gritty details, then maybe I do need to go underneath the CLR. But I write for a search engine, and we have other things than just pure CPU performance. We have data back-ends, we have a lot of network, a lot of memory. We have to balance all these things, and often that extremely lower level performance is not the long pole in our application. For us it's more useful to concentrate on CLR and make sure we understand that so we can achieve our availability and performance.
Tracks
- Architectures You've Always Wondered About
  Architectural practices from the world's most well-known properties, featuring startups, massive scale, evolving architectures, and software tools used by nearly all of us.
- Going Serverless
  Learn about the state of Serverless & how to successfully leverage it! Lessons learned in the track hit on security, scalability, IoT, and offer warnings to watch out for.
- Microservices: Patterns and Practices
  Stories of success and failure building modern Microservices, including event sourcing, reactive, decomposition, & more.
- DevOps: You Build It, You Run It
  Pushing DevOps beyond adoption into cultural change. Hear about designing resilience, managing alerting, CI/CD lessons, & security. Features lessons from open source, LinkedIn, Netflix, Financial Times, & more.
- The Art of Chaos Engineering
  Failure is going to happen - Are you ready? Chaos engineering is an emerging discipline - What is the state of the art?
- The Whole Engineer
  Success as an engineer is more than writing code. Hear inward looking thoughts on inclusion, attitude, leadership, remote working, and not becoming the brilliant jerk.
- Evolving Java
  Java continues to evolve & change. Track covers Spring 5, async, Kotlin, serverless, the 6-month cadence plans, & AI/ML use cases.
- Security: Attacking and Defending
  Offensive and defensive security evolution that application developers should know about, including SGX Enclaves, effects of AI, software exploitation techniques, & crowd defense.
- The Practice & Frontiers of AI
  Learn about machine learning in practice and on the horizon. Learn about ML at Quora, Uber's Michelangelo, ML workflow with Netflix Meson and topics on Bots, Conversational interfaces, automation, and deployment practices in the space.
- 21st Century Languages
  Compile to Native, Microservices, Machine learning... tailor-made languages solving modern challenges, featuring use cases around Go, Rust, C#, and Elm.
- Modern CS in the Real World
  Applied trends in Computer Science that are likely to affect Software Engineers today. Topics include category theory, crypto, CRDTs, logic-based automated reasoning, and more.
- Stream Processing In The Modern Age
  Compelling applications of stream processing using Flink, Beam, Spark, Strymon & recent advances in the field, including Custom Windowing, Stateful Streaming, SQL over Streams.
- Performance Mythbusting
  Real world, applied performance proofs across stacks. Hear performance considerations for .NET, Python, & Java. Learn performance use cases with OpenJ9, Instagram, and Netflix.
- Tools and Culture: What's Beyond a Stack of Containers?
  Containers are not just a technology. They're a platform. Push your knowledge.
- Web as Platform
  All things Browser, from JavaScript Frameworks for animation and AR / VR to WebAssembly and from protocol work to open standards evolution.
- Beyond Being an Individual Contributor
  Beyond being an individual contributor. Building and evolving managers and tech leadership.
- Building Great Engineering Cultures
  Why engineering culture matters. Track features org scaling, memes as a culture tool, Ally skills, and panels on diversity / inclusion.
- Hardware Frontiers: Changes Affecting Software Developers Today
  Topics around: Quantum computing, NVM, SMR, GPU, custom hardware, self-driving cars, and mobile hardware.