******************************************************************************** The following talk transcript was recorded at QConSF 2017. The text and its copyright remain the sole property of the speaker. ******************************************************************************** >> Welcome to day two of QCon San Francisco. Please welcome back the chair of QCon San Francisco, Wes Reisz! (Applause). >> Woo! So, how was day one? That was weak, c'mon. How was day one? [Cheering]. >> All right, today is going to be another great day. Any ideas on what the best talks were from yesterday, what did you think? >> Matt Abrams'. >> Josh Evans. >> That was great. Yes, here are the actual numbers. If you look on the website, these are up there. Among the top-attended talks were Architecting a Modern Financial Institution, and the fourth was Making a Bigger Impact: Important Skills That Matter. The number one attended was Neil Williams on the evolution of Reddit's architecture. Those were the most attended. And now the top-rated talks: Matt Abrams' was in the top two, and the second was one of the AMAs; that experiment worked out well, 100 percent green. Pretty happy about that. There will be three more AMAs today around communities, so you can ask questions of specific communities; in this case, it is gRPC and also Spring, so different things about Spring. And feedback: the way you can see what feedback you gave, what you rated a session, is to go to the schedule. You can also find the slides and transcripts when they are uploaded; you should be able to get those there. If you are blogging or tweeting, these transcripts are great things to get exact quotes from. Our users use them a lot when blogging about things. The video, I will talk about in a second. For the feedback, if you go to the My Votes page, or My Account and then My Votes, you can see the sessions you went to, change your vote, and provide your feedback. And for the speakers, please give feedback. If you voted yellow, vote yellow or red if you feel that way, but give context so they know what to do better on the actual talk. They put a lot of time into it, and they want to know what you think. Give them feedback. And a recap: red, yellow, green. Green is positive, and yellow and red are variations of negative. Green means the talk met or exceeded your expectations; this was a good talk. Yellow means the talk was missing something, and that's a good thing to provide feedback for: you missed this, you should cover this, and that will make the talk more impactful. Red is a miss. And one thing these are not: don't say the room is cold, the AV has issues, or things like that. This is about the content and the speaker. If you have issues with the room, the temperature, find me or someone else on the staff so we can get that fixed. There is somebody in every room, they are on a WhatsApp channel, please communicate with them, but not the speakers. Yesterday's videos are not up yet, we are running just a little bit behind, they will be up today. You will get an email about it, and then you will also get a message on the My Account page that gives you an access code to get to all the videos. And I think there was one video that was not recorded yesterday that had been planned, but the rest of them were all recorded. And, upcoming QCons. We have QCons all around the world, and every single one of our QCons is individually organized by different people; we have different committees that plan and organize them.
If you go to qconferences.com, you can see what is coming up. And if you will give me a moment for a humble brag, we are excited about this new conference, our first new conference in six years. And it will be here in San Francisco in April: QCon AI. What QCon AI is: it is not a typical conference for data scientists or machine learning engineers, it is for software engineers that want to apply this in their jobs, who want to know what to do with all this data coming at them; it is focused on the software developers. I will talk about it more later, but I wanted to mention it now. We are really excited about this conference. If you come back to a conference, we have an alumni program: you get a t-shirt, gifts, and invites to the speaker dinner party that we have on Tuesday nights if you keep coming back. And social media: you have probably figured it out by now, #QConSF is our hashtag. If you blog, look for the transcripts, they are good sources for quotes; it is amazing how helpful they are. And also on Facebook, QCon is where we live. All right, next up, I will bring up Charles Humble, the chief editor of InfoQ, and he will talk about InfoQ.com. >> Thank you so much. Good morning. (Applause). >> Thank you. So, as Wes said, I'm Charles Humble, I'm InfoQ's editor-in-chief. Does anybody know what that building is? It is the Smithsonian Institution building, the headquarters of the Smithsonian. It was founded in 1846 for the increase and diffusion of knowledge, which is one of the world's best corporate slogans, I think. That is not why I have it on the slide. I have it on the slide because I was here last year at QCon, and I was speaking to an attendee, and I asked him how his QCon experience was; he said it was brilliant, which is what we like to hear. They said, you are like the Smithsonian Institution, but you are only here for a week. I said, that's fantastic, I should work it into presentations. And it does beg the question: if we are only here for a week, how do we diffuse and increase knowledge for the rest of the year, or accelerate human technological progress, which is also a good company slogan? The answer is InfoQ. We have the website; that's what it looked like in 2006. The design has improved since then. The major topics were Java, .NET, Ruby, SOA, and architecture. And some of these are still very much the mainstay of the site. We do not do as much Ruby or Agile, and we do a lot on diversity and so on. We do microservices, and you can make a case for saying that microservices are a rebranded SOA. And we have added a few things along the way: DevOps we started writing about in 2012, and it was hard to date it, because we were writing about it before the term was coined. Microservices, 2014. And then, since I became chief editor, we have had a push on machine learning and AI and IoT. And when we are thinking about what we want to write about, we use one of these quite often. This is a technology adoption life cycle graph; you may have seen it on the QCon website and on the posters around the venue. And what we do: I have a team of 50 people that write for me on a regular basis, and we get groups of them together. They are, like the QCon speakers, practitioners and engineers, and we ask them what they are seeing in their field and use this graph in the conversation. So, just to show you how this might work, I will take the topics I mentioned and put them on the graph so you can see. If I were to put Java and .NET on here, I would have them at the majority stage.
That is controversial, and it suggests that Java is declining. There is some evidence that is true; there are alternative languages, like Go, or Rust, or server-side Swift. So maybe a little bit. I'm not, for one second, implying that Java is going away, but there are other alternatives. And I will not read the whole slide, we would be here all day, but there's a lot of stuff happening within the Java space. I have Java 9, and whatever is after Java 9, which might be 10 or 18, or whatever it turns out to be. I have the stuff happening with OpenJ9, and so on. It is a big box, there's a lot going on. And we will zoom up one and go to the left, where I have DevOps. DevOps is reaching toward that mass adoption stage, and if I keep going further, I have microservices and containers, and then, as you would expect, some of the enablers for microservices and containers in the large are further along. If you think of Kubernetes, or container orchestration, or service meshes, that is a little bit behind, as you would expect. And on the left-hand side, I have machine learning and AI. What is interesting about this is that you might be sitting here and thinking that this man is an idiot and has no idea what he is talking about. I certainly get that suggestion on Twitter from time to time. (Laughter). The thing is, exactly where you are on this adoption curve depends on where you work, what company you are in, and what country you are in. If I were doing this in London, where I'm from, and for a hedge fund, machine learning would not be at the innovator stage; hedge funds have been doing this for 20 years, it would be at the majority by now. And DevOps, no, frankly, you don't generally see financial institutions continuously deploying stuff to production; too many things go badly when they do that. There are variations, but that is generally true. And this man wrote Pattern Recognition and Virtual Light; if you have not read his books, do. He said that the future is already here, it is just not evenly distributed. It depends where you are. And hopefully, if we are doing our job well, and you are coming to the QCon events and you are reading the website, you should always feel like you're one step ahead of where you need to be, that you know what is coming. So I'm really excited about the site, and you can get involved. You can write for us; I would like to encourage you to do that. We have some of these at the InfoQ booth, you can pick them up, and I'm on Twitter. And if you would like to speak with me, I would like to speak with you. At the risk of sounding like Jerry Springer, I will leave you with a final thought. This is the most exciting time I can remember in software. I have worked in it for more than 20 years, and this is an extraordinary era we are entering: we have limitless compute resources, we have extraordinary open libraries and amounts of data we can do extraordinary things with. It is an era not without its risks. And I think that the most important thing we can do as an industry to mature and grow up is to share our knowledge. And what I really want to encourage you to do is just to share what you have learned along the way, however you choose to do that. You can write for InfoQ, that would be lovely, or you can write for a website, or a blog, I don't really mind. Speak at QCon, sure, but speak at a conference, a meetup, or whatever it is. And in sharing your knowledge, you are helping us in our mission of fostering the software side of human technological progress.
And you are helping the industry progress and maybe avoid some of the mistakes and pitfalls along the way. So share your knowledge, thank you. (Applause). >> Thanks, Charles. And a few notes about the sponsors, fantastic sponsors. Some of the ones I did not mention yesterday: Aerospike, AppDynamics, Coursera; go and visit them. If you do, this is what they are giving away with the raffle this evening. After the technical sessions this evening, from 6:15 to 7:00, there will be a session and raffle to give out these prizes. One sponsor track, and just to reiterate, this is the only place where we allow sponsored pitches in the talks. These are from sponsors; it does not mean the content is not great, but it is coming from a vendor, it is not curated by us. So there's a distinction between them, they are labeled sponsored. And the rest of the talks are curated by our track hosts. One that I'm interested in here is the CEO of LaunchDarkly, and then also online, the Jonathan Hseu one, practical microservices, has been seeing a lot of interest. So now we will have the track hosts give you their vision for the tracks. We will start with Randy Shoup. >> (Applause). >> Thanks. Super excited to be doing the microservices patterns and practices track, a lot of practical advice for people that are using microservices. First, we will have Roopa Tangirala from Netflix; she has been working there for 10 years, has been a DBA in her career and is a Cassandra rock star, and she will talk about persistence, a super interesting and important topic in microservices. So don't miss that one. After that is Rafael Schloming, the co-founder of Datawire; he has a wonderful idea which he calls service-oriented development, using microservices as a lens to help rethink the entire development process. That's a great one as well. And I am going to give a talk on managing data in microservices, some ideas from Stitch Fix where I work now, and from eBay and Google where I used to work, on doing the things you would love to have from your monolithic database, like joins and transactions: how do you get those in microservices? That's what we're going to talk about. And I think I did this in the wrong order: Louis Ryan from Istio is going to talk about a service mesh that sits beside the microservice and does load balancing, security, and management without touching your code. It is a powerful methodology that Google has adopted internally, conceptually, and he will talk about how you can use that. And last, but not least, Chris Richardson, a friend of the conference; he is writing a book on, I'm not kidding, microservices patterns. He is going to talk about using sagas as a way of doing transactional-like things in microservices, which is an excellent talk. I hope you see that. See ya. (Applause). >> Hi, I'm Dave Copeland, and I'm hosting The Whole Engineer track. Breaking a monolith into a serverless architecture is probably really hard, but not as hard as being a real human person that is self-aware and able to interact with other people while doing your job as a programmer. That's what we are talking about on this track. If you are interested in learning about why brilliant jerks are bad teammates, listen to Justin Becker. And next, Anjuan Simmons will talk about the Agile Manifesto, and a framework he found in it for influencing customers and code.
Kevin Stewart will be giving a talk after that; we know that diversity is critical to building a functioning engineering team, and he is going to talk about another critical part, which is inclusion, and how we can commit to actively including the people that we bring onto our teams as they grow. And I will be giving a talk on how to work with people that you never see face to face, how to build and maintain trust as a remote engineer. And lastly, Randy Shoup will be giving a talk that is two and a half years in the making; it will have a lot of personal stories in it, and it will focus on the growth mindset, how to build confidence, and how to build trust. And, most importantly, it is going to have a lot of really good suggestions on how to achieve success in your personal and professional lives. Check that out. That's it, I hope to see you on The Whole Engineer track. (Applause). >> And I'm running the Java track. We know Java's 21- or 22-year history, and we will talk about how it is evolving. We are going to start with Roman Elizarov, who will talk about his journey to async, how to reduce its friction for developers, and the path they took with it. From that, we will go to Ramki Ramakrishna; he is from Twitter, and he will be talking about how AI and ML are being introduced into the data center. For example, if you are a Java developer, think about how much time you spend tuning the JVM. Twitter is using optimization techniques to tune their environment. And Tal Weiss is talking about how you can extend your instrumentation into serverless so you can understand what is happening. And Rossen Stoyanchev, a committer to Spring, is doing a talk on choosing between servlet and reactive, what decisions you should make and what the trade-offs are. And the last talk before the open space will be Bernard Traversat, the head of the Java platform development team, a VP at Oracle; he will talk about getting to 9. He will talk about tips and tricks for migration, easing the path to get there, and he will talk about the six-month cadence and what that is going to look like. So that's Bernard Traversat, and we will wrap up with an open space. (Applause). >> Howdy. I'm Tyler Akidau, and I'm hosting the stream processing in the modern age track. This track is about getting realtime insights from your data, building streaming applications that can power, you know, platforms and also power business insights. So we have an awesome lineup. Matt Zimmer from Netflix will talk about custom windowing with Apache Flink and solving problems with it. We will have Serhat Yilmaz, talking about stream processing at Facebook, and building streaming applications quickly and with minimal DevOps pain. And then we have Vasia Kalavri; she is a PMC member of Apache Flink, and she is talking about predicting and simulating data center behavior in realtime. Next, we have Stephan Ewen, and he will talk about Apache Flink applications. And next, I will talk about streaming SQL, and how to think about SQL when you are dealing with streams. And lastly, we have a panel with a fantastic set of folks on it. We have Julian Hyde, the original developer of Calcite and a cofounder of SQLstream, and now an architect. We have Jay Kreps, we have Michael Armbrust, the author of Spark SQL, and then we have Stephan and myself as well. So please come and join us and learn about realtime data processing. (Applause). >> Hi, good morning, I'm Jerome Petazzoni, I work at Docker and I'm hosting the container track.
Two years ago, I could have done this Steve Ballmer style: containers, containers, containers. Today, if I have to show that this track is worth your time, I will talk a little bit more. We will delve into the world of microservices and containers. Maybe you have seen that tweet: since we migrated to microservices, every outage is like solving a murder mystery. To help with that problem, we need new tools and debuggers, and Idit Levine is going to talk about that. And then, who contains the containers? To talk about that, Kris Nova will present Kubicorn, a tooling framework for clusters and the complications that come with them. And then we are going to talk about going from A to Z: how can we have clusters that span multiple architectures, from IBM mainframes to Raspberry Pis? The point of that is tackling problems like IoT, edge computing, and things like that. Bret Fisher is going to talk about how to do container stuff in production; this will be a treasure trove of tips, tricks, and things that were learned the hard way. It does not have to be the hard way, so you can just come to the talk and hear the tips. And then Tim Tiler will talk about digital transformation when you want to embrace containers: what does it look like for a company to go on the container journey? And we will have an open space to talk about all things related to containers. Thank you. (Applause). >> Hello, wow, I have to try to erase that photo of me off the internet. I'm Marty Weiner, I was CTO of Reddit and was with Pinterest, and I miss hardware! I wonder if there is a CPU beneath us, and I formed this track to bring us closer to the metal, to talk about the latest in hardware advances that affect us all. Jeff Dean will talk about how to bring your hardware to production, and not just how to do it, but why you should. I'm excited about that talk; I have prototypes sitting at home and I have not done that. We have William Roscoe and Adam Conway, who are talking about donkey cars; they are going to build one and run it in front of you. These are $200 RC cars that are self-driving, a neural net on top of a Raspberry Pi. And I'm of the mindset that Raspberry Pis are becoming a ubiquitous platform, so ubiquitous that we can put them on top of everything that we do. Next, we have Michael Bunnell and David Redkey from MZ; they work to get every last drop out of the graphics hardware. They make games, and even if you don't make games, if you make any mobile app, you understand how important it is to get every last drop out, especially the last one percent. If you make mobile apps, attend that talk. We have Magnus from Google, talking about pushing the boundaries of ML. He is solving problems that were not tractable a few years ago, and he is talking about the latest in TensorFlow and TensorFlow hardware acceleration with TPUs. Super exciting. And Daren Cruz is talking about something that I just learned about, the Movidius vision processing unit; these are too cool, like a neural net on a chip at the edge. As the data comes in from the camera or other sensors, on this new piece of hardware, you can process it with a neural net and react to it immediately. And unless he gives me a free sample, I'm going on Mouser tonight. Amitabha is going to talk about this: think about what happens if your machine goes down and, when it comes back up, your RAM still has state in it. This will touch everything from data and databases to logic. This is a very important talk.
I'm super excited and want to get started now. Thank you. (Applause). >> Those are the talks for the day. The track hosts, one more round of applause, these guys did an amazing job. Thank you. (Applause). And I want to talk real quick about the AMAs that are going on today; I mentioned we will not have one after the keynote today. There was a conflict that came up. We will have three other ones, though. Randy talked about Istio; one of the main creators is Louis Ryan, and he was involved with gRPC, so that will be a gRPC AMA. And I asked Rossen to bring some others to the AMA, and all three of these people will be there to talk about Spring, so check that one out. And then Idit and Tal will do a debugging AMA; if you have questions on diving deep into debugging, from containers to serverless, this is a great way to get those questions answered. Those are the AMAs for the day. And day two: I did it yesterday, and I will do it today. 32 talks, practical and practitioner-based, three AMAs on different areas of interest within the communities, two open spaces. And how were the open spaces yesterday? What did you think? Did they go well? (Applause). >> Well, they are going to go better today! All right, three panels where you can ask questions of experts, like the data engineering panel that Tyler talked about; that is an amazing panel. We have talked enough. I will introduce the keynote. So this morning's keynote is someone who has worked at Apple, at Google, he has worked at Twitter and, most recently, he came from Slack. I'm talking about Leslie Miley, and Leslie is going to do a talk today on AI, ML, and the inherent bias within the data sets. He is going to touch on the things in the news, social networks and the fake accounts, and how underrepresented groups can be hurt by AI and ML, and what we can do about it. Please join me in welcoming Leslie to the stage. (Applause). >> Good morning, everyone! This is a big crowd. I'm really surprised we had this many engineers up this early, including myself. (Laughter). So, before I get into this, a couple of things I have to do. I have to do a selfie; this is the age of social media. So I'm going to just do a selfie with all of you; if you don't want to be in this picture, you can leave now. (Laughter). Anybody leaving? Seriously, I'm doing a selfie. Don't worry, I will only get the top of my head. Look, I got photo-bombed. This has been an unprecedented year for a lot of reasons, and it has been unprecedented because we have seen so much of social media being powered by bias, powered by ML, powered by angst, powered by fear. It has been one of the most tumultuous years that I can remember, politically and socially. And I started to think about the part that we all played in that. So I'm going to go through a little bit of it now. Facebook, in 2016, said fake news wasn't a problem. Right after the election, they said fake news was not a problem on their platform. By October of 2017, they were like, eh, 10 million people saw ads that were fake or Russian-linked, and by November, that number went to 126 million. And by the end, I suspect that they will say everybody saw them. And I think that would probably be accurate. We have all seen information that's been fake, or false, or propaganda. And, you know, we have gotten so used to looking at ads, most of us in this room, that we really don't even notice them.
But your friends, your parents, people who are outside of tech have probably sent them to you and asked what is going on, and you don't think anything of it. You are just like, yeah, people send this information to me all the time. So when Twitter went to Capitol Hill, they were like, yeah, we have 6,000 Russian-linked bots on Twitter that generated 131,000 tweets between September and November 2016, a lot of tweets, with 288 million views. They have 68 million users in the United States, so every user saw that. So how does that happen? How does something that you and I can see, that we know is propaganda and is fake and false, how did that get by and get pushed to tens of millions, hundreds of millions of people? I will give you a little bit of my history in this. At Twitter, I ran the abuse, safety, security, and accounts teams. During an investigative session, we discovered hundreds of millions of accounts that were created in Ukraine and Russia. Hundreds of millions. This was in 2015, and I thought, I don't know what these are for, they are probably not good, I don't know why they are here, we should get rid of them. But I left Twitter and I don't know what happened. I would like to think they got rid of them. (Laughter). I suspect that they didn't get rid of them all, because we see what has happened. And once again, I'm like, what happens if you have hundreds of thousands, millions, tens of millions, hundreds of millions of these accounts on Facebook, on Twitter? Twitter, excuse me, Facebook just came out and said 200 million of their accounts may be false or fake or compromised. Yes, that's only 10 percent of their active users, I know that doesn't sound like a lot, but that is 200 million. There is a problem that we're just, like, not addressing, and we're going to dig a little bit deeper. I think that's the tip of the iceberg. In 2016, Twitter came out with their algorithmic timeline. Facebook has been doing this, and Instagram is now doing it. But Twitter said they wanted it to ensure that you saw more tweets from the people you interact with. And they also said, and this is a quote from Twitter, to ensure the most popular tweets are far more widely seen than they used to be, enabling them to go viral on an unprecedented scale. I say mission accomplished; they did a magnificent job of creating a timeline. Facebook has done a magnificent job of creating a timeline, showing you, your friends, and your family the most popular posts. But there's a problem with that. They are media publishers, whether they want to believe it or not. More people see information on Facebook than they see on the New York Times, CNN, MSNBC, and Fox combined. They are a publisher, and a publisher with no accountability, none. They publish it and they say, we're just the platform. The system didn't deliver news, the system delivered propaganda. The system didn't deliver your cat videos, they delivered biased information. They told people to go out and protest against Black Lives Matter, they told Black Lives Matter people to go and protest against something, they told somebody to go and shoot up a pizza parlor in the middle of the country because the DNC was supposedly running a pedophile ring; somebody did that because of fake information they got from social media. This concerns me, and I ask this question: what if there were hundreds of millions of accounts sharing compromised information? What if they were tweeting it, sharing it on Facebook, what if it was on Instagram?
What do you think these systems, like Facebook's or Twitter's algorithmic timelines, would do with all of this? They would take it in and say, this is being shared a lot, I will share it with more people who like this type of stuff, who like this content. And yes, it is mostly not the people in this room; it is probably just your friends and family who send you these things: I saw this, is this true? I get this all the time from my family. Is this true? I don't know why you think this headline is even remotely true, but it looks true. And between Twitter's 100-and-something million accounts and Facebook's, and this number is in dispute, potentially over 700 million accounts on their platform, you have a billion accounts that could be sending false signals into these systems. Signals that take advantage of their algorithms, take advantage of our bias, and get us to think different things, to vote different ways, to talk to people in different ways. And Facebook did a great study, if you want to call it great, in 2014, where they started introducing different types of information into people's timelines to see if it affected their moods. It did; people would post different things. People would read different things. It would actually change what they were doing. And my thesis is that, once they found this out, they published it, it went out there. My question was, did they do anything to stop anyone else from doing that? I think we know that answer today. It is a frightening world when you can reach hundreds of millions of people with data and information that is wrong, information that is propaganda, and influence their moods. And the funny thing is, they didn't see it coming. Twitter didn't see it coming, Facebook didn't see it coming. And they actually stood up and said it wasn't a problem, until they started looking into this. Does this start to sound vaguely familiar to anyone? I mean, would all of this information be part of the training data that determines what you see in your timeline? It is. Every day. It is part of the training data. And it concerns me because it was hailed as the Next Big Thing, bringing relevant content and targeted, relevant ad serving. These systems were deployed en masse and at scale, and worked with little human input or insight, showing people what they wanted to see, whether it was true or not. And that is not a world I really like living in, personally. It is a very scary world. And so, Facebook, hey, I have to give them credit, Facebook said, we're going to hire 20,000 people to tackle fake news. 20,000 people to tackle fake news. One, is there that much fake news, and two, do you need 20,000 people? Twitter is reviewing how they serve ads to you, but they never said they are going to change anything; they are just going to throw people at this problem. And as I was preparing for this, I thought, why is this resonating with me in a way I could not figure out? I had to do a lot of reading and thinking, and I said, this is shades of the mortgage crisis. This is shades of people taking a bunch of information in, chopping it into little pieces, feeding it out to a hungry public, and not really understanding, after a long enough time, how the system even works anymore, why it works that way, what is even in the system, and how it is generating its outputs. Banks were trying to understand the risks they had after the 2008 crisis. They had to hire people to actually look at every mortgage to understand their exposure.
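Stepping back to the amplification loop just described: a minimal, hypothetical sketch can make it concrete. This is not any platform's actual ranking code; it only illustrates how a ranker that scores posts purely on raw interaction counts will promote whatever gets the most interactions, whether they come from real people or from coordinated fake accounts. All names and numbers below are made up.

```python
# Hypothetical engagement-weighted feed ranker (illustration only).
# The ranker sees only interaction counts, so likes and reshares from bot
# accounts are indistinguishable from genuine popularity.

from dataclasses import dataclass

@dataclass
class Post:
    post_id: str
    likes: int       # may include likes from fake accounts; the ranker can't tell
    reshares: int    # weighted higher because reshares spread content further
    author_followed_by_viewer: bool

def engagement_score(post: Post) -> float:
    """Score a post on interaction volume plus a small affinity bonus."""
    score = post.likes + 3.0 * post.reshares
    if post.author_followed_by_viewer:
        score *= 1.2
    return score

def rank_timeline(posts: list[Post], limit: int = 10) -> list[Post]:
    """Return the most 'engaging' posts first -- true or not."""
    return sorted(posts, key=engagement_score, reverse=True)[:limit]

# Fabricated example: one organic post vs. one boosted by thousands of fake accounts.
organic = Post("cat-video", likes=800, reshares=40, author_followed_by_viewer=True)
boosted = Post("fake-news", likes=5000, reshares=1200, author_followed_by_viewer=False)
print([p.post_id for p in rank_timeline([organic, boosted])])  # ['fake-news', 'cat-video']
```

Real ranking systems are far more complex, but the failure mode is the same: if the dominant signal is engagement, a million coordinated fake interactions look exactly like genuine interest.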
They had to look at every piece of mortgage data that they had chopped up and thrown into a CDO, and a lot of the banks just threw their hands up and said, we are going to write off some number, let the market come back, and not worry about it. Which, lucky for them, is what happened. It was interesting because of what Warren Buffett said about CDOs, and I'm sure everyone in here knows what a CDO is: it is collateralized debt something-or-other. I don't even know what it means. (Laughter). It is a mortgage that is chopped up, bundled, and sold as a security rated AAA, but they weren't, because nobody knew what was in them, nobody knew how they worked or operated, and when it came crashing down, everybody was holding the bag but nobody knew what was in the bag. So, why it concerns me is that the Next Big Thing will be an AI and ML company. It may be Google getting bigger, Facebook getting bigger, it may be something that we don't know. It may be something that one of you in here is going to end up creating. And I wonder if we are just going to repeat the mistakes of the past. I don't know. I hope not. So, I have explained some of my concern. I have explained why this is a problem. If anybody wants to talk to me after, catch me at the door; I'm going to run out the door so I don't have to defend any of this. (Laughter). The reason that we have to look at this now, more than ever, is that there's a thriving industry growing up around this. These models are being applied pretty much everywhere, everywhere. They are being applied to self-driving cars, they are being applied to ride sharing. Imagine that, you know, Uber or Lyft or some other ride-sharing company determines that in a certain neighborhood, their rides are always under $5. Are they going to send people to pick up there? Probably not. Or they are going to send drivers who are lower-rated. This is happening now. What does that do for impoverished people, what does that do for people who are not advantaged? It is just like redlining in the '40s and '50s, and it is happening because nobody is looking at the data, where the data is coming from; there is no transparency in how the algorithms are put together. I will get a little real. This is happening in sentencing guidelines. ProPublica did a great article on this: they decomposed the algorithm behind sentencing-guideline software whose makers said it was going to remove the bias so people would be treated fairly. Guess what. African-Americans were 20 percent more likely to get a harsher sentence. In some cases, they were 45 percent more likely to get a harsher sentence, with the exact same parameters, because the data set used to train the model was inherently biased. They did not recognize it or remove it, and now it is built and deployed in 25 states, and it is sending people I know, my family members, to jail longer and giving them harsher sentences. This is real, this is happening, and this scares me. Because, at some point, it starts impacting us more than a self-driving car, or more than an election that we may not agree with. It is going to start making life-and-death decisions for you, it is going to start making decisions about your healthcare, it is going to start making decisions about your mortgage rates, it is going to make decisions that you do not understand, and that the people who are deploying them do not understand. And, as usual, we, the public, will be left holding the bag.
Because, after the mortgage crisis, no one went to jail. No one was held accountable, and we got the tax bill that we will continue to pay and your children will continue to pay. Really uplifting, isn't it? (Laughter). So what can we do now? I mean, we can choose not to talk about it. We can put our heads in the sand, you know, or we can start to have a discussion around where the data comes from. We can start having a discussion of whether the data is oversampled or undersampled. We can start bringing in other people to look at the data. One of my favorites is that we can be transparent about what it is that we are collecting, what it is we are using, and how we are using it. This is not a trade secret, it is data. Anybody can get it, and how you use it should not be a trade secret, particularly because it involves people, because it involves places, and it involves things in the public domain already. And that is kind of where I want to go next. Actionable steps: seek people outside of your circle. You have to talk to other people. I want to call out QCon for really taking a large step towards making this a diverse and inclusive conference. It is amazing, I'm seeing people in here -- yeah, give them a hand. They have done a great job. (Applause). And the fact that I'm up on this stage means they did a great job. Thank you. (Laughter). But, when you are creating these systems and deploying them, find people outside your circle. I know a few people who are doing people detection; they are making sure that they can identify people, and identify the right people and the wrong people, and I ask, who is your data going out to? They showed me: these are all very wealthy tech people. These aren't people of color, these are not people from different backgrounds, or, particularly in California, different body shapes and sizes; they are healthy people that you are doing people detection on. You have to widen your data set; you cannot rely on that and roll it out, because it is going to have problems. I talked about radical transparency; it has to be radical, you have to put it out for people to see and it has to be peer-reviewed. If not, you are going to continue to build your bias into your data sets. My thing: hire more women in engineering. Do it! (Applause). Every engineering team I have worked with that had more women in it has been a better engineering team. I just get better output, it is a fact. Get over it. Do it. If you still think it is an issue, come find me. I will stay around for that. (Laughter). If you want to talk to me about that, I have some friends I want you to talk to. And another thing: work on your empathy and your self-awareness. We can change the data, we can make it transparent, we can bring people in. But if we don't improve who and what we are, if we don't develop more empathy and self-awareness, we are just going to revert back to the mean. And that is something that I have seen every time. You know, just recently we had, what's the guy's name, Justin Caldbeck trying to repair his image; he was one of those VCs. First of all, do the work, and second of all, work on empathy and self-awareness, which means you should not show up. You have tainted such a large pool that you have no business going out there. And this is what we like to do, though. We think that we can just go and programmatically solve a problem. But sometimes the problem isn't out there, sometimes the problem is in here. Sometimes the problem is with us.
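As a concrete companion to those actionable steps, here is a small, hypothetical sketch of the kind of checks being asked for: whether any group is under-represented in a training set relative to the population the system will actually be used on, and whether a model's adverse outcomes fall disproportionately on one group. The field names, groups, and numbers are all invented for illustration; this is not the ProPublica analysis or any real system's data.

```python
# Minimal, illustrative data audit (all names and numbers are hypothetical).
# Check 1: is any group under-represented in the training data?
# Check 2: does the model flag one group as "high risk" far more often than another?

from collections import defaultdict

def representation_gap(sample_counts: dict, population_share: dict) -> dict:
    """Ratio of each group's share of the training data to its real-world share.
    Values well below 1.0 mean the group is undersampled."""
    total = sum(sample_counts.values())
    return {g: (sample_counts.get(g, 0) / total) / share
            for g, share in population_share.items()}

def adverse_rate_by_group(records, group_key="group", outcome_key="high_risk"):
    """Fraction of each group that the model labels 'high risk'."""
    totals, flagged = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[group_key]] += 1
        flagged[r[group_key]] += 1 if r[outcome_key] else 0
    return {g: flagged[g] / totals[g] for g in totals}

# Check 1: made-up training-set counts vs. made-up population shares.
counts = {"group_x": 9_000, "group_y": 800, "group_z": 200}
shares = {"group_x": 0.60, "group_y": 0.25, "group_z": 0.15}
print(representation_gap(counts, shares))
# -> {'group_x': 1.5, 'group_y': 0.32, 'group_z': 0.13}: y and z are badly undersampled.

# Check 2: made-up model outputs; group A is flagged about twice as often as group B.
records = ([{"group": "A", "high_risk": True}] * 45 +
           [{"group": "A", "high_risk": False}] * 55 +
           [{"group": "B", "high_risk": True}] * 23 +
           [{"group": "B", "high_risk": False}] * 77)
print(adverse_rate_by_group(records))  # -> {'A': 0.45, 'B': 0.23}
```

Neither check proves fairness on its own, but gaps like these are exactly the kind of thing the talk argues should be found, published, and fixed before a system ships.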
And I really challenge all of you to, you know -- this is one of my favorite Obama statements, and it is totally non-political: every day, he tries to wring a little bias out. That is a great quote for the systems we are building: wring a little bit of the bias out. You cannot get all of it, or even most of it, but just a little bit every day. It is like refactoring, you know. It is like refactoring: nobody likes to do it, but it is really good, and you really end up in a better place because of it. So refactor your empathy, refactor your self-awareness, constantly and consistently. And there are some sources here. The bias-variance trade-off: check that out, it is a great read. And AlgorithmWatch, and the Algorithmic Justice League, which I think is really good timing. These are sources that you can all go to that will help you understand how to start attacking bias in your data sets. Europe, as always, is ahead of the United States when it comes to protecting people. They have the General Data Protection Regulation; it is amazing, go and read it, you will learn a lot. If you are not following it, I advise you to start doing so now. It might not make it to the United States, but it is the right thing to do. And Federica Pelzel -- I sourced from her for this talk, and she goes in-depth into a lot of what I'm talking about. There are a lot of people talking about this now. Let's not make the same mistakes, let's not build a data and ML weapon of mass destruction, deploy it on unsuspecting people, and stand up a year or two, or five years later, and say, we're just the platform. (Applause). Because nobody is buying that anymore. (Applause). Thank you. And, uh, this is going a little faster than I thought; I thought I would talk slower. But the part that has concerned me is that we have generally worked without a lot of accountability in tech. We have, for years, been able to craft systems and platforms with little oversight. And I think that has been, for the most part, a good thing. Those times are changing, and they are actually coming to a close. I would rather be self-regulated than regulated by the government. I'm sure most of you in here would rather be self-regulated than regulated by the government. But you have to start leading today. Don't wait for these problems that we've seen with Twitter and Facebook and Reddit and the other, you know, platforms that have been co-opted by foreign governments to spread false information. The only way to do that is to start attacking this today, and to start attacking it in your data sets today. You might not be building the next Twitter or Facebook, but you are building the next Something. And I implore all of you to start thinking outward, to start thinking about the impact it has on people that don't look like you, people that don't come from your backgrounds, people that don't come from your schools, people that don't come from your families. People who are less privileged than everyone in this room. It is so important, and I am spending time on that because I've spent 20 years in this industry, and I have watched it become a force of change in the world that today I'm more embarrassed by than proud of.
I'm more embarrassed that we built a system that was co-opted by a foreign government, systems that were co-opted by foreign governments that put information in front of us that was not true, that started or essentially amplified the racial animosity that has been brewing in this country for decades, amplified it in a way that nobody would have anticipated five years ago. I'm positive that Jack and Mark Zuckerberg were not thinking, we're going to build a system that does this. They never thought that. But as the systems grew beyond what they understood of them, and as they brought in people that scaled those systems, they did not ask those questions, and we cannot make that mistake again. (Applause). I always think I should have something pithy to say at the end, but I can't. This is a dark talk. You could say it is a black talk. (Laughter). I want to, one, thank you all for being here and listening to what I have to say. It is an honor to get in front of people; it is something that I look forward to, because I think, when you can speak to your peers and they listen to you, you can move the dial a little bit. So thank you for helping me move the dial. Thank you for showing up early in the morning, and thank you for laughing at my really bad jokes. (Applause). >> So we're going to go ahead and wrap, but Leslie will be up here if you want to ask him direct questions. Thanks, Leslie. >> Thank you. (Applause). >> Thank you for attending today's keynote; the sessions will begin at 10:35.