Conference: Nov 13-15, 2017
Workshops: Nov 16-17, 2017
Presentation: Iterative Design for Data Science Projects
Duration
Persona:
- Data Scientist
Key Takeaways
- Hear how Datascope uses human-centered design and an iterative design process to produce actionable insights.
- Understand how Datascope built an Expert Finder application for a Fortune 50 company using design methodologies for both their user interfaces and algorithms.
- Learn how the iterative design process puts business problems first and pivots as needs change.
Abstract
When solving problems, data scientists often start from the data, run analyses, and then, almost as an afterthought, think about presenting results to stakeholders. This rigid, linear approach often fails to produce useful results.
At Datascope, we adopt methodologies from the design community, iteratively improving our work to ensure that our deliverable is useful to our clients. In building an Expert Finder application for a Fortune 50 company, we adopted such methodologies not only for the user interfaces, but also for our “expert finding” algorithms and data sources.
During this talk, I will go over how we iterated on the major pieces of this project to produce actionable insights and recommendations.
Interview
Bo: I am a partner and data scientist at Datascope, a data science consulting company out of Chicago. Basically, if a company has some amount of data and they aren’t quite sure what to do with it, or they do have an idea what to do with it but need more bandwidth, we are a data science team for hire. As both a data scientist and partner, I do actual data science work and am also involved in business development, outreach, speaking at conferences, etc.
Bo: We try to be as truly agnostic as possible. We really believe that the business problem should drive what we end up using. With that said, we have a very strong preference for open source. We don’t believe that if a company comes to us wanting some analysis done, we should sell them proprietary software to implement our model. Pretty much everything to date is open source.
We use a lot of Python on the backend. On the frontend, we use the usual HTML and JavaScript. For interactive visualizations, we use open source JavaScript libraries. In terms of cloud storage and infrastructure, it's pretty client-dependent. Some clients prefer AWS. Rackspace was really popular, although it is less popular now with our clients. We are also seeing an uptick in Google Cloud services. It really depends on the client and what makes sense for them.
Bo: We recently completed a project for a large company. At the time, it was one of our larger projects and it was a big learning experience for us. One feature that separates us from the competition is our iterative design process: we start with the business problem and then see what is possible with data (from the client or from outside the client).
We use this process to make sure that, every step of the way, we are getting it right, as opposed to starting from a long PRD, which makes it more difficult to pivot. What we see a lot of in-house data science teams do is start with the data and then try to dig up stuff. That is often a rather undirected and somewhat less efficient approach.
With this project, the customer came to us with a very hazy idea of what they wanted to do. So we started with the business problem first, and we brainstormed. We came up with several mock prototypes first. Then, right off the bat, we had a bunch of great ideas (at least, we thought we did at the time) for data science applications with their data. But the client eliminated several of them right away. If we hadn’t used the iterative design process, we would probably have tried to build out actual code and taken weeks to do something.
In using this design process, we were able to zero in on what made sense from the beginning, so we lost as little time as possible. As we kept doing this iterative design process, essentially asking for feedback on a regular basis and constantly communicating rather than deploying every couple of weeks, it became clear to us halfway through the project that we needed to pivot a bit. What we had both originally thought would be the perfect solution turned out to be missing something. The whole idea is to use this iterative design process rather than take a more traditional linear approach. It was a bumpy road. But, at the end of the day, we were able to deliver something that was useful to them and drove a lot of business value.
I would say that it is. I don’t think there is a better, more established term in the software community because a lot of people are very familiar with agile project management. I know that there are very particular parts of agile that are not in here. For example, we don’t have a proper ScrumMaster. What we call it instead is the design process. At the end of the day, it is a very similar concept. It is designed for data science, but it is not very different from agile.
Bo: I will answer this in two parts. First, design is a process, and this is the process that we implement. This is precisely the idea. Second, with this particular project and for almost all of our bigger-scope projects, a key component is the end-user dashboard or some interactive visualization.
So when I say that we try to deliver and ask them for feedback from the very beginning, the first prototypes that we gave them were sketches of some type of interface. Sketches of a network diagram, sketches of some interactive bar charts, sketches of custom internal searches. Totally low fidelity; pencil and paper. It took us five minutes.
In every project, and in every data science project, I think it’s really important to first think about what the business problem is that you are trying to solve. Secondly, once you have answers or deliverables (or whatever the end product is) for your data science project, how are they going to be digested and used? In this project, for example, the results were going to be digested and used by people who had a technical background but were not data scientists. It was necessary to develop some type of end web app, and that requires some user interface design.
Bo: For those in a more senior role (where it’s important for them to help manage their team in terms of figuring out what to work on at any given point), one key takeaway will be to always keep in mind what the goal is and to know that the client might change it.
Just because you have a clear perspective and plan of what to do today, that doesn’t mean that next week your team won’t find something (or your client won’t have another idea). The whole idea is to have a big-picture plan but be willing to account for the very real possibility that your project might pivot over time.