Conference: Nov 13-15, 2017
Workshops: Nov 16-17, 2017
Presentation: Query Understanding: a Manifesto
Duration
Persona:
- Architect
- Data Scientist
- Developer
Key Takeaways
- Understand the importance of focusing on search queries to determine user intent.
- Gain deeper insights into techniques built around search behavior, such as search suggestions, query performance prediction, and query rewriting.
- Hear a list of quick wins you can walk away with that will increase the understanding and performance of search.
Abstract
Query understanding is about focusing less on the search results and more on the query itself. It's about figuring out what the searcher wants, rather than scoring and ranking results. Once you have established a query understanding mindset, your overall approach to search changes: you focus on query performance rather than ranking. In particular, you pay more attention to query suggestions, especially those generated through autocomplete.
In this talk, I'll show you what search looks like when viewed through a query understanding mindset. I'll focus on query performance prediction, query rewriting, and search suggestions. If you work on search problems, then come to this talk to discover opportunities for quick wins and longer-term investments in your search stack. Even if you don't work on search problems, seize this opportunity to gain a perspective on search that you won't find in an information retrieval textbook.
Interview
Daniel: Since leaving LinkedIn in mid-2015 (after 4.5 years there), I’ve been advising and consulting for a variety of companies. These companies range from early-stage startups to established public companies. My specialty is search and discovery (query understanding in particular), but I generally help them make decisions around algorithms, technology, product strategy, hiring, and organizational structure.
Daniel: Ever since my pioneering work on faceted search at Endeca, I’ve been an evangelist for a more query-centric approach to search. While I see value in developing better ranking algorithms to improve search relevance, I feel that as an industry we’ve overemphasized result ranking and neglected what we can do with the queries themselves. While Endeca focused on query refinement, my later work has emphasized the entire query lifecycle.
My argument is simple: instead of treating search relevance as a result-ranking problem, let’s focus on the query understanding problem. This view is somewhat contrarian in the information retrieval space. My talk is not just an exposition of query understanding techniques; it is also a manifesto to persuade people to change their philosophical approach to search relevance.
We often classify web search queries into navigational, informational, and transactional queries. E-commerce sites distinguish between specific product searches and category searches. Sites like LinkedIn or Facebook have name searches and searches based on people's characteristics.
So the domain certainly matters, but there are common themes. Let’s look at an example, like daniel carnegie mellon, a query we might see on a site like LinkedIn or Facebook.
The first step is to tokenize the query and segment it into entities, i.e., daniel and carnegie mellon. We then want to associate those entities with classes, i.e., First Name: Daniel, School: Carnegie Mellon. This leads to another level of understanding, where you can infer that the searcher is looking for a person whose first name is Daniel and who attended Carnegie Mellon.
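The segmentation and tagging steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LEXICON` dictionary, the `segment_and_tag` function, and the greedy longest-match strategy are hypothetical choices made for this example, not the method used at LinkedIn or presented in the talk.

```python
# Hypothetical lexicon mapping known phrases to entity classes.
# A real system would derive this from data rather than hard-code it.
LEXICON = {
    "daniel": "First Name",
    "carnegie mellon": "School",
}


def segment_and_tag(query, lexicon=LEXICON, max_len=3):
    """Greedy longest-match segmentation into (entity, class) pairs."""
    tokens = query.lower().split()
    entities = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                entities.append((phrase, lexicon[phrase]))
                i += n
                break
        else:
            # No lexicon entry starts at this token; keep it as untyped text.
            entities.append((tokens[i], "Unknown"))
            i += 1
    return entities


print(segment_and_tag("daniel carnegie mellon"))
# [('daniel', 'First Name'), ('carnegie mellon', 'School')]
```

From the tagged pairs, an application could then build a structured request (a person whose first name is Daniel and whose school is Carnegie Mellon) instead of matching the raw keywords.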
The interpretation stack tends to have a lot of commonality across domains. But the particular query understanding challenges can be highly domain-dependent.
For example, how do you identify entities and their associated classes? Are your classes clearly distinct from one another, or do you have to worry about class overlap? Even when you correctly identify the entities in a query, could there still be ambiguity in the overall query intent?
How hard it is to address each of these questions is domain-dependent.
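To make the class-overlap question concrete, here is a small Python sketch that lists every reading of an already-segmented query when an entity may belong to more than one class. The lexicon contents (including the `jordan` entry) and the `interpretations` function are invented for illustration; how often such overlap occurs, and how to resolve it, depends on the domain.

```python
from itertools import product

# Hypothetical lexicon in which a single entity can belong to several classes.
OVERLAPPING_LEXICON = {
    "jordan": ["First Name", "Last Name", "Location"],
    "carnegie mellon": ["School"],
}


def interpretations(entities, lexicon=OVERLAPPING_LEXICON):
    """Return every combination of class assignments for segmented entities."""
    # One list of candidate classes per entity; unknown entities get one option.
    options = [lexicon.get(entity, ["Unknown"]) for entity in entities]
    return [list(zip(entities, combo)) for combo in product(*options)]


for reading in interpretations(["jordan", "carnegie mellon"]):
    print(reading)
```

With this toy lexicon the query has three candidate readings, so even correct entity recognition leaves the overall intent ambiguous until something else (query logs, context, or the searcher) picks among them.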
Daniel: Absolutely. If you think about where search was going before voice and conversational interfaces, it was heading in a direction where, instead of seeing full queries, we were seeing instant search suggestions and even instant search results. As Google said in its "10 things we know to be true", fast is better than slow. In fact, search suggestions do more than save time and reduce effort; they also guide the searcher to better queries.
But how do we apply the instant search suggestions in the context of natural language and voice? Do we give up on them and instead require the users to enter complete sentences before giving them feedback? To me, that feels like a huge step back, all in the name of providing a more natural interface.
At the same time, we know that machines interrupting searchers to complete their sentences is probably not going to work. It's like a Clippy 2.0.
So I worry that we’re making a big sacrifice just because we believe people prefer a voice interface.
I’m also curious how our interactions with a conversational search engine will compare to our interactions with one another. People pay attention to tone of voice. We perceive nuances in everything from how people speak to their accompanying body language (or even the look on someone’s face). We don't have that today with our voice-based applications, and it seems like we are not getting there with video yet, despite (in principle) the ability to do that. There's been some research on analyzing people’s faces to predict searcher frustration, but no practical application of this or related research as far as I know.
In general, I feel that we are in an uncanny valley as far as the way machines interact with us. It’s hard to know at what point we will overcome it. To pick a different domain, I think Pixar's movies have made animation truly on par with live action. We are not yet there with machines trying to interact with us like people.
Daniel: Intermediate. It will be most useful to people -- particularly engineers, data scientists, and product managers -- who know something about search. But I’ll try to make the talk sufficiently self-contained to be useful to any technical generalist with an interest in search.
Daniel: I believe that anyone who is responsible for a search engine or search-based application will walk away with a list of quick wins to improve relevance through query rewriting, a better scoring function for search suggestions, etc. And I hope that I’ll have influenced their longer-term strategy for improving search relevance, thus enabling them to better prioritize their roadmap.