Jason Hunter, Principal Technologist with MarkLogic
Jason Hunter is Principal Technologist with MarkLogic, and the father of
MarkMail.org. He's the author of "Java Servlet Programming" (O'Reilly Media) and
the creator of the JDOM open source project for Java-optimized XML manipulation.
He's also an Apache Software Foundation Member and former Vice-President, and as
Apache's representative to the Java Community Process Executive Committee he
established a landmark agreement for open source Java.
He's an original
contributor to Apache Tomcat, a member of the expert groups responsible for
Servlet, JSP, JAXP, and XQJ API development, and was recently appointed a Java
Champion. He's also a frequent speaker. His largest audience was 15,000 at a
JavaOne conference keynote.
Presentation: "Unifying the Search Engine and NoSQL DBMS with a Universal Index"
Time: Wednesday 12:05 - 13:05
Location: Concordia Room
Abstract:
In contrast to single-function architectures, MarkLogic Server takes an
unusual approach: it collapses the usual hierarchy of server types that
make up a complete application, combining search, a NoSQL DBMS, and an
application server in a single kernel. The computational foundation for
this hybrid is the Universal Index.
In this talk, we'll begin with the familiar text indexing data structures
and algorithms that underlie search engine technologies. We'll extend that
index to cover document structure and semantics, add scalar range indexing
in one and two dimensions (including geospatial applications), and then
incorporate "reverse" indexing of queries. We will demonstrate a novel type
of "matchmaking" query whose evaluation is based on a composition of
forward and reverse index evaluation. Finally, we'll explore the means by
which all of this indexing may efficiently run concurrently with querying,
using Multi-Version Concurrency Control and Log-Structured Merge Trees,
providing ACID transactions together with lock-free query evaluation,
built-in sharding, terabyte-per-server scale-out, replication, and query
distribution.