Apache Big Data Seville 2016 – The Original Vision of Nutch, 14 Years Later: Building an Open Source Search Engine – Sylvain Zimmer

Few people remember that before spinning off Hadoop and focusing on crawling, Nutch was meant to be an alternative to commercial search engines. What if we tried to do it again today?

In this presentation, Sylvain Zimmer will explain how he used projects from the Nutch diaspora like Spark and Elasticsearch to build Common Search, an open source search engine with transparent rankings.

We will go over the architecture of large-scale search engines and how it has evolved since the late 1990s. Then we will review the tools from the Apache and open source ecosystems that are best suited to solving the many challenges at hand. Finally, we will discuss what lies ahead for Common Search before it can be useful to the general public.
