Lucene And Solr Document Classification – Alessandro Benedetti
This presentation will start by introducing how Apache Lucene can be used to classify documents using data structures that already exist in your index instead of having to generate and supply external training sets.
Building on the introduction the focus will be on extensions of the Lucene Classification module that come in Lucene 6.0 and the Lucene Classification module’s incorporation in to Solr 6.1. These extensions will allow you to classify at a document level with individual field weighting, numeric field support, lat/lon fields etc.
The Solr ClassificationUpdateProcessor will be explored, such as how it works, and how to use it including basic and advanced features like multi class support and classification context filtering.
The presentation will include practical examples and real world use cases.