The Resource Description Framework (RDF) is a standard model for storing graph data. While the standard was initially created for storing meta-data about the World Wide Web, its flexible format made it a popular choice for storing many different types of information. With the explosive increase in the size of available data, scalable solutions are needed to efficiently store and query very large RDF graphs within big data architectures. Apache Rya (incubating) is a scalable database management system designed for storing and searching very large RDF data. Rya is built on top of Apache Accumulo and also supports a MongoDB back end. nIn this talk, we introduce storage methods, primary and secondary indexing schemes, statistics based query optimization, as well as query evaluation techniques that allow Rya to scale to billions of triples across multiple nodes, while providing fast and easy access to the data through conventional query mechanisms such as SPARQL.
SPARQL at scale with Apache Rya Adina Crainiceanu
September 12, 2019