Multi-Tenant Machine Learning with Apache Aurora and Apache Mesos – Stephan Erb
Data scientists care about statistics and fast iteration cycles for their experiments. They should not have to worry about technicalities such as hardware failures, tenant isolation, or low cluster utilization. To shield its data scientists from these matters, Blue Yonder uses Apache Aurora.
When adopting Aurora, our goal was to run multiple machine learning projects on the same physical cluster. This talk describes that adoption process in detail and highlights the key engineering decisions we made. It focuses in particular on the multi-tenancy and oversubscription features of Apache Aurora and of Apache Mesos, the resource manager underlying Aurora.
Audience members will learn the fundamentals of both Apache projects and how they can be combined into a capable machine learning platform.