How Software Engineering Has Changed with Advent of OSS – Nupur Sharma
The talk will explore the business of open source and how open source has changed the way software engineering is done. Earlier, software was typically scoped as a ballpark project and designed with commercial, non-extensible products in mind. With the new open source paradigm, companies are now driving software development with open source products at the core and leveraging the extensibility of the product itself. In the talk, Nupur will walk through the thought process of product designers in the 1990s, the 2000s and today. Nupur will explain how organisations are adopting Open Source Software and building their entire business models around it. Working through some use cases, the talk will discuss the transition from closed source to open source in many existing, well-thought-out processes. It should be useful to any organisation exploring a move to the OSS paradigm.
Deep Neural Network Regression at Scale in Spark MLlib – Jeremy Nixon
Jeremy Nixon will focus on the engineering and applications of a new algorithm in MLlib. The presentation will cover the methods the algorithm uses to automatically generate features that capture nonlinear structure in data, as well as the process by which it’s trained. Major aspects of that are the compositional transformations over the data, the advantages of the various activation functions, the final linear layer, the cost function and training via backpropagation. The applications part will look at how to use neural network regression to model data in computer vision, finance, and the environment. Details around optimal preprocessing, the type of structure that can be found, and managing the model's ability to generalize will inform developers looking to apply nonlinear modeling tools to the problems they face.
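To make the pieces named in the abstract concrete, here is a minimal sketch of neural network regression in plain Python. It is not the MLlib algorithm from the talk and does not use Spark; the function names (`train_mlp_regressor`, `mse`), the single hidden layer, the tanh activation, the squared-error cost and all hyperparameters are assumptions chosen for illustration. It shows a nonlinear transformation of the input, a final linear layer, and training by backpropagation with per-sample gradient descent.

```python
import math
import random


def train_mlp_regressor(xs, ys, hidden=8, lr=0.05, epochs=2000, seed=0):
    """Fit y ~ w2 . tanh(w1 * x + b1) + b2 on scalar inputs (illustrative only)."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                                   # hidden biases
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                              # output bias

    for _ in range(epochs):
        for x, y in zip(xs, ys):
            # Forward pass: nonlinear hidden features, then a linear output layer.
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            pred = sum(w2[j] * h[j] for j in range(hidden)) + b2

            # Cost is 0.5 * (pred - y)^2, so its derivative w.r.t. pred is err.
            err = pred - y

            # Backward pass: chain rule through the linear layer and tanh.
            for j in range(hidden):
                grad_pre = err * w2[j] * (1.0 - h[j] ** 2)  # uses w2 before its update
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_pre * x
                b1[j] -= lr * grad_pre
            b2 -= lr * err

    def predict(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2

    return predict


def mse(predict, xs, ys):
    """Mean squared error of a prediction function over a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy nonlinear target: y = x^2 on [-1, 1].
xs = [i / 10.0 for i in range(-10, 11)]
ys = [x * x for x in xs]

model = train_mlp_regressor(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = mse(lambda x: mean_y, xs, ys)  # constant predictor for comparison
trained_mse = mse(model, xs, ys)
print("baseline MSE:", baseline_mse)
print("trained MSE:", trained_mse)
```

The constant-mean baseline is included only as a sanity check: a linear model cannot fit the symmetric target `x * x` any better than a constant can, so an improvement over that baseline must come from the hidden layer's nonlinear features.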
Geospatial Track: Geospatial Big Data: Software Architectures and the Role of APIs in Standardized Environments – Ingo Simonis, Open Geospatial Consortium (OGC)
A number of technologies have evolved around big data, in particular products from the Apache community such as Hadoop, Storm, Spark, Hive, or Cassandra. The geospatial community has developed a range of standards to handle geospatial data efficiently. Most of these standards are produced by the Open Geospatial Consortium (OGC) and implemented in the form of domain-agnostic data models and Web services. With the emerging demand for streamlined APIs, new questions arise: how can access to Big Data in the geospatial community be handled most efficiently, and how do existing standards serve these new demands and the implementation realities of distributed Big Data repositories operated, for example, by the various space agencies? This presentation is meant to stimulate discussion of geospatial Big Data handling in standardized environments and to explore the role of products from the Apache community.
This morning in Seville, we had two great keynotes. Stephan Ewen talked about stream processing in Apache Flink, and Alan Gates talked about the Apache Way training program he’s instituted at Hortonworks.