OpenStack lends itself well to big data problems. With Swift and Ceph, data storage is easier than ever. One of the most consequential problems in the big data space is using AI to make sense of ever-increasing data volumes. OpenStack makes this a solvable problem: data stored in Swift can be accessed by a Sahara cluster, which can use GPU instances to accelerate parallel AI hyperparameter tuning. This lets users spin up and tear down huge AI training farms with a fraction of the manual effort, and in the end, isn't that what the cloud is all about?
Attendees will get an overview of:
- The emerging architecture pattern of using Spark to parallelize AI training
- Using OpenStack to build a Spark cluster to perform parallel AI training
- Using Sahara to access data from Swift to perform the training
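
As a rough illustration of the parallel hyperparameter tuning pattern described above, the sketch below fans a small grid of settings out to concurrent workers and keeps the best result. It is a minimal stand-in, not the setup from the talk: it uses only the Python standard library, with a thread pool playing the role of Spark executors, and the toy dataset, the `train_and_score` and `tune` functions, and the grid values are all hypothetical. On a real Sahara-provisioned Spark cluster, the same shape would typically be expressed by distributing the grid with `sc.parallelize(grid).map(train_and_score).collect()` and reading the training data from Swift rather than from memory.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Toy dataset (hypothetical): points on y = 3x + 1, split into train/validation.
# In the scenario from the abstract, this data would live in Swift instead.
TRAIN = [(x / 10.0, 3.0 * (x / 10.0) + 1.0) for x in range(0, 20)]
VALID = [(x / 10.0, 3.0 * (x / 10.0) + 1.0) for x in range(20, 30)]


def train_and_score(params):
    """Fit a 1-D linear model by gradient descent; return (validation_loss, params)."""
    learning_rate, epochs = params
    w, b = 0.0, 0.0
    n = len(TRAIN)
    for _ in range(epochs):
        # Gradients of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in TRAIN) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in TRAIN) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    val_loss = sum((w * x + b - y) ** 2 for x, y in VALID) / len(VALID)
    return val_loss, params


def tune(grid, workers=4):
    """Evaluate every hyperparameter combination concurrently; return the best."""
    # Each grid entry is an independent training run, so the runs can be
    # executed in parallel. Threads stand in for Spark executors here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(train_and_score, grid))
    # Tuples compare by their first element, so min() picks the lowest loss.
    return min(results)


# Hypothetical search space: learning rate x number of epochs.
GRID = list(product([0.001, 0.01, 0.1], [50, 200]))
best_loss, best_params = tune(GRID)
print("best validation loss:", best_loss, "with (learning_rate, epochs) =", best_params)
```

Because each combination trains independently, the work scales out by adding workers, which is what makes it a natural fit for an elastically provisioned cluster.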