Two desirable features of private clouds are better control and predictable performance. Although the unpredictable performance of public clouds has been studied extensively, private clouds have received less scrutiny.
In this talk, we will present how production workloads interfere with each other in an OpenStack-based cloud. We draw lessons from a several-month-long study of running workloads in different configurations on a highly available implementation of OpenStack. We study the impact of noisy neighbors on the network and storage I/O performance of applications. We also look at the performance metrics of the OpenStack control plane and how API calls are affected as the number of entities such as networks, routers, VMs, and volumes grows. Our study relies on a tool that we developed to create clean and noisy workload deployments, using micro-benchmarks as well as enterprise workloads such as Hadoop, Jenkins, and Redis.
Attendees will learn about the following three topics:
1) A novel tool that we developed to generate workloads and collect their metrics in an automated manner. The tool can place workloads across a set of hosts to evaluate different networking configurations, such as VMs on the same or different hosts and on the same or different subnets. Using the tool, we also created setups with zero or more routers for VM traffic and with different storage backends, both local and shared.
2) A performance evaluation of configurations that vary network topology, CPU/memory over-commitment, and storage topology. The data was collected over several weeks and consists of thousands of data points across the different configurations.
3) Based on this study, we show that there is considerable interference in a general OpenStack-based cloud. We performed a root cause analysis of these problems and suggest optimizations and best practices to address them.
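
To make the configuration space described in the topics above concrete, here is a minimal sketch of how such a test matrix could be enumerated. It is not the tool presented in the talk: the `Deployment` class, the `enumerate_deployments` function, and the `max_routers` parameter are hypothetical names chosen for illustration. The sketch also makes a simplifying assumption that VMs on the same subnet exchange traffic without a router, while VMs on different subnets need at least one.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Deployment:
    """One hypothetical test configuration for a pair of workload VMs."""
    same_host: bool      # both VMs placed on the same compute host
    same_subnet: bool    # both VMs attached to the same subnet
    routers: int         # number of routers the VM traffic traverses
    storage: str         # "local" or "shared" storage backend
    noisy: bool          # True if noisy-neighbor workloads run alongside


def enumerate_deployments(max_routers=2):
    """Return every combination of placement, network, and storage options.

    Assumption (not from the talk): same-subnet traffic uses no router,
    and cross-subnet traffic uses between 1 and max_routers routers.
    """
    deployments = []
    for same_host, same_subnet, routers, storage, noisy in product(
        (True, False),            # host placement
        (True, False),            # subnet placement
        range(max_routers + 1),   # 0..max_routers routers
        ("local", "shared"),      # storage backend
        (False, True),            # clean vs. noisy run
    ):
        if same_subnet and routers > 0:
            continue  # skip: no router hop within a single subnet
        if not same_subnet and routers == 0:
            continue  # skip: different subnets need a router to communicate
        deployments.append(
            Deployment(same_host, same_subnet, routers, storage, noisy)
        )
    return deployments


if __name__ == "__main__":
    for deployment in enumerate_deployments():
        print(deployment)
```

Each clean configuration has a noisy counterpart that differs only in the `noisy` flag, so comparing the two isolates the effect of neighboring workloads on a given topology.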