OpenStack control services are usually deployed together on dedicated high-density hardware, with traditional clustering software providing high availability. Swisscom took a different approach: embracing modern distributed application architecture principles, we untangled the various components into separate and, wherever possible, horizontally scalable entities. This reduced the complexity of managing larger fail-over clusters and eased deployment, maintenance, and lifecycle management. We will introduce our design and the reasoning behind it, dive into some of the problems we encountered, and discuss the roadblocks that still stand in the way of a truly cloud-native deployment of OpenStack management services.
In essence, we will show and discuss an alternative design for managing your OpenStack services, one that we have verified in multiple production deployments. Although the initial goal was a Pacemaker-free deployment, we ran into limits in both the maturity and the cloud-native nature of some of the components. Over the last year, we worked on the components with these drawbacks, and we will show how we improved the architecture's overall resilience by moving more and more towards a fully active/active setup.