Kubernetes is rapidly becoming the standard orchestration tool for declaratively managing open infrastructure. Over the last two years, we have run bare-metal Kubernetes clusters in production that host challenging containerized workloads, including OpenStack itself. We have upgraded both these workloads and the underlying Kubernetes infrastructure while maintaining the mission-critical environments that power our 5G infrastructure. In this talk, we will revisit lessons learned in dealing with various challenges along the way: upgrading Kubernetes and the unexpected fallout that can occur when running complex workloads; Docker stability and upgrades; CPU time stealing issues with real-time workloads; CNI upgrades in running environments; debugging containerized Neutron agents; and issues that arise when workloads like OpenStack tap into functionality such as hugepages, CPU pinning, and other features that Kubernetes may not account for cleanly from release to release.
In this talk, you will learn:
- How Kubernetes has changed the way we think about open infrastructure.
- The challenges of running a complex open infrastructure workload like OpenStack on Kubernetes in production.
- The reality of Kubernetes upgrades when workloads use features like hugepages and CPU pinning.
- How we try to avoid cascading failures.
- How a containerized OpenStack changes the way you debug OpenStack in production.
- The pros and cons of containerizing everything.