As OpenStack clusters grow larger, they hit scaling limitations in a number of components. To work around this problem, operators create separate clusters.
Let's discuss creating a "Large Scale SIG" that would specifically tackle those issues and try to push back the size and activity limits within a given cluster. By sharing performance analysis, the SIG could identify key bottlenecks. By pooling development resources, it could push fixes and new features to address them.
Multiple users report the same kinds of issues when running large clusters. If we could get them to share their analysis and work together, we could push back the size limits of a single cluster.