Troubleshooting OpenStack is difficult. In real-world deployments, configurations, parameters and architectures differ from one installation to the next. So, regardless of how mature OpenStack is, we have run into multiple problems in our own OpenStack system. These often surface as error logs, but sometimes they do not, and in that case they are especially difficult to detect and fix.
In this presentation, we will introduce how we manage this problem with the Elastic Stack, going beyond parsing error log messages to take more intelligent approaches. To detect invisible errors and problems, such as performance degradation caused by Keystone, the database and RabbitMQ, we have configured the Elastic Stack to collect log data in a form suited to analysis, extracting important fields such as request IDs, request URLs and response times of WSGI servers. We will show how to use this data for troubleshooting OpenStack and present the results of log data analysis in our OpenStack system.
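
To make the field extraction concrete, here is a minimal Python sketch of the kind of parsing involved. It is not our Elastic Stack configuration. The log line layout, the regular expression, and the names `parse_wsgi_line` and `slow_requests` are all assumptions made for illustration, and real WSGI log formats vary by service and release.

```python
import re
from typing import Iterable, List, Optional

# Assumed layout of a WSGI access-log line (illustrative only):
#   ... [req-<uuid> <user> <project>] <client> "<METHOD> <URL> HTTP/x.y"
#   status: <code> len: <bytes> time: <seconds>
WSGI_LINE = re.compile(
    r'\[(?P<request_id>req-[0-9a-f-]+)[^\]]*\]\s+'
    r'(?P<client>\S+)\s+'
    r'"(?P<method>[A-Z]+)\s+(?P<url>\S+)\s+HTTP/[\d.]+"\s+'
    r'status:\s+(?P<status>\d{3})\s+'
    r'len:\s+(?P<length>\d+)\s+'
    r'time:\s+(?P<response_time>\d+(?:\.\d+)?)'
)


def parse_wsgi_line(line: str) -> Optional[dict]:
    """Return the extracted fields, or None if the line does not match."""
    match = WSGI_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert numeric fields so they can be aggregated and compared.
    fields["status"] = int(fields["status"])
    fields["length"] = int(fields["length"])
    fields["response_time"] = float(fields["response_time"])
    return fields


def slow_requests(lines: Iterable[str], threshold_seconds: float = 1.0) -> List[dict]:
    """Collect successfully parsed requests slower than the threshold."""
    parsed = (parse_wsgi_line(line) for line in lines)
    return [r for r in parsed if r and r["response_time"] > threshold_seconds]


# Hypothetical sample line, written to match the assumed layout above.
SAMPLE = (
    '2017-05-10 10:00:00.123 1234 INFO nova.osapi_compute.wsgi.server '
    '[req-0f1e2d3c-4b5a-6978-8a9b-0c1d2e3f4a5b admin admin] 10.0.0.1 '
    '"GET /v2.1/servers/detail HTTP/1.1" status: 200 len: 1893 time: 2.5'
)

if __name__ == "__main__":
    for record in slow_requests([SAMPLE], threshold_seconds=1.0):
        print(record["request_id"], record["url"], record["response_time"])
```

Note that the sample request returns status 200 and would never appear in an error-log search. It is only the extracted response time that marks it as a candidate for performance degradation, and the request ID allows the same request to be followed across the logs of other services.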