Wednesday 13th September 2017

Website (Production) Database outage

We received an alert from our monitoring system about high CPU load on sloth (the server) at 20:11 UTC. We immediately connected to the server and found that the database (mysqld) was using 366% of the 400% CPU available.
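To put that figure in context, here is a minimal sketch of the arithmetic. It assumes the percentages follow the usual per-core convention (100% per core, so 400% would mean four cores); the function name is hypothetical and used only for illustration.

```python
def share_of_total(cpu_percent, cores):
    """Convert a per-core CPU percentage into a fraction of the whole machine."""
    # Assumption: each core contributes 100%, so total capacity is cores * 100.
    return cpu_percent / (cores * 100)


# 366% on an assumed 4-core machine
print(f"{share_of_total(366, 4):.1%}")  # prints 91.5%
```

Under that assumption, mysqld alone was using a little over nine tenths of the machine's total CPU capacity.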

Having never really dealt with this kind of issue before, we gathered as much information as we could about the state of the process and then restarted it.
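For reference, a snapshot of this kind could be scripted so that it is quick to take before a restart. The sketch below is not what we ran during the incident; the command set, the function name and the assumption that the `mysql` client can connect without extra credentials are all illustrative and would need adjusting for a real host.

```python
import subprocess
from datetime import datetime, timezone

# Hypothetical command set: per-thread CPU usage plus two standard MySQL
# status queries. Assumes `top` (procps) and the `mysql` client are installed
# and that the client can connect without additional options.
DEFAULT_COMMANDS = {
    "top_threads": ["top", "-b", "-n", "1", "-H"],
    "processlist": ["mysql", "-e", "SHOW FULL PROCESSLIST"],
    "innodb_status": ["mysql", "-e", "SHOW ENGINE INNODB STATUS\\G"],
}


def capture_snapshot(commands=DEFAULT_COMMANDS, timeout=10):
    """Run each command once and return {name: output or failure note}."""
    snapshot = {"taken_at": datetime.now(timezone.utc).isoformat()}
    for name, argv in commands.items():
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            # Keep stderr when stdout is empty so connection errors are not lost.
            snapshot[name] = result.stdout or result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing binary or a hung command should not abort the snapshot.
            snapshot[name] = f"failed: {exc}"
    return snapshot
```

Calling `capture_snapshot()` and writing the returned dictionary to a file gives a timestamped record to review after the restart. Each command is given a timeout because a database under heavy load may be slow to answer.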

We will keep an eye on the machine and try to make sense of the logs we collected.