electronic brain surgery since 2001

Server Problems

splitbrain.org is having some problems at the moment. The sites were unavailable for some time today. I currently have no SSH access, and the MySQL database seems to be down. I'm not sure whether mail can reach me, so please don't mail me ;-)


Everything seems to be back to normal now. From what I can see in the logs, the kernel's OOM killer started killing processes yesterday at about 15:30. I don't know what caused it yet. According to my statistics graphs, memory usage was fairly “normal” just before the server stopped responding. But it might just be one more sign that the server needs an upgrade. BTW: is there a way to tell the kernel to reboot instead of killing (random) processes?
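One possible answer to that last question, as a rough sketch: Linux kernels that have the `vm.panic_on_oom` sysctl can be told to panic instead of invoking the OOM killer, and `kernel.panic` sets how many seconds after a panic the machine reboots. The helper names below (`set_sysctl`, `reboot_on_oom`) and the 10 second delay are my own assumptions for illustration, and writing to the real `/proc/sys` needs root.

```python
import os

PROC_SYS = "/proc/sys"  # standard location of the sysctl files on Linux


def set_sysctl(name, value, root=PROC_SYS):
    """Write a sysctl value, e.g. 'vm.panic_on_oom' -> <root>/vm/panic_on_oom."""
    path = os.path.join(root, *name.split("."))
    with open(path, "w") as f:
        f.write(f"{value}\n")
    return path


def reboot_on_oom(delay_seconds=10, root=PROC_SYS):
    """Hypothetical helper: panic on out-of-memory, then reboot after a delay."""
    # Panic instead of letting the OOM killer pick processes to kill.
    set_sysctl("vm.panic_on_oom", 1, root)
    # Reboot automatically this many seconds after a kernel panic.
    set_sysctl("kernel.panic", delay_seconds, root)
```

Calling `reboot_on_oom()` as root would apply both settings until the next boot; making them permanent would mean putting the same two keys into the sysctl configuration. Whether a full reboot is really nicer than a few dead processes is a separate question.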

The main problem was that after enough processes (or the one troublesome process) had been killed, Apache worked again, but MySQL and SSH were down. I'm not sure about the mail server yet. If you sent an important mail yesterday afternoon or today, just send it again.

I'm also thinking about a simple, low-resource way to reboot the server when SSH is down. Something like attaching the reboot command to a network port in inetd and opening access to this port for a single trusted, fixed IP only. Suggestions welcome.
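To make the idea a bit more concrete, here is a minimal sketch of it as a standalone listener rather than an inetd entry. Everything specific in it is an assumption: the trusted address `192.0.2.10` (a documentation address), port `7777`, the path to `shutdown`, and the function names. A source IP check like this is not real authentication, so it would still need a firewall rule in front of it, and a standalone process could of course be hit by the OOM killer itself, which is one argument for the inetd variant.

```python
import socket
import subprocess

TRUSTED_IP = "192.0.2.10"  # hypothetical trusted, fixed address
LISTEN_PORT = 7777         # hypothetical port
REBOOT_CMD = ["/sbin/shutdown", "-r", "now"]  # assumed path, may differ


def is_trusted(peer_ip, trusted_ip=TRUSTED_IP):
    """Accept only an exact match with the single trusted address."""
    return peer_ip == trusted_ip


def handle_peer(peer_ip, run=subprocess.call, trusted_ip=TRUSTED_IP):
    """Run the reboot command for the trusted peer only.

    Returns True if the command was run, False if the peer was rejected.
    """
    if not is_trusted(peer_ip, trusted_ip):
        return False
    run(REBOOT_CMD)
    return True


def serve(port=LISTEN_PORT):
    """Accept TCP connections forever; a connection is the whole 'protocol'."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(1)
    while True:
        conn, (peer_ip, _peer_port) = srv.accept()
        conn.close()  # nothing is read or written, only the peer address matters
        handle_peer(peer_ip)
```

Running `serve()` as root would start the listener; connecting to the port from the trusted address (for example with telnet) would then trigger the reboot, while connections from anywhere else are simply closed.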

downtime, splitbrain, sitenews