Apache2 runs fine for a while, then stops serving content, error when restarting

Question

My system

Apache2 running on Debian 7 Wheezy
It's a physical server with one IPv4 address and multiple vhosts.
Webapps: Polaric, Redmine, phpMyAdmin, etc
PHP-info here: http://tracking1.sfrkh.net/phpinfo/ (will be removed when problem is resolved)

Problem Description

When my server boots, everything works OK. Apache2 starts serving content, and my websites are working as expected.

After 'some time' (a few days to a couple of weeks), I can no longer access any of my websites. Apache2 stops serving content to my browser.

This is a recurring problem I've had for a few months. The first time I experienced it was a few days after installing and configuring everything.

Problem solving 1

If I reboot, everything is back to normal. The problem appears again after 'some time'.

Problem solving 2

First, when I try to start or restart apache2:

# apache2ctl start 
- OR -
# /etc/init.d/apache2 start
(98)Address already in use: make_sock: could not bind to address [::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
Action 'start' failed.
The Apache error log may have more information.

Then I check for listening sockets with netstat:

# netstat -ltnp | grep ':80'
tcp  0  0 0.0.0.0:8081    0.0.0.0:*       LISTEN      16100/jsvc.exec
tcp6 0  0 :::80           :::*            LISTEN      14794/apache2

Then I try to stop apache2:

# apache2ctl stop
httpd (pid 9124?) not running

- or -

# /etc/init.d/apache2 stop
Stopping web server: apache2.

Then I run the netstat command again and get the exact same result as above. Not even the PID has changed.

When I kill the PID from netstat and start apache2 again:

# kill -9 14794
# apache2ctl start

..then everything is back to normal.

Netstat after everything is back to normal:

# netstat -ltnp | grep ':80'
tcp  0  0 0.0.0.0:8081    0.0.0.0:*       LISTEN      16100/jsvc.exec
tcp6 0  0 :::80           :::*            LISTEN      16434/apache2

The netstat output while the problem exists and the output after everything works again look no different to me. Only the PID has changed.

The problem appears again after 'some time'.

Question

I don't know where to go from here. I've searched Google, these forums, and other forums, but can't find a solution that works for me. As you can see, I can get the server back up and running, but the problem appears again and again.

Any ideas of what could be causing this?

Note

I hope I've turned to the right forum. Google is my friend, and most of the useful advice on similar topics comes from this forum :)

Thanks in advance for any help!

Nice well formed question, pity I don't have the answer for you, but have you checked out these articles? http://ubuntuforums.org/showthread.php?t=1636667 http://stackoverflow.com/questions/10745878/ubuntu-error-with-apache-98address-already-in-use http://www.cyberciti.biz/faq/apachehttpdaddress-already-in-use-make_sock-could-not-bind-to-port-80-or-443/ http://www.who.is.free.fr/wiki/doku.php?id=apache#make_sockcould_not_bind_to_address_443 Hopefully one of these 4 can help lead to the answer. — The Humble Rat, Feb 18 '14 at 10:04
Thanks for trying :) Ubuntuforums: is not applicable I think. SSL/rsa-key isn't in use on my server. stackoverflow: Yes, I've read it. It works, but the problem appears again after some time. cyberciti: httpd not running. whoisfree: not using https (ssl/rsa-key). But thanks for the effort! — myklebost, Feb 18 '14 at 10:41
Do you have logs from before the reboots? Can you run memory statistics (e.g. `atopd`) to collect some forensic information? — tripleee, Feb 18 '14 at 17:47
Will do it as soon as the server fails again. Shouldn't be too long now. — myklebost, Feb 25 '14 at 07:58

score 0 · Answer 1 · answered Feb 19 '14 at 23:27

The Apache error log may have more information.

This looks like the place to start.

Looks like there are two problems here: 1) Apache fails after a while, and 2) you can't restart Apache when that happens.

Problem 2 first. One possibility is that you have two Apache installs running, and one is grabbing the ports off the other.

Alternatively, Apache is dying, restarting, but not writing its PID files properly, so when you ask it to restart, it can't kill itself properly.

Alternatively, it's writing the PID files just fine, but apache2ctl (or the user you're running it as) doesn't have read access to them (less likely).

Either way, the PID files seem wrong, given that apache2ctl is looking for PID 9124 while Apache is actually running as PID 14794. And given the "Unable to open logs" message, it looks like a permissions thing. Try restarting Apache as someone with read/write access to the logs and PID files, for example with sudo apache2ctl graceful, or by doing it as the apache user.
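To check whether the PID file really is stale, something along these lines could compare the recorded PID with the one that owns port 80. This is only a sketch: the function names are made up for illustration, and the PID file path in the usage comment (/var/run/apache2.pid) is an assumption. Check APACHE_PID_FILE in /etc/apache2/envvars for the real path on your system.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Apache or Debian.

# Read `netstat -ltnp` output on stdin and print the PID of the first
# process listening on the given TCP port.
listener_pid() {
    awk -v port="$1" '$4 ~ (":" port "$") { split($7, a, "/"); print a[1]; exit }'
}

# Print the PID stored in a PID file, or nothing if the file is unreadable.
read_pidfile() {
    if [ -r "$1" ]; then
        head -n 1 "$1" | tr -d '[:space:]'
    fi
}

# Usage: compare_pids PIDFILE LISTENER_PID
compare_pids() {
    recorded=$(read_pidfile "$1")
    if [ -z "$recorded" ]; then
        echo "unreadable"
    elif [ "$recorded" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: pidfile=$recorded listener=$2"
    fi
}

# Example (run as root so netstat can show PIDs; the path is an assumption):
#   lp=$(netstat -ltnp | listener_pid 80)
#   compare_pids /var/run/apache2.pid "$lp"
```

A "mismatch" result would support the stale PID file theory. "unreadable" would point at a missing PID file or a permissions problem for the user running the check.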

If that works, then as a workaround, you could try periodically running apache2ctl graceful through root or apache's cron - but that's a nasty, nasty kludge.

To get even nastier, something like this would make damn sure it died and restarted: apache2ctl graceful || (pkill apache2 && apache2ctl start)
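If you do go down the kludge route, the same idea could be wrapped in a small script so cron leaves a message saying which path was taken. Again, just a sketch: the restart_apache function, the APACHE_CTL variable, and the messages are all made up for illustration, and the two-second pause is an arbitrary guess.

```shell
#!/bin/sh
# Sketch only: a hypothetical wrapper around the one-liner above.
# APACHE_CTL is an illustrative variable so the control command can be swapped.
APACHE_CTL=${APACHE_CTL:-apache2ctl}

restart_apache() {
    if "$APACHE_CTL" graceful; then
        echo "graceful restart ok"
    else
        echo "graceful restart failed, killing stray apache2 processes"
        # pkill sends SIGTERM by default; pause briefly so the sockets are released.
        pkill apache2
        sleep 2
        "$APACHE_CTL" start
    fi
}

# Example: call restart_apache from a script run by root's cron.
```

This is still a workaround, not a fix. The fallback branch only hides whatever is leaving the orphaned process behind.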

The real thing to do though, is find why it's failing - problem 1.

My bet is that it's a permissions problem on the PID files again: that is, say in httpd.conf, you've asked it to restart the connection handlers after a certain number of connections (MaxRequestsPerChild setting), but when they restart they can't write to the pid files to update them. That could cause this. So would lots of other things, though.

Unfortunately, debugging that would take checking of the file permissions on your PID folders and all folders above that; the process ids and user that apache is running as (ps -ef | grep -Pi "apache|http"); the contents of the pid files; the user you're running apache2ctl as; the contents of your httpd.conf; and the contents of your syslog.
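A rough way to gather most of that in one go might look like the following. The helper names are hypothetical and the default PID file path is an assumption, so substitute whatever your configuration actually uses.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for collecting the details listed above.
# The default PID file location is an assumption; adjust it for your setup.
PIDFILE=${PIDFILE:-/var/run/apache2.pid}

# Print a path and each of its parent directories, one per line.
path_chain() {
    p=$1
    while [ "$p" != "/" ] && [ "$p" != "." ]; do
        echo "$p"
        p=$(dirname "$p")
    done
    echo "$p"
}

# Show owner and permissions for a file and every directory above it.
show_perms() {
    path_chain "$1" | while IFS= read -r d; do
        ls -ld "$d"
    done
}

# Example:
#   show_perms "$PIDFILE"                            # permissions on the PID file and parents
#   cat "$PIDFILE"                                   # PID that Apache recorded
#   ps -eo user,pid,ppid,args | grep -i '[a]pache'   # who the processes run as
#   id                                               # user you run apache2ctl as
```

Posting that output, plus the relevant parts of the error log and syslog, would make it much easier to tell which of the theories above applies.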

That's more than we can debug here, but if you search through those for anything relevant-looking and post it, we may be able to narrow it down.