PostgreSQL sometimes fails to start on RHEL 6

We were having issues with PostgreSQL startup, and they were pretty nondeterministic. For no obvious reason, PostgreSQL failed to start from time to time:

[root@beta ~]# service postgresql start
Starting postgresql service:                               [  FAIL  ]

Yet when I listed the processes, it was running. Something is wrong with the sysv init script in RHEL 6 and CentOS 6 (and other clones). A quick look at the start function showed this:

$SU -l postgres -c "$PGENGINE/postmaster -p ...
sleep 2
pid=`head -n 1 "$PGDATA/" 2>/dev/null`
if [ "x$pid" != x ]; then
    success "$PSQL_START"
else
    failure "$PSQL_START"
fi

I simplified it a bit, but you get the idea: start the server, wait two seconds, then try to read the pid file. If the pid is not there yet, report a failure. Oops.
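One way around the fixed two-second wait is to poll for the pid file until a timeout expires. The following is only a minimal sketch of that idea, not the fix that went into the package: the function name wait_for_pidfile, its arguments, and the 30-second default are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the RHEL init script): poll for a
# non-empty pid file instead of sleeping a fixed two seconds.
#   $1 = path to the pid file
#   $2 = timeout in seconds (assumed default: 30)
wait_for_pidfile() {
    pidfile=$1
    timeout=${2:-30}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Same check as the init script: first line of the pid file.
        pid=$(head -n 1 "$pidfile" 2>/dev/null)
        if [ -n "$pid" ]; then
            echo "$pid"      # print the pid for the caller
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1                 # pid file never showed up in time
}
```

In the start function, a call to this helper with the pid file path the script already reads would replace the sleep and the head line, and its exit status would decide between success and failure. A fast machine returns after the first check, while a slow one gets up to the full timeout before the start is reported as failed.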

On very slow systems or disks, you can hit this problem. Slow systems? You are probably sure all your systems are fast enough to handle this properly. Well, you may be wrong. In our case it was a hypervisor under high I/O load, and sometimes PostgreSQL startup took a little longer than two seconds. Just a liiiittle bit.

The good news is that this is gonna be fixed very soon: the next Fedora version will probably have a systemd startup configuration instead of the sysv init script, and it should work fine then. For older versions, we have filed a BZ.

06 March 2012 | fedora | rhel | postgresql