I have a potentially corrupted Postgres database running in a Docker container. The container mounts a data volume at /var/lib/postgresql/data.

I exec'd into the container and ran gosu postgres pg_resetxlog /var/lib/postgresql/data. I received the following error:

pg_resetxlog: lock file "postmaster.pid" exists
Is a server running?  If not, delete the lock file and try again.

I tried 3 things:

  1. Matched the PID listed in postmaster.pid with my Postgres process and manually killed Postgres with kill <PID>. This did not shut Postgres down.

  2. Deleted the postmaster.pid file under /var/lib/postgresql/data. This forced the container to restart, but the same issue persists.

  3. Ran docker restart <POSTGRES> to restart Postgres.
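
For reference, the check in step 1 can be sketched as a small shell function. It assumes the standard layout in which the first line of postmaster.pid holds the postmaster's PID. The function name `pidfile_status` is hypothetical, and the result is only meaningful when run inside the container as root or the postgres user, since `kill -0` also fails on processes you are not allowed to signal.

```shell
#!/bin/sh
# Hypothetical helper: report whether the PID recorded in postmaster.pid
# belongs to a live process. Prints one of:
#   no-lock-file | invalid | running:<pid> | stale:<pid>
pidfile_status() {
    datadir=$1
    pidfile="$datadir/postmaster.pid"

    if [ ! -f "$pidfile" ]; then
        echo "no-lock-file"
        return 0
    fi

    # The first line of postmaster.pid is the postmaster's process ID.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*)
            echo "invalid"
            return 0
            ;;
    esac

    # kill -0 delivers no signal; it only tests whether the process exists
    # (and whether we are permitted to signal it).
    if kill -0 "$pid" 2>/dev/null; then
        echo "running:$pid"
    else
        echo "stale:$pid"
    fi
}
```

Calling `pidfile_status /var/lib/postgresql/data` inside the container would print one of the four states. Note that a plain `kill <PID>` sends SIGTERM, which Postgres treats as a smart shutdown request: it waits for open sessions to finish, so the server can appear to ignore the signal for a while.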

None of the above helped. I want this container to recover without destroying it and starting from scratch. I'm using the postgres:9.5 Docker image.

Any ideas?


EDIT: added container logs

Sep 11 18:23:34 VM postgres-container[1045]: LOG:  incomplete startup packet
Sep 11 18:24:36 VM postgres-container[1045]: LOG:  incomplete startup packet
Sep 11 18:24:46 VM postgres-container[1045]: LOG:  received smart shutdown request
Sep 11 18:24:46 VM postgres-container[1045]: LOG:  autovacuum launcher shutting down
Sep 11 18:24:58 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:25:01 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:25:39 VM postgres-container[1045]: LOG:  incomplete startup packet
Sep 11 18:25:39 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:25:59 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:26:02 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:26:40 VM postgres-container[1045]: LOG:  incomplete startup packet
Sep 11 18:26:40 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:27:00 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:27:03 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:27:41 VM postgres-container[1045]: LOG:  incomplete startup packet
Sep 11 18:27:41 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:28:01 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:28:04 VM postgres-container[1045]: FATAL:  the database system is shutting down
Sep 11 18:28:41 VM postgres-container[1045]: LOG:  could not open file "postmaster.pid": No such file or directory
Sep 11 18:28:41 VM postgres-container[1045]: LOG:  performing immediate shutdown because data directory lock file is invalid
Sep 11 18:28:41 VM postgres-container[1045]: LOG:  received immediate shutdown request
Sep 11 18:28:41 VM postgres-container[1045]: WARNING:  terminating connection because of crash of another server process
Sep 11 18:28:41 VM postgres-container[1045]: DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
Sep 11 18:28:41 VM postgres-container[1045]: HINT:  In a moment you should be able to reconnect to the database and repeat your command.
Sep 11 18:28:41 VM postgres-container[1045]: LOG:  incomplete startup packet
Sep 11 18:28:43 VM postgres-container[1045]: LOG:  database system was interrupted; last known up at 2019-09-12 01:24:31 UTC
Sep 11 18:28:56 VM postgres-container[1045]: LOG:  database system was not properly shut down; automatic recovery in progress
Sep 11 18:28:56 VM postgres-container[1045]: LOG:  redo starts at 0/1C384970
Sep 11 18:28:56 VM postgres-container[1045]: LOG:  invalid record length at 0/1C44DAC8
Sep 11 18:28:56 VM postgres-container[1045]: LOG:  redo done at 0/1C44DAA0
Sep 11 18:28:56 VM postgres-container[1045]: LOG:  last completed transaction was at log time 2019-09-12 01:24:42.2848+00
Sep 11 18:28:57 VM postgres-container[1045]: LOG:  MultiXact member wraparound protections are now enabled
Sep 11 18:28:57 VM postgres-container[1045]: LOG:  database system is ready to accept connections
Sep 11 18:28:57 VM postgres-container[1045]: LOG:  autovacuum launcher started
Sep 11 18:29:42 VM postgres-container[1045]: LOG:  incomplete startup packet
Sep 11 18:30:45 VM postgres-container[1045]: LOG:  incomplete startup packet
Sep 11 18:31:47 VM postgres-container[1045]: LOG:  incomplete startup packet
bli00
  • Please check the log and post the error here – detzu Sep 12 '19 at 02:42
  • https://stackoverflow.com/questions/36436120/fatal-error-lock-file-postmaster-pid-already-exists – Adiii Sep 12 '19 at 02:44
  • Warning: `pg_resetxlog` will corrupt your database. – Laurenz Albe Sep 12 '19 at 02:55
  • @Laurenz Albe I know, this [post](https://stackoverflow.com/questions/598200/how-do-i-fix-postgres-so-it-will-start-after-an-abrupt-shutdown) suggests you can use `pg_dumpall` to restore the state. – bli00 Sep 12 '19 at 03:08
  • @Adiii I tried that but it doesn't work. Mainly, I'm running this in a Docker container, and brew service isn't what manages the Postgres process. I tried restarting the Docker container but nothing happened. – bli00 Sep 12 '19 at 03:09
  • What if you remove the PID file, if it just contains the process ID? Better to copy the whole directory to another location first and then try this. – Adiii Sep 12 '19 at 03:11
  • @detzu added the logs – bli00 Sep 12 '19 at 03:16
  • My comment on the 2nd thing I tried: "Delete the postmaster.pid located under /var/lib/postgresql/data. This forced the container to restart but the same issue persists." – bli00 Sep 12 '19 at 03:17
  • You should not do this while the container is running. Keep the container stopped or terminated, remove the file, and then start the container. – Adiii Sep 12 '19 at 03:38
  • https://stackoverflow.com/questions/36436120/fatal-error-lock-file-postmaster-pid-already-exists Try this: if you bring the container up with `docker run -it -v ... your_image bash`, it will not start Postgres but will keep the container up and running, and then inside the container you will be able to play with psql. – Adiii Sep 12 '19 at 03:39
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/199366/discussion-between-adiii-and-thestateofmay). – Adiii Sep 12 '19 at 11:05
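
The offline approach suggested in the comments above (stop the container, copy the data directory, then open a shell without starting the server) could be sketched roughly as follows. The container name `postgres`, the volume name `pgdata`, and the backup file name are assumptions for illustration, as is the use of a named volume rather than a bind mount. The script only prints the commands unless `RUN=1` is set, so it can be reviewed before anything is touched.

```shell
#!/bin/sh
# Hypothetical names: adjust CONTAINER, VOLUME and IMAGE to your setup.
CONTAINER=${CONTAINER:-postgres}
VOLUME=${VOLUME:-pgdata}
IMAGE=${IMAGE:-postgres:9.5}
DATADIR=/var/lib/postgresql/data

# Print each command instead of executing it unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

offline_session() {
    # 1. Stop the server so nothing is using the data directory.
    run docker stop "$CONTAINER"

    # 2. Archive the data directory into the current host directory
    #    before changing anything.
    run docker run --rm -v "$VOLUME:$DATADIR" -v "$PWD:/backup" "$IMAGE" \
        tar -C "$DATADIR" -cf /backup/pgdata-backup.tar .

    # 3. Open a shell in a throwaway container with the same volume.
    #    Passing "bash" as the command means the server is not started.
    run docker run --rm -it -v "$VOLUME:$DATADIR" "$IMAGE" bash
}
```

Running `offline_session` prints the three commands; `RUN=1 offline_session` would execute them. What to do inside that shell is a separate decision, and the warning above about `pg_resetxlog` still applies.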

1 Answer

Good news, thestateofmay! The log shows "database system is ready to accept connections", so the server is up and running :)

You can connect to the container with:

docker exec -ti postgres /bin/bash

You should be logged in as root; switch to the postgres user:

su postgres

then start psql:

psql

and check whether the data is present on the server.

detzu
  • Data is not present in the server, I have confirmed that. The client that's connecting to this Postgres container is also failing, citing bad database state. – bli00 Sep 12 '19 at 03:56
  • Copy the error the client gets here, along with how it is trying to connect. Also try the steps I wrote above. – detzu Sep 12 '19 at 04:16