2

I am bashing my head against the wall. its been 4 days.but psql is not connecting.

We have a small array of Greenplum database.In that, We have the master node. when i am trying to use psql utility

Getting this error :

[gpadmin@master gpseg-1]$ psql
psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

We tried

on searching for postmaster.pid files.

We have removed it.But still, error remains.

NEO
  • 389
  • 8
  • 31
  • Have you got multiple versions of PostgreSQL installed? It looks like it could be looking in the wrong directory for the UNIX sockets. I believe newer versions are stored in /var/run/postgresql. – d1ll1nger May 17 '16 at 13:01
  • We need more information on the system you are using. Is your daemon running? Try $ sudo service postgresql.service start or $ sudo systemctl start postgresql.service – n2o May 17 '16 at 13:04
  • PostgreSQL 8.2.15 (Greenplum Database 4.2.0 build 1) (HAWQ 1.3.0.2 build 14421) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on May 18 2015 is the output of select version() command 17:03:43 (1 row) – NEO May 17 '16 at 13:04
  • it is greenplum database we have only single postgres instance for it – NEO May 17 '16 at 13:05
  • we are trying to use master node psql utility from gpadmin user. it throws the error. – NEO May 17 '16 at 13:05
  • How do i check on daemon process in greenplum database. Do we have any command for it. – NEO May 17 '16 at 13:06
  • Sounds like checking pg_hba.conf for adequate "local" permissions is warranted. – kbulgrien May 17 '16 at 13:09
  • What operating system? On Linux, use a command like 'netstat -ln | egrep '(postgres)|(5432)'. If nothing shows up, then the database is likely running on a different port number, and it has to be specified on the 'psql' command line. – kbulgrien May 17 '16 at 13:25
  • sorry for inconvienece because of internet cut off . i am responding late . here is link of pg_hba.conf http://paste.ofcode.org/396ntbgWYLaSFvNG4D5jyrz – NEO May 17 '16 at 16:35
  • Upon further consideration, pg_hba.conf is not really pertinent to the issue of psql not finding the /tmp/.s.PGSQL.5432 file. That said, pg_hba.conf implies that a vsingh user is implicitly trusted to access the socket. The postgresql.conf file settings are more likely to be related to this particular error message. – kbulgrien May 18 '16 at 14:08
  • start the DB first, then this command will available – Gurupreet Singh Bhatia May 26 '16 at 06:29

3 Answers3

2

Use a command like ( netstat -ln; ps -ef ) | egrep '(postgres)|(postmaster)|(5432)' to try to determine whether or not an instance of the postgres server is running.

If the postmaster is not running, remove the postmaster.pid file and restart the database. While I don't use the Greenplum database, I see that instructions are here: Starting and Stopping the Greenplum Database. Do not remove the postmaster.pid file without making sure the database is not running, and note that removing the postmaster.pid file without starting the database is pointless.

It may be wise to open your postgresql.conf file and see if the listen_addresses, port, unix_socket_directory, unix_socket_group, and unix_socket_permissions settings might be a source of issues.

Since the error message referenced specifically mentions the socket file, look most closely at unix_socket_directory, unix_socket_group, and unix_socket_permissions.

If unix_socket_directory is pointing somewhere other than /tmp, then various workarounds exist.

Alternatively, and presuming that the server is running, one might try to locate the socket file without looking in the postgresql.conf file, though this might make it a bit harder to address permissions, port, etc. issues. A tool like locate, find, etc., may be used in conjunction with sudo or by the root user.

$ sudo find /tmp /var -name .s.PGSQL.5432

Presuming that the location of the .s.PGSQL.5432 file issue is the root cause of your problem, specifying the socket file location on the psql command-line is probably the most straightforward workaround. In example, if the *.s.PGSQL.5432 file is in /var/pgsql_socket directory as it is on some systems, try this, but, of course, use the actual directory where .s.PGSQL.5432 is located:

$ pgsql -h /var/pgsql_socket

If the .s.PGSQL.5432 file IS in /tmp, then the problem is more likely one of permissions, and consulting the postgresql.conf file is advised, and probably the user attempting the psql command will have to be added to a group that has access to the socket file. (Remember, log out and back in after changing group membership.)

Though the page does not necessarily seem to directly relate to this issue, do consider the Accessing the Database help as needed.

kbulgrien
  • 4,384
  • 2
  • 26
  • 43
  • When i am debugging the issue i have given permission 777 to "/tmp/.s.PGSQL.5432" . Now my error got converted into psql: could not connect to server: Connection refused Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.5432"? – NEO May 20 '16 at 04:12
  • we have observed one more thing. that is on each reboot /tmp/.s.PGSQL.5432 file vanishes . so again error saying [gpadmin@master gpseg-1]$ psql psql: could not connect to server: No such file or directory Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.5432"? – NEO May 20 '16 at 04:43
  • In general, Postgres is permissions paranoid. If one sets permissions too loosely on some of its files, it will decline to allow connections. Though I have not seen this with the socket file explicitly, it is possible that this is the cause. You must examine the postgresql.conf file and correlate its settings with rights granted to the user attempting the psql command. Yes, the socket file will disappear after every boot and revert to the old permissions; the user attempting the psql command must have rights compatible with the settings specified in the postgresql.conf file. – kbulgrien May 20 '16 at 14:57
  • Does /var/log/postgres/postgresql contain any interesting / pertinent information about the "psql: could not connect to server: Connection refused Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.5432" error? – kbulgrien May 20 '16 at 15:00
  • We solved the issue by rebooting the segment's machine. and then i ran gpstate utility it is showing UP everything . Thanks – NEO May 21 '16 at 04:36
  • can we talk on this issue. – NEO Oct 11 '16 at 18:12
0

What does gpstate show? If it cannot connect, make sure the GPDB master is running:

ps ax | grep 'M master'

If the master is running, it will also show the port the master is listening on.

For the gpadmin Linux account, look in the ~/gpAdminLogs directory. There should be one or more startup logs that you can check.

Gustavo Morales
  • 2,614
  • 9
  • 29
  • 37
csylvester
  • 36
  • 2
0

That error normally means that the database is stopped. You should never remove this file unless it was left behind after a bad crash, where the file was never cleaned out. You would normally detect that situation when you start the db again - it would complain the file already existed.

I tend to look for

ps -eaf|grep -i silent

to see the postmaster processes.

If the master is down, but the segments are up, you will need to start the master only

gpstart -m

then stop everything with

gpstop -M fast

Causes for failures should be in $MASTER_DATA_DIRECTORY/pg_log and possibly in the corresponding segment pg_log directories. There may also be core files if the master or segments had a panic.

Brian
  • 1
  • i have restored the same postmaster.pid as i have deleted it earlier.. But with the help of backup i restored the postmaster file in master. – NEO May 17 '16 at 16:35
  • On the master server, what does "ps -eaf | grep postgres" return. If nothing, then the master database is stopped. You will need to check the same on all the segment servers before starting the database again. If the master database is still running, but you cannot connect, you can try to stop it with "pg_ctl stop -D $MASTER_DATA_DIRECTORY -p 5432" – Brian May 18 '16 at 10:14