40

I am on OS X 10.11.6 and trying to run a program that normally listens on UDP port 8008 upon startup.

This program normally also spawns a couple of helper child processes during its operation, but the port is bound by the parent process.

Unfortunately, when exiting the program, the port sometimes remains open, even though the program (parent + children) no longer exists.

When this happens, running the program again naturally fails with an EADDRINUSE error, and no matter what I try, the only solution I have found is to reboot the machine.

I'm having a hard time believing that I cannot release the port without a reboot.

Here are some diagnostics that I ran so far (I ran all of these with and without sudo):

Find the process using port 8008 with lsof:

$ lsof -i -n -P | grep UDP | grep 8008

Surprisingly, this doesn't return any results.

However, I had more luck with netstat:

$ netstat -tulnvp udp | grep 8008
udp4  0  0  *.8008    *.*    196724   9216  47205   0

So, the port is indeed bound, and the culprit is pid 47205, however:

$ ps aux | grep 47205

Doesn't return anything. The same thing for PIDs 47206 and 47207 (most certainly the PIDs assigned to the children). I also tried other variations of the grep (program name, path, etc).

I also looked for any process reporting 47205 as its parent:

$ ps -axo pid,ppid,command | grep 47205

So the child processes are also clearly dead.
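
The same existence check can be made without grepping `ps` output, by sending signal 0 to the PID. A minimal Node.js sketch; the helper name `pidExists` is hypothetical and not part of the original post:

```javascript
// Hypothetical helper: reports whether a PID currently refers to a live process.
// Signal 0 only performs the existence/permission check; no signal is delivered.
function pidExists(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    if (err.code === 'EPERM') return true;
    // ESRCH: no such process.
    if (err.code === 'ESRCH') return false;
    throw err;
  }
}

console.log(pidExists(process.pid)); // the current process itself
console.log(pidExists(47205));       // the PID reported by netstat in the question
```

If this also reports that the PID is gone, the socket is being held by the kernel rather than by any process that could be killed.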

Not being able to kill anything, I tried to SIGHUP launchd in the hope that it might remove any zombie child processes:

$ sudo kill -HUP 1
$ sudo kill -s HUP 1

But alas, netstat still shows the port bound.

Lastly, I tried to restart the loopback interface:

$ sudo ifconfig lo down
$ sudo ifconfig lo up

But again, to no effect.

I have waited several hours since the program last ran, so I'm pretty sure any timeout would have happened by now, but the port just won't get released.

Any ideas on how to force release the port without a reboot?

Edit:

  • The program in question is the electron-wrapped Patchwork.
  • This question originates from this github issue.
  • Although finding a solution/bugfix that prevents the issue from occurring in the first place would be ideal, I'm also interested in ways to manually close that port from the terminal.
ktorn
  • Have this exact problem with something completely different (a Go and C++ based Ethereum blockchain application called geth and eth). Sometimes when quitting the app, I'm left with UDP port 30303. Nothing I've tried releases the port other than rebooting. – wojo Mar 15 '17 at 00:31
  • I have the same problem, which has happened to me a few times during my Ctrl-C, make, run development cycle. It does not always happen for me, though. I suspect it may occur only if I (rebuild and) run the process again too soon after killing it. Perhaps this is catching the OS in a bad state where it hasn't finished cleaning up the sockets from the previous run. – Josh Apr 03 '17 at 19:34
  • You should be aware that the port is only closed by the OS about 2 minutes after the port was closed by the process(es)... this is to prevent "old" packets still traveling the wire from being associated with a new port. – Myst Apr 26 '17 at 10:23
  • @Myst which is why I mentioned "I have waited several hours..." – ktorn Apr 28 '17 at 09:50
  • I have the same problem with a plain C app that I am working on, and it happens every time I run the app. Setting SO_REUSEADDR means that I can restart the app and the new instance is able to bind to the UDP port again, but the old socket is still open (according to `netstat`) and I get another socket every time I restart the app. Did you ever find a solution to this problem? – Tamás Jun 27 '17 at 08:48
  • @lidaobing *Looking for an answer drawing from credible and/or official sources.* There is no such proof, because it's not an OS X or JS issue. See my answer for an investigation of the problem. – Daniel Jan 15 '18 at 16:16
  • Also running into this issue with a UDP socket in node. I've tried almost all the answers, the process does not exist, but the port hangs open. Does not show up in `lsof` but does in `netstat` (which has cryptic state hex values, that I cannot find any documentation on). – Evan Purkhiser Jun 08 '20 at 08:01

5 Answers

2

In your code, after you create the socket, but before the bind call, invoke the following:

int val = 1;
setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &val, sizeof(val));

Then call bind. The above will allow the socket bind to succeed even if the port is in use.

Two processes, attempting a recvfrom on the same port, will result in one of the processes receiving the packet, but not the other. And it's not deterministic which one will. So make sure you don't actually have two processes legitimately running and sharing the port.
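
For a Node.js program such as the one in the question, the counterpart of the C snippet above is the `reuseAddr` option of `dgram.createSocket`. The following is a minimal sketch, assuming a UDP/IPv4 socket; the helper name `bindReusable` is hypothetical, and port 8008 is taken from the question. A comment on this answer reports that the project's developers already tried this option without success, so treat it as an illustration of the option rather than a confirmed fix:

```javascript
const dgram = require('dgram');

// Hypothetical helper: creates a UDP socket with SO_REUSEADDR requested
// (reuseAddr: true) and resolves with the socket once bind() has completed.
function bindReusable(port, address = '0.0.0.0') {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket({ type: 'udp4', reuseAddr: true });
    socket.once('error', (err) => {
      // Best-effort cleanup; close() throws if the socket is not running.
      try { socket.close(); } catch (_) { /* ignore */ }
      reject(err);
    });
    socket.bind(port, address, () => resolve(socket));
  });
}

// Usage: try to bind the port from the question, then release it again.
bindReusable(8008)
  .then((socket) => {
    console.log('bound to', socket.address());
    socket.close();
  })
  .catch((err) => console.error('bind failed:', err.code));
```

The same caveat applies as for the C version: if two live processes share the port this way, only one of them receives each datagram.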

selbie
  • It's not my own code, but it's open source, and I have just added details of the project. One of the devs just mentioned that they already tried SO_REUSEADDR (or the equivalent in node.js), but it did not work. – ktorn Nov 10 '16 at 00:48
0

One related question: macOS changed the behavior of SO_REUSEADDR and SO_REUSEPORT:

Behavior of SO_REUSEADDR and SO_REUSEPORT changed?

I am the maintainer of iptux[1]. If I use SO_REUSEPORT, the program can start, but I can't receive messages on this port; all messages go to the unclosed socket, as if into a black hole.

[1] https://github.com/iptux-src/iptux

lidaobing
-1

It is indeed possible to close the port manually without restarting the machine. On various Linux flavors this is usually done with GDB, by issuing syscalls from within the process (for example, a close(fd) syscall on the socket's file descriptor).

The process for that:

  • Open a UDP port: netcat -u 127.0.0.1 33333.
  • Check the UDP port: netstat -npu (u for UDP), which will give you the PID that occupies that port.
  • Run: lsof -np $pid for that PID to get the file descriptor for the socket.
  • Then run GDB for that PID: sudo gdb -p 73599
  • When inside GDB run call close(file_descriptor)

Example:

COMMAND   PID  USER   FD   TYPE   DEVICE SIZE/OFF     NODE NAME
netcat  73599 ubunt  cwd    DIR  259,2     4096 13895497 /home/ubunt/Downloads
netcat  73599 ubunt  rtd    DIR  259,2     4096        2 /
netcat  73599 ubunt  txt    REG  259,2    31248 28835938 /bin/nc.openbsd
netcat  73599 ubunt  mem    REG  259,2    47600 23990813 /lib/x86_64-linux-gnu/libnss_files-2.23.so
netcat  73599 ubunt  mem    REG  259,2  1868984 23990714 /lib/x86_64-linux-gnu/libc-2.23.so
netcat  73599 ubunt  mem    REG  259,2   101200 23990866 /lib/x86_64-linux-gnu/libresolv-2.23.so
netcat  73599 ubunt  mem    REG  259,2    81040 23990710 /lib/x86_64-linux-gnu/libbsd.so.0.8.2
netcat  73599 ubunt  mem    REG  259,2   162632 23990686 /lib/x86_64-linux-gnu/ld-2.23.so
netcat  73599 ubunt    0u   CHR 136,19      0t0       22 /dev/pts/19
netcat  73599 ubunt    1u   CHR 136,19      0t0       22 /dev/pts/19
netcat  73599 ubunt    2u   CHR 136,19      0t0       22 /dev/pts/19
netcat  73599 ubunt    3u  IPv4 22142418    0t0      UDP 127.0.0.1:45255->127.0.0.1:33333

Then GDB:

$ sudo gdb -p 73599
...
(gdb) call close(3u)
$1 = 0

You will see that the port is no longer there:

ubunt@ubunt-MS-7A94:~$ lsof -np 73599
COMMAND   PID  USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
netcat  73599 ubunt  cwd    DIR  259,2     4096 13895497 /home/ubunt/Downloads
netcat  73599 ubunt  rtd    DIR  259,2     4096        2 /
netcat  73599 ubunt  txt    REG  259,2    31248 28835938 /bin/nc.openbsd
netcat  73599 ubunt  mem    REG  259,2    47600 23990813 /lib/x86_64-linux-gnu/libnss_files-2.23.so
netcat  73599 ubunt  mem    REG  259,2  1868984 23990714 /lib/x86_64-linux-gnu/libc-2.23.so
netcat  73599 ubunt  mem    REG  259,2   101200 23990866 /lib/x86_64-linux-gnu/libresolv-2.23.so
netcat  73599 ubunt  mem    REG  259,2    81040 23990710 /lib/x86_64-linux-gnu/libbsd.so.0.8.2
netcat  73599 ubunt  mem    REG  259,2   162632 23990686 /lib/x86_64-linux-gnu/ld-2.23.so
netcat  73599 ubunt    0u   CHR 136,19      0t0       22 /dev/pts/19
netcat  73599 ubunt    1u   CHR 136,19      0t0       22 /dev/pts/19
netcat  73599 ubunt    2u   CHR 136,19      0t0       22 /dev/pts/19

GDB is available for macOS, so it should work for your case as well.

Mindaugas Bernatavičius
  • This works, just use `lldb` instead of `gdb` in the latest OSX (with XCode installed I guess) and a tiny tweak to the `call close` command (needs to be cast to an int). If I have time I'll write up an answer with the exact commands to run on OSX. Thanks! – ktorn Aug 29 '18 at 05:35
  • Just to complete @ktorn comment – in `lldb` you have to use `call (int)close(3u)` – klob Sep 28 '18 at 06:31
  • This does not work as `lldb` just (correctly) informs me that the process does not exist. – Evan Purkhiser Jun 08 '20 at 08:02
  • This answer is less helpful than it could be. It closes the netcat port, not the stuck process. Trying to gdb to the stuck process doesn't work, and the original listen socket is still open. – Carl Mastrangelo Jan 25 '21 at 22:23
-2

The system may keep a socket open while I/O is still in progress, even when the process has died without explicitly closing the socket. If your socket is still open after hours, you are most probably missing something. Try low-level kernel investigation instead of top-level utilities like netstat or lsof.

Disclaimer

I'm not an OS X expert, and most of these commands are for Linux. I'm leaving them here in case someone else has the same problem.

1. Check whether the socket is still alive (optional)

I suggest checking whether there is still traffic on the socket:

 tcpdump -A -s0 port 8008
 tcpdump -A -s0 -i lo port 8008

If you see any data transferred over the socket, you can be sure the process is still active, or maybe one of its children. You can later find the PID with strace.

2. Check the process and its status

Linux has the wonderful procfs. You can learn a great deal from it, including all open file descriptors:

ls -al  /proc/47205/fd

If this produces output and /proc/47205 exists, the PID has not been released, regardless of what ps shows. You will see all open files and their fds. An entry looks like:

133 -> socket:[32242509]

where 133 is the fd number.

Unfortunately, OS X doesn't have a /proc filesystem. The alternative command I found is:

procexp 47205 fds

But I'm not sure it works 100%.

3. Closing the file descriptor (socket) in another process

On Linux there is a nice command:

fuser -k -n udp 8008

This will kill all processes blocking the port. It seems OS X may have fuser too.

Another, more hackish way is to attach to the process with gdb and run the calls inside it, since file descriptor numbers are only valid within the owning process, exactly as @Mindaugas Bernatavičius wrote:

gdb -p 47205
>call shutdown([fd_number],2)
>call close([fd_number])

There is a third way: when possible, you can just restart the whole network stack. Please note that bringing only the loopback interface down and up is not enough. On Linux, run:

systemctl restart network  

4. How to prevent the socket from getting stuck

You should always ensure the socket is closed before your program exits. I have seen many issues with Node.js where sockets stay open. Calling Socket.destroy() will solve the problem.

You could put your socket destroy code here, before exiting the app:

app.on('close', function (code) {
  // User closed the app. Kill the host process.
  process.exit();
});
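
As a sketch of what that cleanup could look like for a UDP socket: as far as I know, Node's `dgram` sockets expose `close()` rather than `destroy()` (the latter belongs to `net.Socket`), so the example below closes the socket first and only then exits. The helper name `closeThenExit` and the injectable `exit` callback are assumptions for illustration, not part of the application's code:

```javascript
const dgram = require('dgram');

// Hypothetical helper: closes a dgram socket, then invokes `exit`.
// `exit` is injectable so the sketch can run without ending the process.
function closeThenExit(socket, exit = () => process.exit()) {
  try {
    // The callback fires once the socket's 'close' event has been emitted.
    socket.close(() => exit());
  } catch (err) {
    // close() throws if the socket is already closed; exit regardless.
    exit();
  }
}

// Demonstration: bind an ephemeral local port, then release it cleanly.
const socket = dgram.createSocket('udp4');
socket.bind(0, '127.0.0.1', () => {
  closeThenExit(socket, () => console.log('socket closed, safe to exit'));
});
```

In the application, a call like this would replace the bare `process.exit()` in the handler above, using the default `exit` callback.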

Daniel
  • -1 because the answer is 100% useless for macOS, which is what the question is about. Even `fuser` works differently. Furthermore, the OP has already shown in multiple ways that *there is no process*, which is what makes the situation puzzling; but the answer keeps assuming that there is one. – hmijail Nov 22 '20 at 23:54
  • Relatedly, even procexp is useless here because it seems to report all network connections *by process* - which is exactly what we don't have. – hmijail Nov 23 '20 at 03:17
-2

Your question looks similar to:


As you said:

Lastly, I tried to restart the loopback interface:

$ sudo ifconfig lo down

$ sudo ifconfig lo up

Did you try to restart all available network interfaces (LAN or WLAN), and not only the loopback?

Instead of ifconfig, you may also use the native macOS command-line utility (from here) to power the device itself off and then on again (adapt en0 to your device name):

networksetup -setairportpower en0 off
networksetup -setairportpower en0 on

You may also finally try to release and renew DHCP with:

sudo dhclient -v -r

Regards

A. STEFANI
  • Mac has many interfaces. I brought all other network interfaces down, but turning off en5 failed, and the bug still exists. `$ sudo ifconfig en5 down ifconfig: down: permission denied` – lidaobing Jan 21 '18 at 12:38
  • Doesn't work; see https://gist.github.com/lidaobing/e7862923e93f38ccb047848cb7a456e3 – lidaobing Jan 22 '18 at 04:23