When I use nc to listen on a port, it shows:

nc -l -vv -p 21000

retrying local 0.0.0.0:21000 : Address already in use
Can't grab 0.0.0.0:21000 with bind

But I cannot find which process occupies this port with netstat or ss:

netstat -an|grep 21000 

(nothing found)

ss -a|grep 21000 

(nothing found)

The port is occupied by my Java program. The code is:

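import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class Test1 {

    public static void main(String[] args) throws InterruptedException {
        Socket s = new Socket();
        try {
            // Bind only; the socket is never connected and never listens.
            s.bind(new InetSocketAddress("127.0.0.1", 21000));
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Keep the process, and with it the bound socket, alive.
        Thread.sleep(500000000000L);
    }
}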
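
The "Address already in use" conflict can also be reproduced from Java alone, without nc. The following is a minimal sketch, not part of the original program: the class name `BindConflictDemo` and the method `secondBindIsRefused` are made up for illustration. It binds one plain `Socket` to a loopback port chosen by the kernel, then tries to bind a second one to the same address. Both sockets explicitly disable `SO_REUSEADDR`, which is an assumption made here so the result does not depend on platform defaults.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BindConflictDemo {

    /**
     * Binds two plain sockets to the same loopback address and port.
     * Returns true if the second bind is refused.
     */
    public static boolean secondBindIsRefused() {
        try (Socket first = new Socket(); Socket second = new Socket()) {
            // Assumption: reuse is disabled on both sockets so the outcome
            // does not depend on the platform's default for SO_REUSEADDR.
            first.setReuseAddress(false);
            second.setReuseAddress(false);

            // Port 0 lets the kernel pick a free port, so the demo does not
            // depend on port 21000 being available.
            first.bind(new InetSocketAddress("127.0.0.1", 0));
            int port = first.getLocalPort();

            try {
                second.bind(new InetSocketAddress("127.0.0.1", port));
                return false; // the second bind unexpectedly succeeded
            } catch (IOException e) {
                // Typically a java.net.BindException ("Address already in use").
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind refused: " + secondBindIsRefused());
    }
}
```

The first socket never calls connect or listen, which is the same state the `Test1` socket is in while the process sleeps.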

I bind the socket but never call connect or listen on it. In /proc/[java task id]/fd I can see that the inode of this socket is "socket:[3073501]", but I can't find that inode or the port in /proc/net/tcp or /proc/net/tcp6.
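
When searching /proc/net/tcp by hand, note that addresses and ports are printed in hexadecimal, so port 21000 appears there as 5208, not as 21000. The helper below is a small sketch for converting between the two notations. The class name `ProcNetTcpAddress` and its methods are hypothetical, and the byte order of the IPv4 part assumes a little-endian machine such as x86.

```java
public class ProcNetTcpAddress {

    /**
     * Decodes an IPv4 "address:port" field from /proc/net/tcp,
     * for example "0100007F:5208".
     * Assumption: little-endian host, so the address bytes appear reversed.
     */
    public static String decode(String field) {
        String[] parts = field.split(":");
        long addr = Long.parseLong(parts[0], 16);
        int port = Integer.parseInt(parts[1], 16);
        return (addr & 0xFF) + "."
                + ((addr >> 8) & 0xFF) + "."
                + ((addr >> 16) & 0xFF) + "."
                + ((addr >> 24) & 0xFF) + ":" + port;
    }

    /** Formats a port the way it is printed in /proc/net/tcp (4 hex digits). */
    public static String encodePort(int port) {
        return String.format("%04X", port);
    }

    public static void main(String[] args) {
        System.out.println(decode("0100007F:5208")); // 127.0.0.1:21000
        System.out.println(encodePort(21000));       // 5208
    }
}
```

A search such as `grep -i ':5208' /proc/net/tcp` would therefore be the equivalent of looking for port 21000 in that file.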

Is there any way to find the process that has bound a socket but does not listen or connect?

Thanks.

I looked at the Linux 3.10.0-327 source code. I think the content of /proc/net/tcp comes from net/ipv4/tcp_ipv4.c.

In the code that tcp_proc_register sets up, I found:

static void *tcp_get_idx(struct seq_file *seq, loff_t pos)      
{
        void *rc;
        struct tcp_iter_state *st = seq->private;

        st->state = TCP_SEQ_STATE_LISTENING;
        rc        = listening_get_idx(seq, &pos);

        if (!rc) {
                st->state = TCP_SEQ_STATE_ESTABLISHED;
                rc        = established_get_idx(seq, pos);
        }

        return rc;
}

It only shows the sockets from tcp_hashinfo that are in the listening or established state. But tcp_hashinfo has three hash tables:

struct inet_bind_hashbucket     *bhash; 
struct inet_listen_hashbucket   listening_hash[INET_LHTABLE_SIZE];
struct inet_ehash_bucket        *ehash;

bhash is probably the one used for bound sockets, but it is not exported in /proc/net/tcp.

Xiing.Liu
  • Wow. I can reproduce this issue exactly as described, and it's not a simple confusion with `/proc/net/tcp` style hex notation or `/etc/services` based names. I also cannot find the port number with `lsof`. `nc -l` still says it's in use, exactly as posted. – that other guy Jun 14 '18 at 03:49
  • The simple solution is not to write code like that, surely? Normally you will connect a `Socket` immediately, and there is usually very little point in binding it at all. – user207421 Jun 14 '18 at 04:53
  • Possible duplicate of [How to investigate ports opened by a certain process in linux?](https://stackoverflow.com/q/942824/608639), [Find original owning process of a Linux socket](https://stackoverflow.com/q/2358518/608639), [How to kill a process running on particular port in Linux?](https://stackoverflow.com/q/11583562/608639), [How do I find and kill process running on a certain port?](https://superuser.com/q/322363/173513), etc. – jww Jun 14 '18 at 17:20
  • @jww Surprisingly no, none of these apply – that other guy Jun 14 '18 at 22:24

1 Answer

I tested your Java program under Ubuntu.

How to find a process that binds the socket but does not listen or connect:

lsof

lsof | grep "can't identify protocol"

You will get a result like:

COMMAND     PID   TID       USER   FD      TYPE             DEVICE SIZE/OFF    NODE NAME
java      29644 29653    stephan   12u     sock                0,7      0t0  312066 can't identify protocol

Please note the TYPE sock and the NAME can't identify protocol.

How does this work? Take a look into the FAQ of lsof:

Why does /proc-based lsof report "can't identify protocol" for some socket files?

/proc-based lsof may report:

  COMMAND PID ... TYPE ... NODE NAME
  pump    226 ... sock ...  309 can't identify protocol

This means that it can't identify the protocol (i.e., the AF_* designation) being used by the open socket file. Lsof identifies protocols by matching the node number associated with the /proc/<PID>/fd entry to the node numbers found in selected files of the /proc/net sub-directory.

...

You may not be able to find the desired node number, because not all kernel protocol modules fully support /proc/net information.

Verify Process

The PID in the lsof output was 29644.

ls -l /proc/29644/fd   

which results in:

...
lrwx------ 1 stephan stephan 64 Jul  7 22:52 11 -> socket:[312064]
lrwx------ 1 stephan stephan 64 Jul  7 22:52 12 -> socket:[312066]
...

and

grep 312066 /proc/net/*

gives an empty result.
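
The manual check above can be automated. The following is a rough sketch, not how lsof itself is implemented: it lists the socket inodes a process holds in /proc/<PID>/fd and reports those that no readable file directly under /proc/net mentions. The class name `UnlistedSockets` and its methods are made up for illustration. Like the `grep` above, it matches the inode number as plain text, so a number that happens to coincide with another column value can hide a socket.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnlistedSockets {

    private static final Pattern SOCKET_LINK = Pattern.compile("socket:\\[(\\d+)\\]");

    /** Returns the inode from a link target such as "socket:[312066]", or -1 otherwise. */
    public static long socketInode(String linkTarget) {
        Matcher m = SOCKET_LINK.matcher(linkTarget);
        return m.matches() ? Long.parseLong(m.group(1)) : -1;
    }

    /** True if any line contains the inode as a whitespace-separated field. */
    public static boolean mentionsInode(List<String> lines, long inode) {
        String wanted = Long.toString(inode);
        for (String line : lines) {
            if (Arrays.asList(line.trim().split("\\s+")).contains(wanted)) {
                return true;
            }
        }
        return false;
    }

    /** Socket inodes held by the process that no readable /proc/net file mentions. */
    public static List<Long> unlistedSocketInodes(String pid) throws IOException {
        List<String> procNetLines = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(Paths.get("/proc/net"))) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-directories
                }
                try {
                    procNetLines.addAll(Files.readAllLines(file, StandardCharsets.ISO_8859_1));
                } catch (IOException unreadable) {
                    // Some files need root; skip them in this sketch.
                }
            }
        }

        List<Long> unlisted = new ArrayList<>();
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(Paths.get("/proc", pid, "fd"))) {
            for (Path fd : fds) {
                long inode;
                try {
                    inode = socketInode(Files.readSymbolicLink(fd).toString());
                } catch (IOException closedMeanwhile) {
                    continue; // the descriptor went away while we were looking
                }
                if (inode >= 0 && !mentionsInode(procNetLines, inode)) {
                    unlisted.add(inode);
                }
            }
        }
        return unlisted;
    }

    public static void main(String[] args) throws IOException {
        for (long inode : unlistedSocketInodes(args[0])) {
            System.out.println("socket:[" + inode + "] is not listed under /proc/net");
        }
    }
}
```

Run it with the PID as the only argument, as the same user as the target process or as root. It only narrows down candidates; it still cannot say which port a reported socket is bound to.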

Stephan Schlecht
  • My `lsof` shows `27401296 protocol: TCP` for the example and `TCP *:21000 (LISTEN)` otherwise (no "can't identify protocol"). It's not ideal since it doesn't allow seeing which program reserves which port, but at least this allows identifying some candidates! – that other guy Jul 09 '18 at 18:47
  • Yes, with the lsof output of the PID in the second column you only get the potential candidates. But there shouldn't be too many. The problem should only occur in two situations: Socket is bound but not listening/connected or sockets are not closed properly after usage. I find your lsof output interesting. I need to see if I can update my lsof version tomorrow. – Stephan Schlecht Jul 09 '18 at 21:32
  • This is 4.89 from Debian testing. – that other guy Jul 09 '18 at 21:45