0

This is the code I have so far, I essentially have a fd sock_comm which is the listening socket, then multiple client socket connections will be established and added. I need to be able to accept new connections but also receive on existing ones, so I am using select. I am running into a problem when trying to iterate through existing connections where the while loop is restarting unexpectedly.

int n = sock_comm+1;
int max_sock;
while (1) {
    max_sock = n;
    printf("sock_comm: %d\n", sock_comm);

    // Clear the fd_set and add back connections
    FD_ZERO(&readfds);
    for (int i=sock_comm; i < max_sock; i++) {
        FD_SET(i, &readfds);
    }
    select(n, &readfds, NULL, NULL, NULL);
    printf("max socket: %d\n", max_sock);
    fd = sock_comm;

    // Iterate through each connection
    while (fd < max_sock) {
        printf("fd: %d\n", fd);
        printf("sock_comm2: %d\n", sock_comm);

        // If activity on current connection
        if (FD_ISSET(fd, &readfds)) {
            printf("Connection on %d\n", fd);

            // If connection on listener, accept 
            if (fd == sock_comm) {
                if ((new_s = accept(sock_comm, (struct sockaddr *)&client_addr, &len)) < 1) {
                    perror("tcpserver: accept");
                    exit(1);
                }
                n++;
                client_info.sockid = new_s;
                // Join handler thread                                    
                pthread_create(&threads[0],NULL,join_handler,&client_info);
                pthread_join(threads[0],&exit_value);
                printf("sock_comm3: %d\n", sock_comm);
            } else {
                // Create and join thread for receiving data
            }
        }
        fd++;
        printf("sock_comm4: %d\n", sock_comm);
    }
 }

I would expect the output after an initial connect to be as such:

sock_comm: 3
max socket: 4
fd: 3
sock_comm2: 3
Connection on 3
sock_comm3: 3
sock_comm4: 3
sock_comm: 3

Then the select process begins again. What is happening though is this:

sock_comm: 3
max socket: 4
fd: 3
sock_comm2: 3
Connection on 3
sock_comm3: 3
sock_comm4: 3
fd: 1
sock_comm2: 3
sock_comm4: 3
fd: 2
sock_comm2: 3
sock_comm4: 3
fd: 3
sock_comm2: 3
Connection on 3

As you can see, the while (fd < max_sock) loop is resetting and somehow starting at 1 but fd never should drop below 3. I don't know what's wrong with my code, does it have to do with using select in a loop? I've never used it before, but it's causing odd behavior. It gets stuck in an accept state when there is nothing to accept.

Join handler:

void *join_handler(struct global_table *rec) {
        int newsock;
        struct packet packet_reg;
        newsock = rec->sockid;

        for (int i=0; i < REQUEST_NO; i++) {
                if (recv(newsock, &packet_reg, sizeof(packet_reg), 0) < 1) {
                        printf("Could not receive RG-%d\n", i+1);
                        exit(1);
                } else {
                        printf("Registered received: %s\n",packet_reg.data);
                }
        }

        // add to table
        pthread_mutex_lock(&my_mutex);
        counter++;
        record[counter].sockid = newsock;
        pthread_mutex_unlock(&my_mutex);

        // send response acknowledging registration
        packet_reg.type = htons(221);
        sprintf(packet_reg.data, "%d", newsock);
        if(send(newsock, &packet_reg, sizeof(packet_reg), 0) < 1) {
                printf("ACK send failed\n");
                exit(1);
        }
        pthread_exit(NULL);
}
TomNash
  • 3,147
  • 2
  • 21
  • 57
  • It's normal to call `select()` in a loop like this. I don't see anything in the code you posted that could set `fd` less than `sock_comm`. Maybe there's erroneous code in the other thread that's overwriting the variable? – Barmar Mar 23 '17 at 01:41
  • 1
    What happens if you make the variable `fd` local to the `while(1)` loop? – Barmar Mar 23 '17 at 01:42
  • The other thread only works with local variables and sends an acknowledgement using the new socket. Making `fd` local to `while(1)` changes the behavior definitely. Now I'm getting mysterious results in the `recv` on the new `fd` created from the accept when no packets are being sent from the client. – TomNash Mar 23 '17 at 01:52
  • `readfds` will _only_ have `sock_comm` in it. Thus, your `recv` will _never_ execute [and you don't want to use `recv` there]. You don't show the thread func, but you'd want to have `main` do `accept` and pass the result to `pthread_create` and have the function do `recv` on it. Also, you want something better than an _immediate_ `pthread_join`. You may want the thread func to run _detached_ so you get parallelism. Note that if you need to pass `client_info` you have to `malloc` one [and thread func does `free`] or you get a race condition. Or, just use `(void *) new_s` – Craig Estey Mar 23 '17 at 01:54
  • @CraigEstey Why do you think it will only have `sock_comm` in it? He has a loop at the top that adds all the FDs from `sock_comm` to `max_sock` to `readfds`. – Barmar Mar 23 '17 at 01:56
  • You probably shouldn't be putting the sockets returned by `accept` into `readfds`. Isn't the other thread supposed to be handling all the communication on those sockets? – Barmar Mar 23 '17 at 01:58
  • 2
    I'm not really sure why you're using `select()` if you're creating a thread for each client. The whole point of the threads is to simplify the multiplexing. – Barmar Mar 23 '17 at 02:00
  • The other thread does registration into a global table by receiving registration packets and sending acknowledgment when received. That goes beyond the scope of this problem. The `recv` block would, when finished, spawn a thread to add the message to a buffer which would be disseminated to subscribed clients by another multicaster, but again is outside the scope of this current problem. This main loop is to listen for activity on each socket, then start the appropriate thread to receive messages or add to registration. – TomNash Mar 23 '17 at 02:08
  • @Barmar Because [AFAICT] `n = sock_comm+1; max_sock = n; for (i = sock_comm; i < max_sock; ++i)` aka `for (i = sock_comm; i < (sock_comm + 1); ++i)` – Craig Estey Mar 23 '17 at 02:08
  • @CraigEstey You missed `n++;` later in the loop. So the next time around, `max_sock = n;` uses the higher value. – Barmar Mar 23 '17 at 02:11
  • @Barmar Yes, you're right about `n++` [and I _did_ look for one]. But, even so, I'm not sure how this helps OP as there's no guarantee that (e.g.) adding `sock_comm+1` [or `+2`, etc] to the `fd_set` has meaning. I'd say adding `new_s` _might_. But, as I said `new_s` should just be the arg to `pthread_create` and let the thread handle all communications on the `new_s` socket. – Craig Estey Mar 23 '17 at 02:20
  • @CraigEstey Yes, this code is poorly designed. The proper way to manage multiple FDs returned by `accept()` is to have an array of all the FDs, then add those to the `fd_set`, rather than assuming that FDs will be sequential. – Barmar Mar 23 '17 at 02:22
  • Another way this is commonly done is to have a master `fd_set` variable, which you call `FD_SET` on when you get a new FD. Each time through the `select()` loop you assign the master variable to the local variable. – Barmar Mar 23 '17 at 02:23
  • @Barmar, yeah I imagine this isn't the best way to do it, I've never done this type of programming before and am piecing together bits of Beej's guide. The final result will have an array of FD's rather than sequential, but for now this works. I'll attempt your suggestion at local and master `FD_SET` – TomNash Mar 23 '17 at 02:28
  • @Barmar OP's code is mix of code to have one [master] thread use `select` and handle connection requests if `fd == sock_comm` and do `recv` to do data transfer. But, what goes in the thread(?), since `new_s` goes to the thread, but it's not valid otherwise (i.e. the `recv` _should_ use it [and _not_ `fd`]. That is, _if_ the `recv` executes at all, it has a _garbage_ [unpredictable] value for `fd` – Craig Estey Mar 23 '17 at 02:29
  • Since changing `fd` to local changed the behavior, I'm guessing the original problem was that `join_handler` was assigning to the global varable `fd`. – Barmar Mar 23 '17 at 02:29
  • `fd` does not appear at all in the join handler, added to end of OP – TomNash Mar 23 '17 at 02:30
  • See my answers: http://stackoverflow.com/questions/37497556/multi-threaded-file-transfer-with-socket/37506686#37506686 and http://stackoverflow.com/questions/35443876/executing-commands-via-sockets-with-popen/35446058#35446058 for some examples – Craig Estey Mar 23 '17 at 02:32
  • suggest, using a `thread pool` as creating/destroying a thread is a (relatively) time consuming process. – user3629249 Mar 23 '17 at 10:28
  • So I'm clearing the FD_SET every pass, but the new socket (created and then passed to the join handler) is stuck on returning true despite there being no activity before the second pass of the select (first pass is accept the client connection then join handler to create socket for that dedicated connection and add to registration table, second pass is now back to listening on the main socket and that client socket). Is there a different function to use for adding/clearing FD_SET? – TomNash Mar 23 '17 at 14:13

0 Answers0