
I am building an HTTP proxy in C. The proxy is supposed to filter some keywords in the URL and in the HTML content. The first problem I have is with the send() function. When I load the page for the first time, everything works. If I let the page finish loading, the next request is also fine. But if I open www.google.com and start to type, the "instant" feature makes a new request before the last one is complete, and I get the following error:

Program received signal SIGPIPE, Broken pipe.
0x00007ffff7b2efc2 in send () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) up
#1  0x0000000000401f1a in main () at net-ninny2.c:232
232      bytes_sent += send(i, buffer+bytes_sent, buffer_size-bytes_sent, 0);

The code-block that generates the error looks like this:

while(bytes_sent < buffer_size) {
  bytes_sent += send(i, buffer+bytes_sent, buffer_size-bytes_sent, 0);
  printf("* Bytes sent to Client: %d/%d\n", bytes_sent, buffer_size);
}

If you think it's relevant, I'll be happy to provide more code.

My second problem is related to HTTP headers. Since I want to filter keywords in the HTML content, I don't want the content to be encoded. Google doesn't seem to agree with that, and no matter what I put in the Accept-Encoding header, I always get the content back gzip-encoded. Any ideas how to get rid of that?

EDIT:

I am also trying to use fork() to create child processes for the new connections, but that just throws a nasty error:

select: Interrupted system call

I have put it where I create a new file descriptor from an incoming connection:

if (i == listener) {
          // New connection
          remote_addr_len = sizeof remote_addr;
          newfd = accept(listener, (struct sockaddr *)&remote_addr, &remote_addr_len);

          if (newfd == -1) {
            perror("accept");
          }
          else {
            FD_SET(newfd, &master); // Add new connection to master set
            if (newfd > fdmax) {
              fdmax = newfd;
            }
            printf("* New connection from %s on "
                   "socket %d\n",
                   inet_ntop(remote_addr.ss_family, 
                             get_in_addr((struct sockaddr*)&remote_addr),
                             remoteIP, INET6_ADDRSTRLEN), newfd);
            if(!fork()) {
              fprintf(stderr, "!fork()\n");
              close(newfd);
              exit(5);
            }
          }
        }

But I'm guessing I am doing it all wrong.

Cheers!

1 Answer


For your first question, you will want to ignore the SIGPIPE signal:

signal(SIGPIPE, SIG_IGN);

See "How to prevent SIGPIPEs (or handle them properly)" for more detail. With the signal ignored, a send() on a connection that the peer has closed or reset returns -1 (with errno set to EPIPE or ECONNRESET) instead of killing the process, so you will also want to handle that -1 return value appropriately.

For your second question, you may not be able to force Google to send data uncompressed, since Google may assume that all browsers can handle compressed data. You will probably need to embed a gzip decompressor in your proxy. It's certainly not fair to increase the bandwidth requirements of both ends just because you want to filter some keywords.
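Before scanning a response body for keywords, the proxy can at least check whether the server compressed it. The following is a minimal sketch, assuming the response headers are already available as one NUL-terminated string; the function name response_is_gzip is hypothetical. It only detects gzip; actually decompressing the body would still need a library such as zlib.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>   /* strncasecmp */

/* Hypothetical helper: returns true if the header block (which ends at the
 * first empty line) has a Content-Encoding header whose value mentions gzip. */
static bool response_is_gzip(const char *headers)
{
    static const char name[] = "Content-Encoding:";
    const size_t name_len = sizeof name - 1;
    const char *line = headers;

    /* Walk line by line; an empty line marks the end of the headers. */
    while (line != NULL && *line != '\0' && *line != '\r' && *line != '\n') {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        if (len >= name_len && strncasecmp(line, name, name_len) == 0) {
            const char *end = line + len;
            /* Case-insensitive search for "gzip" within the header value. */
            for (const char *p = line + name_len; p + 4 <= end; p++) {
                if (strncasecmp(p, "gzip", 4) == 0)
                    return true;
            }
            return false;
        }
        line = eol ? eol + 1 : NULL;
    }
    return false;
}
```

When this returns true, the body is compressed data and a plain-text keyword search over it will not find anything meaningful, so the proxy would either decompress it first or pass it through unfiltered.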

Greg Hewgill
  • OK, so I can just put "signal(SIGPIPE, SIG_IGN);" somewhere at the beginning of the code, and then every SIGPIPE will make send() fail with -1 instead? In that case, should I just close the file descriptor, since it seems to be broken? Or should I do a shutdown on it to prevent further writing? I am also trying to use fork() to handle new connections as child processes. Can I handle the SIGPIPEs there instead? As for the Google problem, I guess I can filter the GET request instead. – Joakim Kvarnström Feb 10 '12 at 00:29
  • @JoakimKvarnström: When you get a -1 from `send()`, you can close the socket, there's nothing more you can usefully do with it. I wouldn't try to handle the SIGPIPE signal at all, it's a lot easier to check the return value from `send()` than to do anything useful in a SIGPIPE handler. Basically SIGPIPE exists to terminate simple filters like `grep` when the input or output pipe goes away. – Greg Hewgill Feb 10 '12 at 00:38
  • I tried to use the signal(SIGPIPE, SIG_IGN); and close the file descriptor when I got -1 from send. It still generates the error, though. – Joakim Kvarnström Feb 10 '12 at 00:52