I am currently writing a mini project in C to better understand TCP, TLS, HTTP methods, and C itself.
Here is a simplified snippet of the GET portion of my program (no error checking, removed OpenSSL functions):
    void htmlGET(char * path, char * address, int sockfd) {
        struct pollfd fds[1];
        fds[0].fd = sockfd;
        fds[0].events = POLLIN | POLLHUP | POLLERR;
        char * header;
        header = malloc(strlen(address)+50);
        sprintf(header, "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", path, address);
        write(sockfd, header, strlen(header));
        char buf[BUFSIZE];
        int rcount;
        while(1) {
            poll(fds, 1, 0);
            if (fds[0].revents & (POLLHUP | POLLERR)) { break; }
            else if (fds[0].revents & POLLIN){
                rcount = read(sockfd, buf, sizeof(buf));
                write(1, buf, rcount);
            }
        }
    }
Without polling, my program performs a GET request and receives data just fine. However, I've found that some websites send the response headers first and the rest of the HTML in a separate message, so I added polling to make sure I receive everything. Now, whenever I run this code, the program appears to loop indefinitely, and I haven't been able to find the root cause. Any suggestions on what might be wrong?
Update: The program does work to an extent. I found that it eventually finishes, so I ran time(1) on it. Here's a sample result:
3.33s user 23.74s system 20% cpu 2:14.94 total
Any ideas on why it's so slow? Sometimes the HTML shows up instantly and then the program polls for a very long time; other times the program polls for a very long time before the HTML shows up.
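
For comparison, here is a minimal sketch of how the receive loop could look, offered as a guess at the cause rather than a confirmed diagnosis. It rests on two observations about the snippet above: `poll(fds, 1, 0)` uses a timeout of 0, so it returns immediately and the loop spins instead of sleeping, and the return value of `read` is never checked, so a return of 0 (the peer closed the connection) does not end the loop. The helper name `drain_fd` and the 4096-byte buffer are hypothetical, and the sketch works on a plain file descriptor only; it does not cover the OpenSSL calls that were removed from the snippet.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not from the original program): copy everything that
 * arrives on in_fd to out_fd until the peer closes its end.
 * Returns the number of bytes copied, or -1 on error. */
static long drain_fd(int in_fd, int out_fd)
{
    assert(in_fd >= 0 && out_fd >= 0);

    struct pollfd pfd = { .fd = in_fd, .events = POLLIN };
    char buf[4096];              /* buffer size is an arbitrary assumption */
    long total = 0;

    for (;;) {
        /* A timeout of -1 blocks until something happens on the descriptor.
         * A timeout of 0 would return immediately and busy-wait. */
        int ready = poll(&pfd, 1, -1);
        if (ready < 0) {
            if (errno == EINTR) continue;   /* interrupted by a signal: retry */
            return -1;
        }

        if (pfd.revents & (POLLIN | POLLHUP)) {
            /* Data may still be buffered when POLLHUP is set, so keep
             * reading until read() itself reports end of stream. */
            ssize_t n = read(in_fd, buf, sizeof buf);
            if (n < 0) {
                if (errno == EINTR) continue;
                return -1;
            }
            if (n == 0) break;              /* peer closed: nothing more to read */

            /* write() may be partial, so loop until the chunk is flushed. */
            ssize_t off = 0;
            while (off < n) {
                ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
                if (w < 0) {
                    if (errno == EINTR) continue;
                    return -1;
                }
                off += w;
            }
            total += n;
        } else if (pfd.revents & (POLLERR | POLLNVAL)) {
            return -1;
        }
    }
    return total;
}
```

In the function above this would be called as `drain_fd(sockfd, 1)` after the request is written. Note that the loop only ends when the server closes the connection. With HTTP/1.1 the connection is persistent by default, so the server may hold it open until its idle timeout expires, which might explain the long wait after the HTML has already arrived. Adding a `Connection: close` header to the request, or parsing `Content-Length` and stopping after that many body bytes, are two ways around that.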