Scenario:

Before it is updated at a scheduled time, a web page returns an HTTP status code of 503. Once new data is added to the page after the scheduled time, the HTTP status code changes to 200.

Goal:

The aim is to detect this change in HTTP status code from 503 to 200 as fast as possible using a non-blocking loop. With the current code shown further below, a WHILE loop successfully listens for the change in HTTP status code and prints a success statement. Once 200 is detected, a break statement stops the loop.

However, it seems that the program must wait for a response every time an HTTP request is made before moving on to the next WHILE loop iteration, so it behaves in a blocking manner.

Question:

Using libcurl in C++, how can the program below be modified to transmit requests (to a single URL) and detect an HTTP status code change without having to wait for each response before sending the next request?

Please note: I am aware that excessive requests may be deemed as unfriendly (this is an experiment for my own URL).

Before posting this question, the following SO questions and resources have been consulted:

What's been tried so far:

  • Using multi-threading with a FOR loop in C to repeatedly call a function that detects the HTTP code change, which gave a slight latency advantage. See code here: https://pastebin.com/73dBwkq3
  • Using OpenMP, again with a FOR loop instead of the original WHILE loop. The latency advantage wasn't substantial.
  • Following the C tutorials in the libcurl documentation to try to build a program that listens to just one URL for changes using the asynchronous multi interface. I found this difficult.

Current attempt using curl_easy_setopt:

#include <iostream>
#include <iomanip>
#include <vector>
#include <string>
#include <curl/curl.h>

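// Write callback: appends each received chunk of the body to the std::vector<char>
// passed in through CURLOPT_WRITEDATA.
size_t write_callback(char *ptr, size_t size, size_t nmemb, void *userdata) {

        std::vector<char> *response = static_cast<std::vector<char> *>(userdata);
        const size_t total = size * nmemb; // chunk length in bytes (size is always 1)
        response->insert(response->end(), ptr, ptr + total);
        return total; // returning anything else makes libcurl abort the transfer
}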

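long request(CURL *curl, const std::string &url) {

        std::vector<char> response;
        long response_code = 0; // stays 0 if libcurl has no response code to report

        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        // NOTE: this is called before curl_easy_perform(), so it reports the
        // status of the previous transfer on this handle, not the one below.
        curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

        // curl_easy_perform() blocks until the whole transfer has finished.
        CURLcode res = curl_easy_perform(curl);
        (void)res; // transfer errors are not handled in this attempt

        if (response_code == 200) {
                std::cout << "SUCCESS" << std::endl;
        }

        return response_code;
}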

int main() {
        curl_global_init(CURL_GLOBAL_ALL);
        CURL *curl = curl_easy_init();

        while (true) {
                long response_code = request(curl, "www.example.com");
                if (response_code == 200) {
                        break; // Page updated
                }
        }

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return 0;
}

Summary:

Using C++ and libcurl, does anyone know how a WHILE loop can be used to repeatedly send a request to one URL only, without having to wait for the response in between sending requests? The aim of this is to detect the change as quickly as possible.

I understand that there is ample libcurl documentation, but have had difficulties grasping the multi-interface aspects to help apply them to this issue.

p.luck
  • `transmit requests ... without having to wait for the response before sending another request` - If the server processes, say, 10 requests/sec and you are sending 100 requests/sec, you will eventually either DoS the server completely or run out of local sockets/buffers. We usually want to limit the number of pending requests if we expect the server to be the bottleneck. – dewaffled Jun 08 '21 at 23:30
  • Well you have to wait for the response one way or another, either in `recv()` in blocking mode, or in `select()` in non-blocking mode. Just spin-looping around `recv()` in non-blocking mode without a `select()` doesn't buy you anything except a smoked CPU. It won't make the HTTP response code arrive any faster. Your question is founded on a false assumption. – user207421 Jun 09 '21 at 02:06
  • Thanks both for your replies. @user207421 I think what I was trying to attempt was to send a request (call it "request 1"), and as "request 1" travels to the dest URL, a "request 2" is sent just before "response 1" is received from "request 1". So almost a "staggered" approach where the loop isn't blocked and can transmit two requests without having to wait for the response in between the two request transmissions. In a blocking loop, e.g. I would send 5 requests/sec. With the non-blocking loop, I'd try to double that to 10 requests/sec. Could this be possible with the `curl multi interface`? – p.luck Jun 09 '21 at 12:27

1 Answer

/* get us the resource without a body - use HEAD! */
curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);

If HEAD does not work for you (the server may reject HEAD requests), another solution is to abort non-200 responses from a header callback:

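size_t header_callback(char *buffer, size_t size, size_t nitems, void *userdata) {
  // The easy handle is passed in through CURLOPT_HEADERDATA (set below).
  CURL *curl = static_cast<CURL *>(userdata);
  long response_code = 0;
  curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
  if (response_code != 200)
    return 0;  // Aborts the request.
  return size * nitems;  // Header line accepted, keep going.
}

curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, header_callback);
curl_easy_setopt(curl, CURLOPT_HEADERDATA, curl);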

The second solution still consumes some network traffic, so HEAD is the better choice. Once you receive 200, you can issue a regular GET request to fetch the body.

273K
  • Thanks for your response. I should have added that I would need the body of the message later on once the status code of 200 was detected. – p.luck Jun 08 '21 at 23:21
  • You have a strange `curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code)` call before the request is performed. – 273K Jun 08 '21 at 23:26