
I am writing a Perl script to extract certain data using curl commands, e.g.:

my $raw_json = `curl -X GET <some website url> -H <some parameters>`;

The issue is that this website sometimes crashes and my code gets stuck at the same place for a long time. I want the code to skip this line and move on if the request takes longer than a specified time, say 30 seconds.

I tried using $SIG{ALRM} in my script as follows:

my $timeout = 30;
eval {
    local $SIG{ALRM} = sub { die "alarm\n" };  # NB: \n required
    alarm $timeout;
    my $raw_json = `curl -X GET <some website url> -H <some parameters>`;
    alarm 0;
};
if ($@) {
    # timed out (or some other error was raised inside the eval)
    print "\nERROR!\n";
    die;
}
else {
    # didn't time out
}

I expected the run to stop after 30 seconds. What happens instead is that I do get the "ERROR!" message printed after 30 seconds, but the curl request keeps running even after that.

    Very related: https://unix.stackexchange.com/questions/94604/does-curl-have-a-timeout TL;DR Use the curl options `--connect-timeout` and `--max-time`. – Ted Lyngmo May 09 '23 at 10:30
  • Whenever you use a system call to do something, know that it is most often a shortcut. Sometimes you run into issues with your shortcut, and you start problem solving and realize it is no longer a shortcut. That is what is happening here. You should look for a Perl library to do the job for you. I googled and found [`WWW::Curl`](https://metacpan.org/pod/WWW::Curl), for example, but I don't know if it is a good one. – TLP May 09 '23 at 10:42

3 Answers


The curl command is running in a subprocess, so you need to stop that subprocess; Perl isn't going to stop it for you. Your alarm interrupts the backticks in the parent, but the child process running curl keeps going.

Use curl's --connect-timeout or --max-time options so you don't need the alarm and curl cleans up after itself.
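
For example, a minimal sketch (the URL and header are placeholders from the question; on a timeout curl exits non-zero):

my $raw_json = `curl -sS --connect-timeout 10 --max-time 30 -X GET <some website url> -H <some parameters>`;
if ($? != 0) {
    # curl failed or timed out; skip this request and move on
    warn "curl failed (exit status ", $? >> 8, ")\n";
}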

As @ikegami suggested, the next simplest thing is IPC::Run, which can handle the details of a timeout for an external process.
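
A sketch of that approach, assuming IPC::Run is installed (the command is a placeholder built from the question):

use IPC::Run qw(run timeout);

my @cmd = ('curl', '-sS', '-X', 'GET', '<some website url>', '-H', '<some parameters>');
my ($in, $out, $err) = ('', '', '');
eval {
    # run() throws if the timer expires before curl finishes
    run \@cmd, \$in, \$out, \$err, timeout(30);
};
if ($@ =~ /timeout/) {
    warn "request timed out, moving on\n";
}
elsif ($@) {
    die $@;  # some other failure
}
my $raw_json = $out;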

Or, if you want to handle the alarm yourself, you need to work at a lower level so you have the PID of the subprocess and can kill it yourself. See perlipc.
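
A rough sketch of that lower-level route, in the spirit of perlipc (Unix-ish systems; the command is again a placeholder):

# A pipe-open returns the child's PID, so on timeout we can
# kill the still-running curl instead of leaving it behind.
my $pid = open(my $fh, '-|', 'curl', '-sS', '<some website url>')
    or die "can't fork: $!";

my $raw_json = eval {
    local $SIG{ALRM} = sub { die "alarm\n" };
    alarm 30;
    local $/;           # slurp all of curl's output
    my $data = <$fh>;
    alarm 0;
    $data;
};
if ($@) {
    die $@ unless $@ eq "alarm\n";  # propagate unexpected errors
    kill 'TERM', $pid;              # timed out: stop curl ourselves
    close $fh;
    warn "timed out, curl killed\n";
}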

brian d foy
  • To avoid going to a lower level, you could switch to IPC::Run which supports timeouts. The first idea sounds best, though. – ikegami May 09 '23 at 13:33

It's best to set curl's own timer, and you can do that using Perl's libcurl wrapper, Net::Curl::Easy:

use warnings;
use strict;
use feature 'say';

use Net::Curl::Easy qw(:constants); 

my $curl = Net::Curl::Easy->new;
$curl->setopt(CURLOPT_URL, "www.example.com");
$curl->setopt(CURLOPT_TIMEOUT, 30);

$curl->perform;

See the constants in curl_easy_setopt, or the straight-up list in easy_setopt_options. Here I use CURLOPT_TIMEOUT, which limits the whole operation, while there is also CURLOPT_CONNECTTIMEOUT to consider; there are yet other timeouts.

This module uses a C-style interface, but then again that will be familiar from using curl itself.


A more realistic use, with the returned document stored in a variable and perform checked for errors (most methods throw on errors):

my $curl = Net::Curl::Easy->new;
$curl->setopt(CURLOPT_URL, "www.example.com");
$curl->setopt(CURLOPT_WRITEDATA, \my $response);  # or declare earlier
$curl->setopt(CURLOPT_TIMEOUT, 30);

eval { $curl->perform };
if ($@ and ref $@ eq "Net::Curl::Easy::Code" ) {
    die "curl eval-ed: $@";
}
elsif ($@) { die $@ }  # probably not curl error, re-raise

say $response; 

Newer Perls have nicer, try-catch-style ways to handle exceptions. See for example this post for an example and links.
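
A sketch of the same error check using the try feature (experimental as of Perl 5.34, so it needs to be enabled explicitly):

use feature 'try';
no warnings 'experimental::try';

try {
    $curl->perform;
}
catch ($e) {
    die "curl eval-ed: $e" if ref $e eq "Net::Curl::Easy::Code";
    die $e;  # probably not a curl error, re-raise
}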

zdim

Another approach is the LWP::UserAgent module, which gives you greater control over what happens with your request: you can define timeouts and everything else you need to send a request and analyze the response.
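
For instance, a minimal sketch (the URL and header are placeholders):

use LWP::UserAgent;

my $ua  = LWP::UserAgent->new(timeout => 30);  # give up after 30 seconds
my $res = $ua->get('<some website url>', 'Some-Header' => '<some value>');

if ($res->is_success) {
    my $raw_json = $res->decoded_content;
}
else {
    warn "request failed: ", $res->status_line, "\n";
}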

HTTP::Tiny is an alternative designed for simple requests. In both cases a module needs to be installed, but that's an easy task.
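
A corresponding sketch with HTTP::Tiny (again with placeholder URL and header; on a timeout it reports a failed response rather than hanging):

use HTTP::Tiny;

my $http = HTTP::Tiny->new(timeout => 30);
my $res  = $http->get('<some website url>',
                      { headers => { 'Some-Header' => '<some value>' } });

if ($res->{success}) {
    my $raw_json = $res->{content};
}
else {
    warn "request failed: $res->{status} $res->{reason}\n";
}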

Miguel Prz