I'm using the following command to curl a URL and show the response_code and time_total provided by the --write-out option:

curl -o /dev/null -sL $@ -w "$(printf %-100.100s $@) %{response_code}\t%{time_total}\n"
# https://example.com/?p=5                                200            0.084437

Since some pages show a CMS-based 404, I would like to check the page content for a particular string and show a contains: true/false column in the output:

https://example.com/?p=4                  No               200           0.084437
https://example.com/?p=5                  Yes              200           0.081241

So I guess the real question is: how do I get the -w output while also saving the response body to a variable?

Based on this answer, I'm currently using this workaround:

# <body>\n<code>#<time>
# The leading \n keeps the -w text on its own line even if the body has no trailing newline
c=$(curl -o - -s -L "${1}" -w "\n%{response_code}#%{time_total}")

# Body
cb=$(sed '$ d' <<< "$c")

# Info
ci=$(tail -n1 <<< "$c")
IFS='#' read -r -a cs <<< "$ci"

# CMS-404
ul=<regex magic>
[[ $ul != ${2} ]] && pe="CMS-404" || pe=""

# Print result
# Pass the values as arguments so a % in the URL is not treated as a format specifier
printf '%-100.100s\t%s\t\t%s\t%s\n' "${1}" "${pe}" "${cs[0]}" "${cs[1]}"

It captures the complete output, takes the last line, which contains the -w information, and uses sed to get the remaining content, the body.

Since I would like the -w output to (optionally) span multiple lines, I'm looking for a more reliable way to do this.
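
For illustration, one way to keep the two outputs apart might be to send the body to a temporary file with -o, so that stdout carries only the -w text (which can then span as many lines as needed). This is only a sketch under assumptions: `check_url` and its `needle` argument are hypothetical names, and the column widths are arbitrary.

```bash
#!/usr/bin/env bash

# check_url is a hypothetical helper; "needle" is the string whose presence
# marks a real page (as opposed to a CMS-based 404).
check_url() {
    local url=$1 needle=$2 body_file info found

    # The body goes to a temporary file, so stdout holds only the -w text.
    body_file=$(mktemp) || return 1
    info=$(curl -s -L -o "$body_file" -w '%{response_code}\t%{time_total}' "$url")

    # grep -F treats the needle as a fixed string; -q only sets the exit status.
    if grep -qF -- "$needle" "$body_file"; then found=Yes; else found=No; fi
    rm -f "$body_file"

    printf '%-60.60s %-5s %s\n' "$url" "$found" "$info"
}

# Example call (needs network access):
# check_url "https://example.com/?p=5" "Example Domain"
```

The trade-off is one temporary file per request, in exchange for not having to parse the body and the -w text out of a single string.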


Additional information

Version
curl 7.64.1
bash 3.2.57(1)-release
OS Mac OS X 10.15.7

Edit: based on @Philippe's answer, this seems to work. However, ci="${c#*$'\1'}" is extremely slow. Since I'm using this function for 1000+ URLs, I'm hoping for a faster solution.

I've tested this with the simplified script below:

#!/bin/bash

get() {

    echo "---> ${1}"

    # Request
    c=$(curl -o - -s -L ${1} -w $'\1'"%{response_code}#%{time_total}")

    # Body
    cb="${c%$'\1'*}"
    time_after_body=$(date +%s%N)

    ci="${c#*$'\1'}"
    time_after_info=$(date +%s%N)

    # %N is a GNU date extension and yields nanoseconds, so the difference is in ns
    time_diff="$((time_after_info - time_after_body))"
    echo "TimeDiff: ${time_diff}ns"
    echo "CurlInfo: ${ci}"
}

get "https://stackoverflow.com/questions/65395089/save-curl-response-and-w-to-variables"

 ✗ ./small_so.sh
---> https://stackoverflow.com/questions/65395089/save-curl-response-and-w-to-variables
TimeDiff: 9886412000ns
CurlInfo: 200#0.469687
0stone0

1 Answer

Can you use a marker like $'\1'?

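#!/usr/bin/env bash

# Put the marker byte \1 in front of the -w output so it can be split off later
c=$(curl -o - -s -L "${1}" -w $'\1'"%{response_code}#%{time_total}")

# Body: everything before the last marker
cb="${c%$'\1'*}"

# Info: everything after the first marker
ci="${c#*$'\1'}"

declare -p cb ci

# Usage: ./test.sh "https://example.com/?p=5"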

I have just tested this on macOS:

~/$ /bin/bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin20)
Copyright (C) 2007 Free Software Foundation, Inc.

~/$ /bin/bash test.sh "https://example.com/?p=5"
declare -- cb="<!doctype html>
...
"
declare -- ci="200#0.693779"
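
A minimal sketch of the same split on canned data (no network needed), assuming the -w section is allowed to span two lines. The sample body and the `marker`, `code` and `total` names are made up for the illustration; `##` is used so that only the text after the last marker is kept.

```bash
#!/usr/bin/env bash

# Canned stand-in for the curl output: body, marker byte, then a two-line
# -w section (as produced by e.g. -w $'\1''%{response_code}\n%{time_total}').
marker=$'\1'
c="<html>
<body>Hello</body>
</html>${marker}200
0.084437"

cb=${c%"$marker"*}     # everything before the last marker: the body
ci=${c##*"$marker"}    # everything after the last marker: the -w text

# One -w field per line, so each read consumes one field.
{ read -r code; read -r total; } <<< "$ci"

printf 'code=%s time=%s\n' "$code" "$total"
```

This relies on the marker byte never appearing in the -w text itself; a marker inside the body does no harm, because both expansions split at the last one.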
Philippe
  • Unfortunately the code hangs on `ci="${c#*$'\1'}"`; I've never used those markers before. Are there any bash version limitations? Thanks for the attempt! – 0stone0 Dec 22 '20 at 11:42
  • Updated my answer. `bash 3.2.57` should support the syntax in my script. Can you share the command you use? – Philippe Dec 22 '20 at 13:41
  • I've updated my question. Got it working with your example. Seems like a solid solution, many thanks. However, could you please take a look at the example script in the OP? The `ci="${c#*$'\1'}"` seems to take quite some time. – 0stone0 Dec 29 '20 at 17:36
  • Indeed, it's quite slow on my side as well (18 seconds). Can you try `ci="${c##*$'\1'}"` with a double `#`? – Philippe Dec 29 '20 at 17:48
  • Wow, that's quite a difference, haha. About 0.8 seconds for a single request. Do you mind explaining why this has such a huge impact? – 0stone0 Dec 29 '20 at 20:20
  • `##*$'\1'` removes the longest string up to $'\1', whereas `#*$'\1'` removes the shortest. They should be equivalent in this case, so I don't know either why there is such a huge impact. – Philippe Dec 29 '20 at 22:07
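
The shortest/longest distinction from the last comment can be seen on a small string. The sketch below also shows a hypothetical alternative that slices by offset instead of running a second pattern match; whether that is faster on bash 3.2 has not been measured here. The sample strings and the `offset` variable are made up for the illustration.

```bash
#!/usr/bin/env bash

s='head:middle:tail'

shortest_prefix=${s#*:}    # strips "head:"         -> middle:tail
longest_prefix=${s##*:}    # strips "head:middle:"  -> tail
shortest_suffix=${s%:*}    # strips ":tail"         -> head:middle
longest_suffix=${s%%:*}    # strips ":middle:tail"  -> head

printf '%s\n' "$shortest_prefix" "$longest_prefix" "$shortest_suffix" "$longest_suffix"

# With exactly one marker in the string, both prefix forms give the same result.
marker=$'\1'
c="body text${marker}200#0.469687"
[[ ${c#*"$marker"} == "${c##*"$marker"}" ]] && echo "same result"

# Hypothetical alternative: take the body first, then slice the rest by offset.
cb=${c%"$marker"*}
offset=$(( ${#cb} + 1 ))   # skip the body plus the one-byte marker
ci=${c:$offset}
echo "$ci"
```

The offset variant only makes sense when `cb` was cut at the last marker, as it is here.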