
This really has me stumped. Here is what I am trying to do:

I am trying to pipe an article from newsboat to a script. The script should then extract the title and URL from the article.

Here is an example article:

Feed: NYT > Home Page
Title: Hit Pause on Brett Kavanaugh
Author: THE EDITORIAL BOARD
Link: https://www.nytimes.com/2018/09/26/opinion/kavanaugh-supreme-court-hearing-delay.html?partner=rss&emc=rss
Date: Thu, 27 Sep 2018 01:58:11 +0200

The integrity of the Supreme Court is at stake.

The article gets piped with a macro from newsboat:

macro R pipe-to "cat | ~/.scripts/newsboat_extract"  

Here is the working script:

#!/bin/bash

cat > ~/newsboat         #I do not really need this file, so if I can cut out saving to a file, I would prefer to

title="$(awk -F: '/^Title:/{for(i=2;i<=NF;++i)print $i}' ~/newsboat)"
url="$(awk -F: '/^Link:/{print $2 ":" $3}' ~/newsboat)"
printf '%s\n' "$title" "$url" >> newsboat_result

This delivers the expected output:

Hit Pause on Brett Kavanaugh
https://www.nytimes.com/2018/09/26/opinion/kavanaugh-supreme-court-hearing-delay.html?partner=rss&emc=rss

I would like to avoid saving to a file. However, storing the article in a variable does not work, for whatever reason. This is the script that does not work:

#!/bin/bash

article=$(cat)

title="$(awk -F: '/^Title:/{for(i=2;i<=NF;++i)print $i}' "$article")"
url="$(awk -F: '/^Link:/{print $2 ":" $3}' "$article")"
printf '%s\n' "$title" "$url" >> newsboat_result

The output becomes:

#empty line
#empty line

I have absolutely no idea why the script would behave like this. It must have something to do with how the variable is stored, right?

Any ideas? I am pretty new to bash scripting and awk, so I would also be thankful for any comments on how to solve this problem more efficiently.

SOLUTION

This did it, thank you!

#!/bin/bash

article=$(cat "${1:--}")

title="$(awk -F: '/^Title:/{for(i=2;i<=NF;++i)print $i}' <<< "$article")"
url="$(awk -F: '/^Link:/{print $2 ":" $3}' <<< "$article")"
printf '%s\n' "$title" "$url" >> newsboat_result
mor3dr3ad
  • FWIW, `$(cat single-file)` can be replaced with `$(< single-file)`, which is faster. See the second paragraph of the *Command Substitution* section in the [bash man page](https://manpage.me/?q=bash) – jpaugh Sep 28 '18 at 16:41
  • `echo $(cat $ARTICLE)` changing your content is as described in [I just assigned a variable, but `echo $variable` shows something else!](https://stackoverflow.com/questions/29378566/i-just-assigned-a-variable-but-echo-variable-shows-something-else) – Charles Duffy Sep 28 '18 at 16:44
  • @jpaugh `$(< $ARTICLE)` wouldn't work in that context since `$ARTICLE` may be empty when input is fed via stdin. – xhienne Sep 28 '18 at 16:47
  • `$(<"${1:-/dev/stdin}")` is less buggy anyhow. `$(cat $ARTICLE)` won't ever work if your filename has whitespace, or the path contains characters in `IFS`, or the path can be expanded as a glob, etc. – Charles Duffy Sep 28 '18 at 16:51
  • @CharlesDuffy Which boils down to what I proposed in my answer. No need to `echo "$(<"${1:-/dev/stdin}")" > ~/newsboat` when you can simply do `cat "${1:--}" > ~/newsboat` which is safer. – xhienne Sep 28 '18 at 16:56
  • That's **if** the OP really needs their on-disk copy at all. Better if they don't -- multiple copies of the script running at the same time would overwrite the file the other instances are depending on and result in confusion. – Charles Duffy Sep 28 '18 at 16:58
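
The "file argument if given, otherwise stdin" idea from the comments above can be sketched roughly as follows. This is only an illustration: `read_input` is a made-up helper name, and it assumes a system where `/dev/stdin` can be opened (bash also emulates that path in redirections).

```shell
#!/bin/bash

# read_input is a hypothetical helper, not part of the original scripts.
# It prints the content of the file named by $1, or of stdin if $1 is
# unset or empty. The quoting keeps paths with spaces or glob characters intact.
read_input() {
    printf '%s\n' "$(<"${1:-/dev/stdin}")"
}

# Case 1: input comes from a file argument.
tmp=$(mktemp)
printf 'Title: From a file\n' > "$tmp"
from_file=$(read_input "$tmp")
rm -f "$tmp"

# Case 2: no argument, so the helper falls back to stdin.
from_stdin=$(printf 'Title: From stdin\n' | read_input)

printf '%s\n' "$from_file" "$from_stdin"
```

Either way the content ends up in a variable, so nothing else in the script has to read the input a second time.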

1 Answer


In your script, you assume that $ARTICLE is a plain file and you perform several operations on it. First you read it with cat and store the content in ~/newsboat, then you read it again with awk to extract the title, then you read it a third time to extract the URL.

This can't work with standard input; it can only be read once.
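
To see the "read once" behaviour in isolation, here is a small sketch (the function name `read_twice` and the sample line are made up for the demonstration). The first reader consumes the pipe up to end-of-file, so the second reader gets nothing.

```shell
#!/bin/bash

# read_twice is a hypothetical demo function: two readers share one stdin.
read_twice() {
    local first second
    first=$(cat)     # consumes everything up to end-of-file
    second=$(cat)    # the stream is already exhausted, so this is empty
    printf 'first=[%s] second=[%s]\n' "$first" "$second"
}

demo_out=$(printf 'Title: Example\n' | read_twice)
printf '%s\n' "$demo_out"
```

This is the same thing that happens when cat and then awk are both pointed at the piped article.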

A quick fix is to work on the copy of it you made in the first operation:

#!/bin/bash

article=$1
feed_copy=~/newsboat
cat "${article:--}" > "$feed_copy"     # Use stdin if parameter is not provided

title="$(awk -F: '/^Title:/ { for(i=2; i<=NF; ++i) print $i }' "$feed_copy")"
url="$(awk -F: '/^Link:/ { print $2 ":" $3 }' "$feed_copy")"

printf '%s\n' "$title" "$url" >> "$feed_copy"

Not tested, obviously, but that should work.

Notes:

  • reserve uppercase variable names for environment variables (this is a mere convention)
  • you should almost always quote your variables (cat "$article", not cat $article) unless you know what you are doing
  • avoid echo, use printf

There are other enhancements that could be made to this script but sorry, I lack the time.


[edit] Since you don't actually need the ~/newsboat file, here is an updated version that follows Charles Duffy's suggestion:

#!/bin/bash

feed_copy=$(cat "${1:--}")
title="$(awk -F: '/^Title:/ { for(i=2; i<=NF; ++i) print $i }' <<< "$feed_copy")"
url="$(awk -F: '/^Link:/ {print $2 ":" $3}' <<< "$feed_copy")"
printf '%s\n' "$title" "$url"
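
As for doing this more efficiently, one possible variant reads the header block once in plain bash, with no awk and no copy at all. Treat it as a sketch under assumptions: `extract_title_url` is a made-up name, and it assumes the headers always look like `Title: ...` and `Link: ...` (one space after the colon) and that the first blank line ends the header block, as in the sample article.

```shell
#!/bin/bash

# extract_title_url is a hypothetical helper: it reads an article on stdin
# and prints the title and the URL, one per line.
extract_title_url() {
    local line title= url=
    while IFS= read -r line; do
        case $line in
            'Title: '*) title=${line#'Title: '} ;;   # strip the header name
            'Link: '*)  url=${line#'Link: '} ;;
            '')         break ;;                     # blank line: headers are over
        esac
    done
    printf '%s\n' "$title" "$url"
}

# Try it on the sample article from the question.
sample='Feed: NYT > Home Page
Title: Hit Pause on Brett Kavanaugh
Author: THE EDITORIAL BOARD
Link: https://www.nytimes.com/2018/09/26/opinion/kavanaugh-supreme-court-hearing-delay.html?partner=rss&emc=rss
Date: Thu, 27 Sep 2018 01:58:11 +0200

The integrity of the Supreme Court is at stake.'

result=$(extract_title_url <<< "$sample")
printf '%s\n' "$result"
```

In the real script you would call `extract_title_url >> newsboat_result` and let it read the piped article directly. Unlike the `-F:` approach, a title that itself contains a colon stays on one line.
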
xhienne
  • `input=$(cat)` would also work, to store the content in a variable; then you don't need a file, and can run other tools `<<<"$input"`. – Charles Duffy Sep 28 '18 at 16:38
  • @CharlesDuffy The file is needed in the original script. I kept that. No need to create other temporary copies of it on disk. – xhienne Sep 28 '18 at 16:41
  • @xhienne Thanks, but this does not work: the output is two empty lines – mor3dr3ad Sep 28 '18 at 19:44
  • Also, I do not really need the file. In fact, I would prefer not to have one. Editing my question accordingly – mor3dr3ad Sep 28 '18 at 19:45
  • @mor3dr3ad Sorry, I made a mistake (`"${article:- -}"` instead of `"${article:--}"`). You must have seen some error message I guess. Answer amended. – xhienne Sep 28 '18 at 20:21
  • thank you, adding `<<<` did it. is this the same thing as cat? – mor3dr3ad Sep 28 '18 at 20:25
  • @mor3dr3ad `<<< "string"` creates a temporary file on disk with "string" as its content, then that temporary file is fed to the command as if you had used `< temp_file`. Your problem was not solved by using `<<<` (my first script should work too) but by storing the content of stdin. – xhienne Sep 28 '18 at 20:29
  • hmm, okay thank you. definitely storing the stdin was my blind spot. however, without <<< it still will not work. seems like I still have a lot to learn about bash scripting... – mor3dr3ad Sep 28 '18 at 20:36
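
On that last point, the likely reason the variable version "still will not work" without `<<<` is that awk treats a trailing argument as a file name, not as text to process. A rough demonstration, using a made-up one-line article and a `': '` field separator chosen only for this example:

```shell
#!/bin/bash

article='Title: Hit Pause on Brett Kavanaugh'

# Passed as an operand, the text is taken as a file NAME. No such file
# exists, so awk complains on stderr (silenced here) and prints nothing.
# "|| true" only keeps the demo going despite awk's non-zero exit status.
as_operand=$(awk -F': ' '/^Title:/ { print $2 }' "$article" 2>/dev/null || true)

# Fed through a here-string, the same text arrives on awk's stdin.
as_stdin=$(awk -F': ' '/^Title:/ { print $2 }' <<< "$article")

printf 'operand=[%s] stdin=[%s]\n' "$as_operand" "$as_stdin"
```

So storing stdin in a variable and handing that variable to awk with `<<<` are two separate steps, and both are needed.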