How to iterate lines from a file and read fields into variables

Question

I have a file hotfix_final that looks like this:

https://download.abc.com  06/24/2019
https://download.abc.com  06/26/2019
https://download.abc.com  07/05/2019

I need to write a shell script that reads the file line by line and then takes the date that is next to the link into a variable so that I can compare it to the current date. If the dates are equal and usage_type = 4, I need the script to wget the link.

What I tried so far:

usage_type=$( cat /opt/abc/ps/usage.txt )
current_date=$( date +%x )
lines=$( wc -l /home/abc/hotfix_final | awk '{print $1}' )

count=0

while $count <= $lines; do

    hf_link=$( awk if( NR=$count ) '{print $1}' hotfix_final )
    relase_date=$( awk if( NR=$count ) '{print $2}' hotfix_final )
    count=$(( count+1 ))

done < hotfix_final

In the above example, I used:

$lines to show the max number of lines to read.
$hf_link to get the link
$release_date to get the date next to $hf_link

Now, I am not sure how to write the part that checks whether $usage_type == 4 and $current_date = $relase_date are true, and if so, wget the link. This needs to be done individually for each line of the file.

There are many many many mistakes in this script. **(1)** Quote your awk program properly: `awk '{if( NR==$count ) print $1}'`. **(2)** `if` statements should do comparisons, not assignments: `if(NR == $count)`. **(3)** Shell variables should be passed to awk correctly: `awk -v count=$count '{if(NR == count) print $1}'`. **(4)** Don't parse a file in bash with a while loop the way you do; just use a proper `while` in combination with `read`, and you can skip the awk calls that extract the variables. See the answer by [Léa Gris](https://stackoverflow.com/a/57608652/8344060) – kvantour, Aug 22 '19 at 12:08
See https://stackoverflow.com/a/38627863/874188 for why reading line numbers is very often an antipattern. – tripleee, Aug 22 '19 at 12:22
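
Points (1) to (3) of kvantour's comment can be tried out on a throwaway copy of the data. This is a minimal sketch, assuming a temporary file created with `mktemp` and a hypothetical `count` of 2; it only illustrates the quoting and `-v` mechanics and is not a recommendation to look lines up by number:

```shell
#!/bin/sh
# Sample data in a temporary file (a stand-in for hotfix_final)
sample=$(mktemp)
printf '%s\n' \
  'https://download.abc.com  06/24/2019' \
  'https://download.abc.com  06/26/2019' \
  'https://download.abc.com  07/05/2019' >"$sample"

count=2   # hypothetical line number, chosen for illustration

# -v copies the shell value into an awk variable; the awk program stays
# inside single quotes, and NR == count is a comparison, not an assignment
release_date=$(awk -v count="$count" 'NR == count { print $2 }' "$sample")
echo "$release_date"

rm -f "$sample"
```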

Léa Gris · Accepted Answer · 2019-08-22T12:34:14.250

3

Can be done with a few fixes to your script:

You need to quote your variables to prevent their values from being split on spaces or on any other character listed in the $IFS variable.
date +%x returns the date in a different format on a system with a different locale setting in the $LC_TIME environment variable.
The %x format would be MM/DD/YYYY when setting LC_TIME=en_US, but there is a tiny (admittedly very unlikely) chance that the en_US locale is not available on the system.
It is therefore preferable to use the explicit, locale-independent format +%m/%d/%Y, which matches the MM/DD/YYYY dates in the file.

Here is a fixed version:

#!/usr/bin/env bash
# Early exit this script if the usage.txt file does not contain the value 4
grep -Fqx 4 /opt/abc/ps/usage.txt || exit

# Store current date in the MM/DD/YYYY format
current_date="$(date +%m/%d/%Y)"

# Iterate each line from hotfix_final
# and read the variables hf_link and release_date
while read -r hf_link release_date; do
  if [ "$current_date" = "$release_date" ]; then
    wget "$hf_link"
  fi
done </home/abc/hotfix_final # Set the file as input for the whole while loop
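
To try the loop without downloading anything, the same `while read` pattern can be wrapped in a small dry run. This is only a sketch under assumptions: `print_due_links` is a hypothetical helper name, the `/a`, `/b`, `/c` paths are made up so the rows can be told apart, and `echo` stands in for `wget`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print, rather than download, every link in file $2
# whose date field equals the date passed as $1.
print_due_links() {
  while read -r hf_link release_date; do
    if [ "$1" = "$release_date" ]; then
      echo "would fetch: $hf_link"
    fi
  done <"$2"
}

# Throwaway sample data; the /a, /b, /c paths are invented for illustration
sample=$(mktemp)
printf '%s\n' \
  'https://download.abc.com/a  06/24/2019' \
  'https://download.abc.com/b  06/26/2019' \
  'https://download.abc.com/c  07/05/2019' >"$sample"

# A fixed date keeps the result reproducible; the script above uses today's date
result=$(print_due_links 06/26/2019 "$sample")
echo "$result"

rm -f "$sample"
```

Passing the date in as an argument makes it easy to check the matching logic against a known row before pointing the script at the real file.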

edited Aug 22 '19 at 12:34

answered Aug 22 '19 at 11:47


but how does shell know that for each row of the ```hotfix_final``` file ```$hf_link``` represents the links while ```$release_date``` represents the dates next to the links? – Bogdan Aug 22 '19 at 11:54
The `read` command reads one line at a time. When given several variable names, it splits the line on spaces or on the characters defined in the `$IFS` variable, and assigns the resulting fields to the variables in the order they are listed. – Léa Gris Aug 22 '19 at 11:59

You want to exit early if `usage_type` is not 4, instead of comparing it again and again inside the loop. – tripleee Aug 22 '19 at 12:19
You only moved it, it's still being compared on every iteration. – tripleee Aug 22 '19 at 12:25
Sorry for the `fgrep -f` typo in my answer, you want to fix that here too. – tripleee Aug 22 '19 at 12:33
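
The field splitting that Léa Gris describes for `read` in the comments above can be seen in isolation. This is a small sketch with here-documents as input; the variable names `first`, `rest`, `csv_link`, and `csv_date`, and the comma-separated line, are made up for illustration:

```shell
#!/bin/sh
# Default IFS: runs of spaces or tabs separate the fields
read -r hf_link release_date <<'EOF'
https://download.abc.com  06/24/2019
EOF
echo "link=$hf_link date=$release_date"

# With more fields than variables, the last variable receives the remainder
read -r first rest <<'EOF'
one two three
EOF
echo "first=$first rest=$rest"

# IFS can be set for a single read, here to split on commas instead
IFS=, read -r csv_link csv_date <<'EOF'
https://download.abc.com,06/24/2019
EOF
echo "link=$csv_link date=$csv_date"
```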

tripleee · Answer 2 · 2019-08-22T12:33:19.507

1

Here's a refactoring of the accepted answer to avoid the fugly while read -r loop.

#!/bin/sh

grep -Fqx 4 /opt/abc/ps/usage.txt || exit

awk -v current_date="$(date +%m/%d/%Y)" '
    $2 == current_date { print $1 }' /home/abc/hotfix_final |
xargs -r -n 1 wget

The -r option to xargs is a GNU extension; if you don't have it, it's not critical, but helps avoid an error message when the Awk script does not produce any output.
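
The Awk filter can be checked on sample data before any download happens. This is a sketch under assumptions: the `/a`, `/b`, `/c` paths are invented, the date is fixed rather than taken from `date`, and `echo wget` replaces the real `wget` call so that the generated commands are only printed:

```shell
#!/bin/sh
# Throwaway sample data; the /a, /b, /c paths are invented for illustration
sample=$(mktemp)
printf '%s\n' \
  'https://download.abc.com/a  06/24/2019' \
  'https://download.abc.com/b  06/26/2019' \
  'https://download.abc.com/c  07/05/2019' >"$sample"

# Print the first field of every line whose second field equals the date
matches=$(awk -v current_date='07/05/2019' \
  '$2 == current_date { print $1 }' "$sample")
echo "$matches"

# xargs -n 1 runs the command once per input line; echo shows what would run
commands=$(printf '%s\n' "$matches" | xargs -n 1 echo wget)
echo "$commands"

rm -f "$sample"
```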

In your next project, you want to make sure you use a less insane date format in computer-readable files.
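
One common choice for such a format is ISO 8601 (YYYY-MM-DD), which is unambiguous and sorts chronologically as plain text. This is a rough sketch of converting the existing dates, where `to_iso` is a hypothetical helper name and the input is assumed to always be MM/DD/YYYY:

```shell
#!/bin/sh
# Hypothetical helper: rewrite MM/DD/YYYY as YYYY-MM-DD
to_iso() {
  printf '%s\n' "$1" | awk -F/ '{ printf "%s-%s-%s\n", $3, $1, $2 }'
}

iso=$(to_iso 06/24/2019)
echo "$iso"

# ISO dates sort chronologically with an ordinary text sort
earliest=$(printf '%s\n' 2019-07-05 2019-06-24 2019-06-26 | sort | head -n 1)
echo "$earliest"

# Today's date in the same format, for comparing against such a file
today=$(date +%Y-%m-%d)
echo "$today"
```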

edited Aug 22 '19 at 12:33

answered Aug 22 '19 at 12:21


I guess you could do it in a stand-alone awk script with its own `#!/usr/bin/env awk` shebang – Léa Gris Aug 22 '19 at 12:27
I started with the `usage.txt` handling inside the Awk script too but it made less sense to someone who is probably not very familiar with Awk so I wanted to keep it very simple and structured. – tripleee Aug 22 '19 at 12:32

score 1 · Answer 3 · answered Aug 22 '19 at 17:24

This might work for you (GNU Parallel):

[ $(<usageFile) -eq 4 ] && 
parallel -a fixFile -C' +' [ {2} = $(date +%m/%d/%Y) ] \&\& wget {1}

Use test to interrogate the usage file and, if it is set to 4, use parallel to complete the task. Parallel reads the fix file, and the -C option, given a regexp matching one or more spaces, splits each line into columns: {1} is the URL and {2} is the date. Test is used again to check the date column against today's date, and on a match wget fetches the URL.
