Get data from one file to another (Bash) - Web Scraping

Question

I am doing web scraping with bash. I have these URLs saved in a file called URL.txt.

?daypartId=1&amp;catId=1
?daypartId=1&amp;catId=11
?daypartId=1&amp;catId=2

I want to read these URLs into an array in another file, main.sh, and append each one to the base URL https://www.mcdelivery.com.pk/pk/browse/menu.html**(append here)**. Every entry in URL.txt should be appended to the end of the base URL, one at a time.

I have written code that reads the entries from URL.txt, but it fails to append them to the base URL one by one.

#!/bin/bash
ARRAY=()
while read -r LINE
do
    ARRAY+=("$LINE")
done < URL.txt

for LINE in "${ARRAY[@]}"
do
    echo "$LINE"
    curl "https://www.mcdelivery.com.pk/pk/browse/menu.html$LINE" | grep -o '<span class="starting-price">.*</span>' | sed 's/<[^>]\+>//g' >> price.txt
done

I just need help with the loop, so that I can append each entry from URL.txt to the end of the base URL in main.sh.

Are you asking how to append a string and a variable in bash? Does [this post](https://stackoverflow.com/questions/4181703/how-to-concatenate-string-variables-in-bash) answer your question? — that other guy, Jun 06 '20 at 18:35
No, I actually want to append the URL from another file to the end of the base url so that it can navigate to the website and fetch the tags that I am giving it. — , Jun 07 '20 at 07:34
`ARRAY=() while read -r LINE do ARRAY+=("$LINE") done < URL.txt for LINE in "${ARRAY[@]}" do echo $LINE curl https://www.mcdelivery.com.pk/pk/browse/menu.html$LINE | grep -o '<span class="starting-price">.*</span>' | sed 's/<[^>]\+>//g' >> price.txt done` I have come up with this code, but the output repeats itself, as if it only gives the output of the main page. Can you please spot the error? — , Jun 07 '20 at 09:11
@alecxs when I try your code it gives an error in the URL variable `line 14: https://www.mcdelivery.com.pk/pk/browse/menu.html: No such file or directory` what am I doing wrong here? — , Jun 07 '20 at 09:21
@alecxs I have multiple URLs in the array and I am appending the URL to the base URL from the array in the loop. Looking forward towards your answer — , Jun 07 '20 at 10:30
your code is working fine to me, just add `[ "$LINE" ] && curl` (skip empty lines in URL.txt) — alecxs, Jun 07 '20 at 10:47
@alecxs my code gives only the output from one page, and it repeats the same output ```Rs 398 Rs 487 Rs 841 Rs 752 Rs 398 Rs 398 Rs 487 Rs 841 Rs 752 Rs 398``` — , Jun 07 '20 at 11:10
Does this answer your question? [Reading input files by line using read command in shell scripting skips last line](https://stackoverflow.com/questions/17268113/reading-input-files-by-line-using-read-command-in-shell-scripting-skips-last-lin) — alecxs, Jun 07 '20 at 11:13
I have removed the sed command to see if the output differs, but the output remains the same; the only addition is the HTML tags. I used sed to remove the HTML tags — , Jun 07 '20 at 11:26
@alecxs I have updated the code in my question, please review. I am really stuck on this problem. The output keeps repeating itself. — , Jun 07 '20 at 12:01
The problem is solved; there was an error in the URLs. Thanks everyone!! — , Jun 07 '20 at 14:50

alecxs · Accepted Answer · 2020-06-07T14:32:49.497

Regarding your grep | sed pipeline, I can't help because I don't know the expected output.
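
For reference, here is a small sketch of what that pipeline does to a single line of input. The HTML fragment is hypothetical (the real page markup may differ), `extract_price` is a made-up helper name, and the sed pattern uses `[^>]*` in place of the GNU-specific `\+` so it also runs with non-GNU sed.

```shell
#!/bin/bash
# Hypothetical HTML fragment; the markup of the real page is an assumption.
sample='<div><span class="starting-price">Rs 398</span></div>'

# extract_price (illustrative name): keep only the matching span,
# then strip every HTML tag from what grep printed.
extract_price() {
    grep -o '<span class="starting-price">.*</span>' | sed 's/<[^>]*>//g'
}

printf '%s\n' "$sample" | extract_price   # prints: Rs 398
```

Note that `.*` is greedy: if a line holds several such spans, grep returns everything from the first opening tag to the last closing tag as one match.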

This example demonstrates why the base URL gets passed to curl without the URI appended: empty lines in the input file end up as empty entries in the array.

#!/bin/bash

# just for demo
> URI.txt
URI='?daypartId=1&amp;catId='
URL=https://www.mcdelivery.com.pk/pk/browse/menu.html

# just for demo
for id in 1 11 2
  do
    echo -e "${URI}${id}" | tee -a URI.txt
    # reason why it fails
    echo -e "\n\n\n" >> URI.txt
done

ARRAY=()
while read -r LINE || [[ -n $LINE ]]
do
    ## how to prevent
    #[ "$LINE" ] && \
    ARRAY+=("$LINE")
done < URI.txt

for LINE in "${ARRAY[@]}"
  do
    # just for demo
    echo -e "LINE='$LINE'"
    # skip empty lines
    [ "$LINE" ] && curl "${URL}${LINE}" | grep -o '<span class="starting-price">.*</span>' | sed 's/<[^>]\+>//g' >> price.txt 
done

exit 0
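
The same idea can be sketched more compactly by skipping empty lines while reading, so that curl never receives the bare base URL. This is a minimal sketch, assuming the file holds one query string per line. The names `BASE_URL` and `build_urls` are made up for illustration, and the temporary demo file stands in for URL.txt.

```shell
#!/bin/bash
# BASE_URL and build_urls are illustrative names, not from the original post.
BASE_URL='https://www.mcdelivery.com.pk/pk/browse/menu.html'

# Print one full URL per non-empty line of the file given as $1.
build_urls() {
    # "|| [ -n "$line" ]" keeps a last line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue          # skip empty lines
        printf '%s%s\n' "$BASE_URL" "$line"
    done < "$1"
}

# Demo input: the three entries from the question, with empty lines between.
demo_file=$(mktemp)
printf '%s\n' '?daypartId=1&amp;catId=1' '' \
              '?daypartId=1&amp;catId=11' '' \
              '?daypartId=1&amp;catId=2' > "$demo_file"

build_urls "$demo_file"
rm -f "$demo_file"

# To fetch each page, the URLs could then be fed to curl, for example:
#   build_urls URL.txt | while IFS= read -r url; do curl -s "$url"; done
```

Building the URLs in a separate step makes it easy to print and inspect them before any request is sent.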
