
So I want to export my products into my new website. I have a CSV file with this data:

product id,image1,image2,image3,image4,image5
1,https://img.url/img1-1.png,https://img.url/img1-2.png,https://img.url/img1-3.png,https://img.url/img1-4.png,https://img.url/img1-5.png
2,https://img.url/img2-1.png,https://img.url/img2-2.png,https://img.url/img2-3.png,https://img.url/img2-4.png,https://img.url/img2-5.png

What I want to do is write a script that reads from that file, makes a directory named after each product id, downloads the images of that product and puts them inside its own folder (folder 1 => image1-image5 of product id 1, folder 2 => image1-image5 of product id 2, and so on).

I can make a normal text file instead of using the Excel format if that's easier. Thanks in advance.

Sorry I'm really new here. I haven't done the code yet because I'm clueless, but what I want to do is something like this:

for id in $product_id; do
  mkdir $id && cd $id && curl -o $img1 $img2 $img3 $img4 $img5 && cd ..
done
tripleee
  • What seems to be the problem? – Martin Heralecký Jan 07 '19 at 15:37
  • Sorry, I'm new here, just edited the data to be more readable. I need a script to do something like this: mkdir $product_id && cd $product_id && curl "https://img.url/img1.png https://img.url/img2" I have around 3.000 products in that excel file. I want to automate it using a bash script. Thanks. – Budiman JoJo Jan 07 '19 at 15:41
  • Still the same question. What seems to be the problem? – Martin Heralecký Jan 07 '19 at 15:41
  • @budimanjojo Please [edit] your post and add the program you have already written. – Corion Jan 07 '19 at 15:42
  • What are you calling an excel file exactly? A `.csv`(Comma-Separated Values) which would be readable and writable from Excel would be easy to parse with bash, a binary `.xls` or zipped-XML `.xlsx` much less so. – Aaron Jan 07 '19 at 15:44
  • @Corion sorry, I'm editing it now. – Budiman JoJo Jan 07 '19 at 15:48
  • @Aaron It is an xlsx file – Budiman JoJo Jan 07 '19 at 15:49
  • You can use Excel's save as... to save it as .csv, which will make parsing it with bash easier (even than the plain-text you suggested). I suggest you do so and replace the content of your sample input in your question with the CSV-formatted data, it will make answering the question easier. – Aaron Jan 07 '19 at 15:55
  • @Corion I have edited my question. I don't really know how to set the product_id and img1 img2 img3 img4 img5 variables yet and that's why I'm asking here. Sorry if this is not a good question. – Budiman JoJo Jan 07 '19 at 15:57
  • @BudimanJoJo don't worry about the quality of your question, most questions are badly received because they already have been asked multiple times and therefore seem uninteresting to the trained eye, but it's not something you can easily find out when you don't know how to begin with. As long as you can get unstuck that's what should matter. Once you've formatted your data as CSV, I think [this question](https://stackoverflow.com/questions/4286469/how-to-parse-a-csv-file-in-bash) will provide you all the details you need to complete your script. – Aaron Jan 07 '19 at 16:02
  • Possible duplicate of [How to parse a CSV file in Bash?](https://stackoverflow.com/questions/4286469/how-to-parse-a-csv-file-in-bash) – Aaron Jan 07 '19 at 16:03
  • @Aaron Okay seems like that's what I needed. Thanks. So what I should do with mine is something like: `while IFS=, read -r product_id image1 image2 image3 image4 image5; do`, then do my thing, then `done`. That's it? – Budiman JoJo Jan 07 '19 at 16:20
  • Yeah that's it :) You might want to check how your CSV is formatted with a plain-text editor first, the CSV format is loosely defined and the separator might be commas or semi-colons, and cells might be enclosed in quotes you would need to strip. That might be adjusted when saving as CSV from Excel, but how depends on the Excel version – Aaron Jan 07 '19 at 16:22
  • @Aaron okay thank you :) – Budiman JoJo Jan 07 '19 at 16:38
  • I updated the formatting but I obviously had to speculate about the precise formatting in your CSV data. If you have quotes around some field, that would be an important detail to change. – tripleee Jan 08 '19 at 13:19
  • @tripleee the fields are separated by comma (,) Sorry I didn't notice that until now. – Budiman JoJo Jan 08 '19 at 13:30
  • That's what CSV means (though there are variants which use other delimiters, most prominently TSV which uses tabs). The question is does it say `"1","http://example.com/1.jpg"` with quotes around the fields, or no quoting? – tripleee Jan 08 '19 at 13:33
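
The formatting check suggested in the comments above can also be done from the shell. The following is only a rough sketch: the function name `guess_delimiter` is made up for this example, and it assumes the first line of the file is a header that uses the same separator as the data rows.

```shell
#!/bin/bash

# Hypothetical helper: guess whether a CSV uses commas or semicolons
# by counting both characters in the first (header) line.
guess_delimiter () {
    local file="$1" header commas semis
    header=$(head -n 1 "$file")
    # tr -cd deletes everything except the listed character,
    # so wc -c ends up counting its occurrences
    commas=$(( $(printf '%s' "$header" | tr -cd ',' | wc -c) ))
    semis=$(( $(printf '%s' "$header" | tr -cd ';' | wc -c) ))
    if [ "$semis" -gt "$commas" ]; then
        printf '%s\n' ';'
    else
        printf '%s\n' ','
    fi
}

# Usage: guess_delimiter products.csv
```

Looking at the first few lines with `head -n 3 products.csv` will also show whether the fields are wrapped in double quotes.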

2 Answers


Here is a quick and dirty attempt which should hopefully at least give you an idea of how to handle this.

#!/bin/bash

tr ',' ' ' <products.csv |
while read -r prod urls; do
     mkdir -p "$prod"
     # Potential bug: urls mustn't contain shell metacharacters
     for url in $urls; do
         wget -P "$prod" "$url"
     done
done

You could equivalently do ( cd "$prod" && curl -O "$url" ) if you prefer curl; I generally do, though the availability of an option to set the output directory with wget is convenient.
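
As a sketch of what the curl variant might look like in full: the function name `fetch_with_curl` is invented for this example, and two things are assumed that the question does not confirm, namely that the first line of the file is a header row to skip and that the file may carry Windows line endings from Excel.

```shell
#!/bin/bash

# Hypothetical helper: download every URL listed in a products CSV with curl.
# Assumes unquoted, comma-separated fields and a header on the first line.
fetch_with_curl () {
    local csv="$1" prod urls url
    # tail -n +2 skips the header; tr -d '\r' drops carriage returns
    tail -n +2 "$csv" | tr -d '\r' | tr ',' ' ' |
    while read -r prod urls; do
        mkdir -p "$prod"
        # Same caveat as above: unquoted $urls is split and glob-expanded
        for url in $urls; do
            # The subshell keeps the cd from leaking into the next iteration;
            # -f makes curl fail on HTTP errors instead of saving the error page
            ( cd "$prod" && curl -fsS -O "$url" )
        done
    done
}

# Usage: fetch_with_curl products.csv
```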

If your CSV contains quotes around the fields or you need to handle URLs which contain shell metacharacters (irregular spaces, wildcards which happen to match files in the current directory, etc; but most prominently & which means to run a shell command in the background) perhaps try something like

while IFS=, read -r prod url1 url2 url3 url4 url5; do
    mkdir -p "$prod"
    wget -P "$prod" "$url1"
    wget -P "$prod" "$url2"
    : etc
done <products.csv

which (modulo the fixed quoting) is pretty close to your attempt.
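
The loop above leaves any surrounding double quotes in place. A more complete sketch, under the assumptions that the file has a header line, at most five image columns, and possibly quoted fields and Windows line endings, could look like the following. The names `unquote` and `download_products` are made up for this example.

```shell
#!/bin/bash

# Hypothetical helper: remove one leading and one trailing double quote.
unquote () {
    local value="$1"
    value=${value#\"}
    value=${value%\"}
    printf '%s\n' "$value"
}

# Hypothetical helper: one directory per product id, images downloaded into it.
download_products () {
    local csv="$1" prod url url1 url2 url3 url4 url5
    # Skip the header and drop carriage returns before splitting on commas
    tail -n +2 "$csv" | tr -d '\r' |
    while IFS=, read -r prod url1 url2 url3 url4 url5; do
        prod=$(unquote "$prod")
        mkdir -p "$prod"
        for url in "$url1" "$url2" "$url3" "$url4" "$url5"; do
            url=$(unquote "$url")
            # Empty cells (products with fewer images) are skipped
            if [ -n "$url" ]; then
                wget -P "$prod" "$url"
            fi
        done
    done
}

# Usage: download_products products.csv
```

Because every expansion is quoted, a URL containing `&` or `?` is passed to wget as a single argument. This simple stripping does not handle commas inside quoted fields.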

Or perhaps switch to a less wacky input format, maybe generate it on the fly from the CSV with

awk -F , 'function trim (value) {
       # Trim leading and trailing double quotes
       sub(/^"/, "", value); sub(/"$/, "", value);
       return value; }
  { prod=trim($1);
    for(i=2; i<=NF; ++i) {
        # print space-separated prod, url
        print prod, trim($i) } }' products.csv |
while read -r prod url; do
    mkdir -p "$prod"
    wget -P "$prod" "$url"
done

which splits the CSV into repeated lines with the same product ID and one URL each, and any CSV quoting removed, then just loops over that instead. mkdir with the -p option helpfully doesn't mind if the directory already exists.

tripleee
  • This is really so much easier, thanks man. Yeah about wget or curl, after seeing the suggestions I indeed find it's easier and simpler to use wget. – Budiman JoJo Jan 08 '19 at 16:17
  • It's not that I don't want to accept the answer, but I think it's fair to give it to the first answer. I just upvoted but StackOverflow is telling me that my vote won't go public because my reputation is not enough. I do appreciate your answer though because it's easy to understand. Thanks man. – Budiman JoJo Jan 09 '19 at 17:44
  • Oh, absolutely no problem; thanks for getting back to me. – tripleee Jan 09 '19 at 18:00

If you followed the good advice that @Aaron gave you, this code can help you. As you seem to be new to bash, I commented the code for better comprehension.

#!/bin/bash

# your csv file
myFile=products.csv

# number of lines of file
nLines=$(wc -l $myFile | awk '{print $1}')

echo "Total Lines=$nLines"

# loop over the lines of file
for i in `seq 1 $nLines`;
    do
        # first column value
        id=$(sed -n $(($i+1))p $myFile | awk -F ";" '{print $1}')

        line=$(sed -n $(($i+1))p $myFile)

        #create the folder if not exist
        mkdir $id 2>/dev/null

        # number of images in the line
        nImgs=$(($(echo $line | awk -F ";" '{print NF-1}')-1))

        # go to id folder
        cd $id
        #loop inside the line values
        for j in `seq 2 $nImgs`;
            do
                # getting the image url to download it
                img=$(echo $line | cut -d ";" -f $j)
                echo "Downloading image $img**";echo
                # downloading the image
                wget $img
        done 
        # go back path
        cd ..
done
Rodney Salcedo
  • @tripleee Thank you, you're so nice. – Rodney Salcedo Jan 08 '19 at 12:55
  • Thanks man, this one totally works. I had to change the ";" to "," because my csv file uses a comma as the separator. Also, the script only downloads 3 of the 5 images I have; is it because of the '{print NF-1}' command? – Budiman JoJo Jan 08 '19 at 13:25
  • shellcheck.net is actually a useful resource, you could learn a lot from the warnings you get from your code, though it cannot guess what you are actually trying to do. If you want to throw this up on [codereview.se] and ping me here I'll be happy to provide detailed feedback. – tripleee Jan 08 '19 at 13:59
  • @BudimanJoJo, you're right! Surely your file doesn't have a "," at the end of the last field. As you modified the code to change the delimiter character, I'm sure you can also change it to download all the images you want. I think you understand the code I wrote, and you can write your own. I'm glad to help you. – Rodney Salcedo Jan 08 '19 at 15:53
  • I don't fully understand the whole thing but I'm reading the man pages of seq, awk, cut, etc. Got the basic idea of what they're doing :D Thanks for your help, I marked your answer as the solution. – Budiman JoJo Jan 08 '19 at 16:10
  • @BudimanJoJo, don't worry if you're not understanding the whole code now, I'm sure you will. Keep reading the man pages, and getting advice. You're welcome to this site, I hope you can help others in the future – Rodney Salcedo Jan 08 '19 at 16:21
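
On the three-of-five result discussed in these comments: with six fields per line, `NF-1` minus one is 4, so `seq 2 4` only visits fields 2 to 4. A possible adjustment, keeping the same awk and cut approach, is to loop from field 2 up to `NF` itself. This is only a sketch; the function name `download_all_fields` is invented here, and it assumes a comma-separated file with a header line and no quoted fields.

```shell
#!/bin/bash

# Hypothetical helper: download every image field of every product line.
download_all_fields () {
    local csv="$1" line id nfields j img
    # Skip the header and drop carriage returns
    tail -n +2 "$csv" | tr -d '\r' |
    while IFS= read -r line; do
        # Ignore blank lines
        [ -n "$line" ] || continue
        id=$(printf '%s\n' "$line" | cut -d , -f 1)
        # NF is the total number of fields, including the id in field 1
        nfields=$(printf '%s\n' "$line" | awk -F , '{print NF}')
        mkdir -p "$id"
        j=2
        while [ "$j" -le "$nfields" ]; do
            img=$(printf '%s\n' "$line" | cut -d , -f "$j")
            if [ -n "$img" ]; then
                echo "Downloading image $img"
                # -P saves into the product folder without needing cd
                wget -P "$id" "$img"
            fi
            j=$((j + 1))
        done
    done
}

# Usage: download_all_fields products.csv
```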