
I have a CSV which has three columns: object-ID, image-url1, image-url2. I'd like to be able to run a bash script that does the following for each row in the CSV:

  1. create a new folder using 'object-ID' as the folder name
  2. download both images into that folder
  3. repeat for each row

I've got this code but it needs some help!

IFS=$'\n';
for file in `cat <filename.csv>`; do
echo "Creating folder $object-ID";
mkdir $object-ID
   echo "Downloading image 1";
   wget $image-url1
   echo "Downloading image 2";
   wget $image-url2
done
Neil
  • You need to parse each row and extract the object id and image urls. Awk would be a good start for that. http://www.joeldare.com/wiki/using_awk_on_csv_files – heldt Mar 18 '16 at 12:45

3 Answers


Try this:

while IFS=, read -r objid url1 url2
do
    echo "Creating folder $objid"
    mkdir -p "$objid"

    # Run in a subshell so the cd does not affect the next iteration
    (
        cd "$objid" || exit 1

        echo "Downloading image 1"
        wget "$url1"

        echo "Downloading image 2"
        wget "$url2"
    )

done < myfile.csv

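This assumes your CSV uses a comma (,) as the separator and that no field contains a comma inside quotes, since read does not understand CSV quoting. To use a different separator, change the IFS=, part of the while loop.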
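To illustrate changing the separator, here is a minimal sketch that splits one row on a delimiter of your choice. The function name split_row and the example.com URLs are made up for illustration, and it assumes plain, unquoted fields:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the original answer): split one row
# on a single-character delimiter and print the three fields, one per line.
# Assumes no quoted fields and no delimiter inside a value.
split_row() {
    local delim=$1 line=$2
    local objid url1 url2
    # IFS is set only for this read, so the rest of the script is unaffected.
    IFS=$delim read -r objid url1 url2 <<< "$line"
    printf '%s\n' "$objid" "$url1" "$url2"
}

# Example with a semicolon-separated row (placeholder URLs).
split_row ';' 'obj-001;http://example.com/a.jpg;http://example.com/b.jpg'
```

In the loop above, the same effect comes from writing IFS=';' in place of IFS=, before read.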

Also, if $objid contains forward slashes (/), mkdir -p will treat it as a path with subdirectories and create all of them. If that's undesirable, you can replace each / in $objid before calling mkdir, like so:

 objid="${objid//\//_}"
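As a quick way to see what that substitution does, here is a small sketch. The function name sanitize_id and the sample value are made up for illustration:

```shell
#!/usr/bin/env bash

# Hypothetical helper: replace every "/" in an object ID with "_"
# so that mkdir creates a single directory instead of nested ones.
sanitize_id() {
    local objid=$1
    # ${var//pattern/replacement} replaces all matches; the slash is escaped as \/
    printf '%s\n' "${objid//\//_}"
}

sanitize_id 'collection/2016/item-7'   # prints collection_2016_item-7
```

Inside the loop, the result has to be assigned back to objid before both mkdir and cd, so that the two commands refer to the same directory name.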
dekkard
  • 6,121
  • 1
  • 16
  • 26
  • Thanks! Works great! The only issue i ran into was a %0D control character on the end of the second URL which I resolved with dos2unix via http://stackoverflow.com/questions/22236197/how-to-remove-0d-from-end-of-url-when-using-wget – Neil Mar 21 '16 at 05:13
  • Ran into a small issue - if the objid contains the / character, it starts creating folders within folders. Not sure how to escape the character so that doesn't happen. Any ideas? – Neil Apr 07 '16 at 00:14
  • Thanks - so that works but objid still stores the old value so it doesn't find the new directory containing _ within the subshell. Not quite sure how to fix. I accepted your answer btw (sorry i'm new). – Neil Apr 08 '16 at 05:17
  • No problem, just do `objid="${objid//\//_}"` prior to `mkdir` – dekkard Apr 08 '16 at 12:30
  • Thanks for your help :) – Neil Apr 11 '16 at 07:01
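The %0D mentioned in the comments above is a URL-encoded carriage return, which is what Windows-style (CRLF) line endings leave at the end of the last field. As an alternative to running dos2unix on the file, a sketch like the following could strip it inside the script. The function name strip_cr is made up for illustration:

```shell
#!/usr/bin/env bash

# Hypothetical helper: remove one trailing carriage return, if present.
strip_cr() {
    local value=$1
    # ${var%pattern} removes the shortest matching suffix; $'\r' is a literal CR.
    value=${value%$'\r'}
    printf '%s\n' "$value"
}

# The last field of a CRLF-terminated row ends in \r (placeholder URL).
strip_cr $'http://example.com/b.jpg\r'
```

In the loop, the equivalent one-liner would be url2=${url2%$'\r'} before the second wget call.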

With read:

while IFS=',' read -r id image_one image_two; do
  # Create the directory only if it does not already exist
  [ ! -d "${id}" ] && mkdir "${id}"
  for img in "${image_one}" "${image_two}"; do
    printf 'Downloading %s\n' "${img}"
    wget -P "${id}" "${img}"
    echo "---"
  done
done < file.csv

For each line, this creates a directory named after the id value if it doesn't already exist, then downloads both images into that directory (using wget's -P option).
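Before downloading anything, it can help to check that the rows are parsed the way you expect. The sketch below is a dry run that only prints the wget commands the loop would run. The function name plan_downloads and the sample row are made up for illustration, and it assumes plain comma-separated fields:

```shell
#!/usr/bin/env bash

# Hypothetical dry-run helper: read CSV rows from stdin and print the
# wget commands that would be run, without touching the network.
plan_downloads() {
    local id image_one image_two img
    while IFS=',' read -r id image_one image_two; do
        for img in "$image_one" "$image_two"; do
            printf 'wget -P %s %s\n' "$id" "$img"
        done
    done
}

# Example with one inline row (placeholder URLs).
plan_downloads <<'EOF'
obj-001,http://example.com/a.jpg,http://example.com/b.jpg
EOF
```

To check a real file, feed it in with plan_downloads < file.csv and read through the output before running the actual loop.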

SLePort

With awk:

awk -F "," '{
     # Emit one line of shell commands per row; bash then runs them
     print "mkdir",$1"; wget -P",$1,$2"; wget -P",$1,$3
}' filename.csv | bash
jijinp