-1

Some weeks ago I found in this site a very useful bash script that downloads images from google image results (download images from google with command line)

Although the script is quite complicate for me, I did some simple modifications so as not to rename the results so as to keep the original names.

However, since the last week, the script stopped working... probably Google updated the code or something, and the regexes of the script don't parse the results any more. I don't know enough about google's codes, web programing or regexing to see what is wrong, although I did some educated guesses, but still didn't work.

My (unworking) tweaked script is this

#! /bin/bash

# function to create all dirs til file can be made
function mkdirs {
    file="$1"
    dir="/"

    # convert to full path
    if [ "${file##/*}" ]; then
        file="${PWD}/${file}"
    fi

    # dir name of following dir
    next="${file#/}"

    # while not filename
    while [ "${next//[^\/]/}" ]; do
        # create dir if doesn't exist
        [ -d "${dir}" ] || mkdir "${dir}"
        dir="${dir}/${next%%/*}"
        next="${next#*/}"
    done

    # last directory to make
    [ -d "${dir}" ] || mkdir "${dir}"
}

# get optional 'o' flag, this will open the image after download
getopts 'o' option
[[ $option = 'o' ]] && shift

# parse arguments
count=${1}
shift
query="$@"
[ -z "$query" ] && exit 1  # insufficient arguments

# set user agent, customize this by visiting http://whatsmyuseragent.com/
useragent='Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:31.0) Gecko/20100101 Firefox/31.0'

# construct google link
link="www.google.cz/search?q=${query}\&tbm=isch"

# fetch link for download
imagelink=$(wget -e robots=off --user-agent "$useragent" -qO - "$link" | sed 's/</\n</g' | grep '<a href.*\(png\|jpg\|jpeg\)' | sed 's/.*imgurl=\([^&]*\)\&.*/\1/' | head -n $count | tail -n1)
imagelink="${imagelink%\%*}"

# get file extention (.png, .jpg, .jpeg)
ext=$(echo $imagelink | sed "s/.*\(\.[^\.]*\)$/\1/")

# set default save location and file name change this!!
dir="$PWD"
file="google image"

# get optional second argument, which defines the file name or dir
if [[ $# -eq 2 ]]; then
    if [ -d "$2" ]; then
        dir="$2"
    else
        file="${2}"
        mkdirs "${dir}"
        dir=""
    fi
fi   

# construct image link: add 'echo "${google_image}"'
# after this line for debug output
google_image="${dir}/${file}"

# construct name, append number if file exists
if [[ -e "${google_image}${ext}" ]] ; then
    i=0
    while [[ -e "${google_image}(${i})${ext}" ]] ; do
        ((i++))
    done
    google_image="${google_image}(${i})${ext}"
else
    google_image="${google_image}${ext}"
fi

# get actual picture and store in google_image.$ext
wget --max-redirect 0 -q "${imagelink}"

# if 'o' flag supplied: open image
[[ $option = "o" ]] && gnome-open "${google_image}"

# successful execution, exit code 0
exit 0
Community
  • 1
  • 1
Voprosnik
  • 195
  • 1
  • 9
  • I suggest yout to [ShellCheck](http://www.shellcheck.net/) it an update your question accordingly, if you're still having difficulties. Also debug your script and see what needs to be parsed. Try to avoid parsing XML by using REs and EREs. Use an XML parser like `xmlstarlet`, instead. – Rany Albeg Wein Apr 24 '16 at 11:25

1 Answers1

1

one way to invetigate : provide -x option to bash so to have the trace of your script; that is change /bin/bash to /bin/bash -x in your script -or- simply invoke your script with

bash -x <yourscript>

You can also annotate your script with echo commands to track some variables.

Pierre G.
  • 4,346
  • 1
  • 12
  • 25