Creating a which command in bash script

Question

For an assignment, I'm supposed to create a script called my_which.sh that will "do the same thing as the Unix command, but do it using a for loop over an if." I am also not allowed to call which in my script.

I'm brand new to this, and have been reading tutorials, but I'm pretty confused on how to start. Doesn't which just list the path name of a command?

If so, how would I go about displaying the correct path name without calling which, and while using a for loop and an if statement?

For example, if I run my script, it will echo % and wait for input. But then how do I translate that to finding the directory? So it would look like this?

#!/bin/bash
path=(`echo $PATH`)
echo -n "% "
read ans
for i in $path
do
    if [ -d $i ]; then
       echo $i
    fi
done

I would appreciate any help, or even any starting tutorials that can help me get started on this. I'm honestly very confused on how I should implement this.

See the answer for the "if". Hope that helps. – Khanna111 Feb 21 '15 at 08:32

gniourf_gniourf · Accepted Answer · 2015-02-21T15:50:41.057

Split your PATH variable safely. This is a general method to split a string at delimiters, that is 100% safe regarding any possible characters (including newlines):
```
IFS=: read -r -d '' -a paths < <(printf '%s:\0' "$PATH")
```
The printf format appends an extra : because a PATH that ends with a : is understood to include the current directory. While this is dangerous and not recommended, we must account for it if we want to mimic which. Without this extra colon, a PATH like /bin:/usr/bin: would be split into
```
declare -a paths='( [0]="/bin" [1]="/usr/bin" )'
```
whereas with the extra colon the resulting array is:
```
declare -a paths='( [0]="/bin" [1]="/usr/bin" [2]="" )'
```
This is one detail that other answers miss. Of course, we'll do this only if PATH is set and non-empty.
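
To see the difference for yourself, here is a minimal sketch that runs both splits on a sample string. The names sample_path, without_colon and with_colon are made up for the demo and are not part of the script below.

```bash
#!/bin/bash
# Sketch: compare the split with and without the extra colon.
# sample_path is a made-up value, not the real PATH.
sample_path='/bin:/usr/bin:'

# Without the extra colon: the trailing empty entry is lost.
IFS=: read -r -d '' -a without_colon < <(printf '%s\0' "$sample_path")

# With the extra colon: the trailing empty entry survives.
IFS=: read -r -d '' -a with_colon < <(printf '%s:\0' "$sample_path")

printf 'without extra colon: %d entries\n' "${#without_colon[@]}"
printf 'with extra colon:    %d entries\n' "${#with_colon[@]}"
```

This should report 2 entries for the first split and 3 for the second, matching the two arrays shown above.
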
With this split PATH, we'll use a for loop to check whether the argument can be found in each directory. Note that this should be done only if the argument doesn't contain a / character; this is also something other answers missed.
My version of which handles a single option, -a, that prints all matching pathnames of each argument. Otherwise, only the first match is printed. We'll have to take this into account too.
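
The slash rule can be sketched in isolation as follows. The helper name needs_path_search is hypothetical and only serves the illustration; the full script uses the same pattern match inline.

```bash
#!/bin/bash
# Hypothetical helper (not from the answer): succeeds when a PATH search
# applies, i.e. when the argument contains no slash.
needs_path_search() {
    [[ $1 != */* ]]
}

for arg in ls ./ls /bin/ls; do
    if needs_path_search "$arg"; then
        printf '%s: search each PATH entry\n' "$arg"
    else
        printf '%s: test this pathname directly\n' "$arg"
    fi
done
```

Only the bare name ls takes the PATH-search branch; the two arguments containing a slash are tested as given.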

My version of which handles the following exit status:

   0      if all specified commands are found and executable

   1      if one or more specified commands is nonexistent or not executable

   2      if an invalid option is specified

We'll handle that too.

I guess the following mimics rather faithfully the behavior of my which (and it's pure Bash):

#!/bin/bash

show_usage() {
    printf 'Usage: %s [-a] args\n' "$0"
}

illegal_option() {
    printf >&2 'Illegal option -%s\n' "$1"
    show_usage
    exit 2
}

check_arg() {
    if [[ -f $1 && -x $1 ]]; then
        printf '%s\n' "$1"
        return 0
    else
        return 1
    fi
}

# manage options

show_only_one=true

while (($#)); do
    [[ $1 = -- ]] && { shift; break; }
    [[ $1 = -?* ]] || break
    opt=${1#-}
    while [[ $opt ]]; do
        case $opt in
            (a*) show_only_one=false; opt=${opt#?} ;;
            (*) illegal_option "${opt:0:1}" ;;
        esac
    done
    shift
done

# If no arguments left or empty PATH, exit with return code 1
(($#)) || exit 1
[[ $PATH ]] || exit 1

# split path
IFS=: read -r -d '' -a paths < <(printf '%s:\0' "$PATH")

ret=0
# loop on arguments
for arg; do
    # Check whether arg contains a slash
    if [[ $arg = */* ]]; then
        check_arg "$arg" || ret=1
    else
        this_ret=1
        for p in "${paths[@]}"; do
            if check_arg "${p:-.}/$arg"; then
               this_ret=0
               "$show_only_one" && break
            fi
        done
        ((this_ret==1)) && ret=1
    fi
done

exit "$ret"

To test whether an argument is executable or not, I'm checking whether it's a regular file¹ which is executable with:

[[ -f $arg && -x $arg ]]

I guess that's close to my which's behavior.

¹ As @mklement0 points out (thanks!) the -f test, when applied against a symbolic link, tests the type of the symlink's target.
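
A quick way to convince yourself of the footnote is a throwaway experiment like the one below. It is only a sketch: the file names are invented, and everything lives in a temporary directory created with mktemp.

```bash
#!/bin/bash
# Sketch: [[ -f ]] looks at a symlink's target, [[ -h ]] at the link itself.
# All file names below are made up for the demo.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT   # clean up when the shell exits

printf '#!/bin/sh\n' > "$tmpdir/real_tool"
chmod +x "$tmpdir/real_tool"
ln -s "$tmpdir/real_tool" "$tmpdir/link_to_tool"   # symlink to a file
ln -s "$tmpdir" "$tmpdir/link_to_dir"              # symlink to a directory

for f in real_tool link_to_tool link_to_dir; do
    if [[ -f $tmpdir/$f && -x $tmpdir/$f ]]; then
        printf '%s: accepted (executable regular file)\n' "$f"
    else
        printf '%s: rejected\n' "$f"
    fi
done
```

The symlink to the file is accepted just like the file itself, while the symlink to the directory is rejected because its target is not a regular file.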

Nicely done; didn't know that about the `$PATH` variable ending in `:` - for the sake of completeness, let me add that a sequence of `::` _inside_ the `$PATH` variable has the same effect [and, as you just told me, so does a _leading_ colon] (both of which your answer also covers). `[[ ( -f $arg || -h $arg ) && -x $arg ]]` can be simplified to `[[ -f $arg && -x $arg ]]`, because when bash tests a symlink with `-f`, it helpfully tests the type of the symlink's _target_, so a symlink to a _file_ also reports true with `-f`. — mklement0, Feb 21 '15 at 16:11
My pleasure. Another point of interest is that `find`'s `-type f` does NOT work the same way: it only reports true for regular files, and, sadly, `-type l` won't let you distinguish between a symlink to a file and a directory. (I've recreated my comment to fix formatting and incorporate your leading-colon remark re `$PATH`). — mklement0, Feb 21 '15 at 16:12
You live and learn: just realized that using option `-L` with `find` _does_ make it behave the same way as bash's `-f` operator, i.e., `find -L ... -type f` finds both regular files and symlinks to files. — mklement0, Feb 22 '15 at 16:56
@mklement0 Oh, nice find! (pun intended) [and it's documented by POSIX too](http://pubs.opengroup.org/onlinepubs/009695399/utilities/find.html): _Cause the file information and file type evaluated for each symbolic link to be those of the file referenced by the link, and not the link itself._ — gniourf_gniourf, Feb 22 '15 at 17:47
One for the road: I don't think the `\0` in `printf '%s:\0' "$PATH"` is needed (it's effectively ignored) - or am I missing something? — mklement0, Feb 22 '15 at 19:03
@mklement0 you're correct, it's not needed. I just find it cleaner, since with this trailing `\0`, `read` is happy and returns 0; otherwise it's not happy and returns 1. That's the only reason I add a trailing `\0` when I use `read -d ''`. — gniourf_gniourf, Feb 22 '15 at 19:05
Thank you for your answer, I appreciate it! I've been trying to go over bash tutorials now for the past day, attempting to understand the code you posted. I've figured out most of it (ish), but was wondering: 1. is `check_arg()` actually the one printing out to terminal the result? 2. When we test `check_arg()` what does the `"${p:-.}/$arg"` do? — Alex, Feb 23 '15 at 00:07
@Alex: 1. Yes! it's its `printf '%s\n' "$1"` line that does that. 2. If `p` is empty (this happens when there's `::` in `PATH`, or leading or trailing `:`) then `${p:-.}` expands to `.` (a single period); otherwise, `${p:-.}` expands to the expansion of `p`. You can read about it in the [Shell Parameter Expansion](http://www.gnu.org/software/bash/manual/bashref.html#Shell-Parameter-Expansion) section of the reference (it's one of the first expansions mentioned). — gniourf_gniourf, Feb 23 '15 at 00:18
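
The `${p:-.}` expansion discussed in the last two comments can be tried on its own. This is a small sketch with sample values; the directory list and the command name are assumptions chosen for the demo.

```bash
#!/bin/bash
# Sketch: how ${p:-.} maps an empty PATH entry to the current directory.
# The entries and the command name are sample values.
arg=ls
candidates=()
for p in /usr/bin '' /bin; do
    candidates+=("${p:-.}/$arg")
done
printf '%s\n' "${candidates[@]}"
```

This prints /usr/bin/ls, ./ls and /bin/ls, one per line: the empty entry became `.`, and the other two were left alone.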

rmccabe3701 · Answer 2 · 2015-02-21T06:28:42.097


#!/bin/bash

#Get the user's first argument to this script    
exe_name=$1

#Set the field separator to ":" (this is what the PATH variable
# uses as its delimiter), then read the contents of the PATH
# into the array variable "paths" -- at the same time splitting 
# the PATH by ":"
IFS=':' read -a paths <<< $PATH 

#Iterate over each of the paths in the "paths" array
for e in ${paths[*]}
do
    #Check for the $exe_name in this path
    find $e -name $exe_name -maxdepth 1
done

answered Feb 21 '15 at 05:53 by rmccabe3701 · edited Feb 21 '15 at 06:28

Thank you for the answer! For the sake of learning, am I getting this right? `exe_name=$1` sets the argument passed in as the exe name `IFS=':' read -a paths <<< $PATH` , I'm guessing we read in the path name from PATH? although I'm not sure what `<<<` does exactly.. and why we need to append ':' to it. Finally, we loop through each path and find the name of it?? Starting to get a little more confused. – Alex Feb 21 '15 at 06:23

You don't need the for loop: `find $paths -name $exe_name -maxdepth 1` – Diego Torres Milano Feb 21 '15 at 06:27
One more question, why would my professor tell me to use an if statement inside the for loop? What would I even be checking for in this case? – Alex Feb 21 '15 at 07:27

Perhaps he wants to use an if to see if the file exists and has the execute permission. – Khanna111 Feb 21 '15 at 08:13

A few suggestions to improve your script: **use more quotes**. Also, while `IFS=: read -a paths <<< "$PATH"` (with quotes) works 99% of the time, it's better to do a splitting like so: `IFS=: read -r -d '' -a paths < <(printf '%s\0' "$PATH")`. Your `for` loop should read `for e in "${paths[@]}"; do` (with quotes and `@` instead of `*`). Oh, and use more quotes. Your `find` could also include the `-executable` flag or, if not available `-exec test -x {} \;`. – gniourf_gniourf Feb 21 '15 at 10:35
A clean, albeit a little heavy-handed approach (invocation of `find` for each dir. in the path); marred by quoting issues, as pointed out by @gniourf_gniourf. To make the `find` command robust, you not only need to check for the executable bit, but you also need to exclude _directories_. Try `find -L "$e" -maxdepth 1 -type f -perm -u=x -name "$exe_name"`. – mklement0 Feb 22 '15 at 20:07
@dtmilano: Good point, but - quoting issues and restricting results to executable files aside - you probably meant `find "${paths[@]}" -name "$exe_name" -maxdepth 1` - just `$paths` will only expand to the array's _first_ element. – mklement0 Feb 23 '15 at 13:40
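
Putting the suggestions from these comments together, a loop-based version might look like the sketch below. It is not the answerer's code: the function name find_in_path and its optional second parameter are hypothetical, and mapping empty entries to `.` is borrowed from the accepted answer.

```bash
#!/bin/bash
# Sketch only: find_in_path is a hypothetical name, not from the answer.
# Usage: find_in_path NAME [PATH_STRING]   (PATH_STRING defaults to $PATH)
find_in_path() {
    local exe_name=$1 path_string=${2-$PATH} paths e
    # Split at ':' only; the extra ':' preserves a trailing empty entry.
    IFS=: read -r -d '' -a paths < <(printf '%s:\0' "$path_string")
    for e in "${paths[@]}"; do
        # -L follows symlinks, -type f excludes directories,
        # -perm -u=x requires the owner-execute bit.
        # An empty entry stands for the current directory.
        find -L "${e:-.}" -maxdepth 1 -type f -perm -u=x \
            -name "$exe_name" 2>/dev/null
    done
}
```

Calling `find_in_path sh` would then list every matching file along the current PATH. Keep in mind that `-name` treats its argument as a pattern, so a name containing characters such as `*` would match more than the literal name.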

score 0 · Answer 3 · edited Feb 21 '15 at 10:40


This is similar to the accepted answer with the difference that it does not set the IFS and checks if the execute bits are set.

#!/bin/bash
for i in $(echo "$PATH" | tr ":" "\n")
do
    find "$i" -name "$1" -perm +111  -maxdepth 1
done

Save this as my_which.sh (or some other name) and run it as ./my_which.sh java, for example.

However if there is an "if" required:

#!/bin/bash
for i in $(echo "$PATH" | tr ":" "\n")
do
    # this is a one liner that works. However the user requires an if statement
    # find "$i" -name "$1" -perm +111  -maxdepth 1

    cmd=$i/$1
    if [[ (  -f "$cmd"  ||  -L "$cmd" ) && -x "$cmd"  ]] 
    then
        echo "$cmd"
        break
    fi 
done

You might want to take a look at this link to figure out the tests in the "if".

answered Feb 21 '15 at 05:52 by Khanna111 · edited Feb 21 '15 at 10:40 by gniourf_gniourf


I tried to fix your code by adding a few quotes here and there. Though, `for i in $(echo "$PATH" | tr : \\n)` is broken as it's subject to pathname expansion. – gniourf_gniourf Feb 21 '15 at 10:41
Assuming you're not worried about unwanted word splitting and pathname expansion, `$(echo "$PATH" | tr ":" "\n")` can be simplified to `${PATH//:/ }`. `( -f "$cmd" || -L "$cmd" ) && -x "$cmd"` can be simplified to `-f $cmd && -x $cmd`, because in the case of symlinks bash helpfully tests the type of the symlink's _target_. The `find` command would accidentally find _directories_ as well. `which` finds executables based on whether they're executable by the _current_ user, so `-perm -u=x` will do. To address these 2 issues, use `find -L "$i" -maxdepth 1 -type f -perm -u=x -name "$1"`. – mklement0 Feb 22 '15 at 19:58
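
The word-splitting problem raised in these comments is easy to reproduce. The sketch below uses a made-up sample_path with a space in one entry and compares the unquoted command substitution with a split at `:` only. Pathname expansion is not demonstrated here, since its result would depend on the files present.

```bash
#!/bin/bash
# Sketch: why looping over an unquoted $(...) is fragile.
# sample_path is a made-up value with a space in its first entry.
sample_path='/opt/my tools:/bin'

# Fragile: the unquoted substitution is split at spaces as well as newlines.
fragile=()
for i in $(echo "$sample_path" | tr ':' '\n'); do
    fragile+=("$i")
done

# Safer: split at ':' only, as in the accepted answer.
IFS=: read -r -d '' -a safe < <(printf '%s:\0' "$sample_path")

printf 'fragile loop saw %d entries\n' "${#fragile[@]}"
printf 'safe split saw   %d entries\n' "${#safe[@]}"
```

The fragile loop sees 3 entries because "/opt/my tools" is broken in two, whereas the safe split sees the intended 2.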

score 0 · Answer 4 · edited May 23 '17 at 12:12

For a complete, rock-solid implementation, see gniourf_gniourf's answer.

Here's a more concise alternative that makes do with a single invocation of find [per name to investigate].

(The OP later clarified that an if statement should be used in a loop, but the question is general enough to warrant considering other approaches.)

A naïve implementation would even work as a one-liner, IF you're willing to make a few assumptions (the example uses 'ls' as the executable to locate):

find -L ${PATH//:/ } -maxdepth 1 -type f -perm -u=x -name 'ls' 2>/dev/null

The assumptions - which will hold in many, but not all situations - are:

$PATH must not contain entries that when used unquoted result in shell expansions (e.g., no embedded spaces that would result in word splitting, no characters such as * that would result in pathname expansion)
$PATH must not contain an empty entry (which must be interpreted as the current dir).

Explanation:

-L tells find to investigate the targets of symlinks rather than the symlinks themselves - this ensures that symlinks to executable files are also recognized by -type f
${PATH//:/ } replaces all : chars. in $PATH with a space each, causing the result - due to being unquoted - to be passed as individual arguments split by spaces.
-maxdepth 1 instructs find to only look directly in each specified directory, not also in subdirectories
-type f matches only files, not directories.
-perm -u=x matches only entries whose owner (u) has execute (x) permission; this approximates "executable by the current user" and is exact only for files the current user owns.
2>/dev/null suppresses error messages that may stem from non-existent directories in the $PATH or failed attempts to access files due to lack of permission.
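
The substitution at the heart of the one-liner, and the second assumption it relies on, can be checked with a short sketch. The variable names and sample values are invented for the demo.

```bash
#!/bin/bash
# Sketch: what ${var//:/ } produces. The sample values are made up.
sample_path='/usr/local/bin:/usr/bin:/bin'
spaced=${sample_path//:/ }
printf '%s\n' "$spaced"

# Left unquoted, the result is split into one word per directory.
dirs=($spaced)
printf '%d directories\n' "${#dirs[@]}"

# An empty entry (meaning the current directory) silently disappears.
with_empty='/bin::/usr/bin'
lossy=(${with_empty//:/ })
printf '%d directories survive from %s\n' "${#lossy[@]}" "$with_empty"
```

The first part yields 3 directories. The second yields only 2, which is why the one-liner cannot honor empty PATH entries.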

Here's a more robust script version:

Note:

For brevity, only handles a single argument (and no options).
Does NOT handle the case where entries or result paths may contain embedded \n chars - however, this is extremely rare in practice and likely leads to bigger problems overall.

#!/bin/bash

# Assign argument to variable; error out, if none given.
name=${1:?Please specify an executable filename.}

# Robustly read individual $PATH entries into a bash array, splitting by ':'
# - The additional trailing ':' ensures that a trailing ':' in $PATH is
#   properly recognized as an empty entry - see gniourf_gniourf's answer.
IFS=: read -r -a paths <<<"${PATH}:"

# Replace empty entries with '.' for use with `find`.
# (Empty entries imply '.' - this is legacy behavior mandated by POSIX).
for (( i = 0; i < "${#paths[@]}"; i++ )); do
  [[ "${paths[i]}" == '' ]] && paths[i]='.'
done

# Invoke `find` with *all* directories and capture the 1st match, if any, in a variable.
# Simply remove `| head -n 1` to print *all* matches.
match=$(find -L "${paths[@]}" -maxdepth 1 -type f -perm -u=x -name "$name" 2>/dev/null |
        head -n 1)

# Print result, if found, and exit with appropriate exit code.
if [[ -n $match ]]; then
  printf '%s\n' "$match"
  exit 0
else
  exit 1
fi
