1

I've made a manager for my dot files that creates sym links between each of the dot files in my dot_files/ dir and my $HOME.

I'm trying to write a script that deletes all of these specific sym links and no others.

Here's my O(n^2) solution that does the job fine.

delete_sym_links () {
    # Include dot (.) files while looping
    shopt -s dotglob

    for DOT_FILE in ~/dot_files/*;
    do
        if [ ! -d "$DOT_FILE" ];
        then
            for FILE in ~/*;
            do
                if [ -h "$FILE" ] && [ $(basename $DOT_FILE) = $(basename $FILE) ];
                then
                    rm "$FILE"
                    echo deleted "$FILE"
                fi
            done
        fi
    done
}

I'm trying to get the runtime down to O(n lg n). The bash is really tripping me up though.

Something like...

delete_sym_links () {
    SYM_LINKS=($(find $HOME -maxdepth 1 -type l -ls | sort -n))
    NUM_SYM_LINKS=$(find $HOME -maxdepth 1 -type l -ls | wc -l)
    DOT_FILES=$(find $HOME/dot_files/ -maxdepth 1 -name ".*" -type f | sort -n)
    NUM_DOT_FILES=$(find $HOME/dot_files/ -maxdepth 1 -name ".*" -type f | wc -l)

    i=0
    j=0

    while (("$i" < "$NUM_SYM_LINKS")) && (("$j" < "$NUM_DOT_FILES"));
    do
        if [ $(basename ${SYM_LINKS[$i]}) = $(basename ${DOT_FILES[$j]}) ];
        then
            echo removing sym link ${SYM_LINKS[$i]}
            rm ${SYM_LINKS[$i]}
            ((j++))
        else
            ((i++))
        fi
    done

    echo "All dot_files sym links removed"
}

h.and.h
  • 690
  • 1
  • 8
  • 26
  • 1
    Why are you looping over every file in your home directory when you know what the name should be? Essentially you could do what you were already doing in your n^2 version, but just set $FILE to `~/$(basename $DOT_FILE)`. – Tyler Marshall Apr 03 '19 at 16:11
  • Even ignoring the big-O complexity, your constant factors are going to kill you on some of this (for any significantly-sized `n`) as the code is currently written -- `$(...)` is itself a slow operation, and running external commands (like basename) pays a large penalty as well. – Charles Duffy Apr 03 '19 at 19:25
  • Re: `arr=( $(find ...) )`, see [BashPitfalls #50](http://mywiki.wooledge.org/BashPitfalls#hosts.3D.28_.24.28aws_....29_.29) -- that's not just a practice with performance issues but major correctness problems (a filename with spaces will be split into multiple array elements; a filename that can be expanded as a glob *will* be, and the results subject to `globfail` or `nullglob` behaviors if enabled; etc); better practices are discussed in [UsingFind](http://mywiki.wooledge.org/UsingFind). – Charles Duffy Apr 03 '19 at 19:27
  • 1
    ...in particular, consider `readarray -t -d '' arr < <(find ... -print0)`. – Charles Duffy Apr 03 '19 at 19:29

2 Answers2

4

Try this Shellcheck-clean, O(n), code:

function delete_symlinks
{
    local df_path df_name home_df_path

    for df_path in ~/dot_files/.* ; do
        [[ -d $df_path ]] && continue
        df_name=${df_path##*/}
        home_df_path=~/$df_name
        if [[ -L $home_df_path ]] ; then
            rm -- "$home_df_path"
            printf 'deleted %s\n' "$home_df_path"
        fi
    done

    return 0
}
pjh
  • 6,388
  • 2
  • 16
  • 17
1

bash has associative arrays. We can store a unique identifier for each dotfile and then check the home directory files for a match. I believe this is O(m+n) compared to your original O(m*n).

Here's an example using inode:

delete_sym_links () (
    declare -A inode
    shopt -s dotglob nullglob

    ls -diL ~/dot_files/* | (
        while read i name; do
            [ ! -f "$name" ] && continue
            inode2name[$i]="$name"
        done
        for name in ~/*; do
            [ ! -h "$name" ] && continue

            read i junk <<<$( ls -diL "$name" )
            if [ "$junk" != "$name" ]; then
                echo 1>&2 "ls parse error! ($name)"
            fi
            if [ -n "${inode2name[i]}" ]; then
                echo "found: $name -> ${inode2name[$i]}"
                echo "rm \"$name\""
            fi
        done
    )
)
  • As ~/dot_files/ is a subdirectory of ~/, I have assumed they are on the same filesystem so that inode can be used as the unique id.
  • Parsing ls (in the while loop) is frowned upon but is a portable way to extract the inode as long as we don't have strange filenames (eg. with embedded newlines). We only need POSIX options.
  • We call ls rather a lot in the for loop so it could have a big effect on the constant factor of the complexity, but it's only on possible matches so perhaps it's not so bad.
  • Using fn()() instead of more common fn(){} localises the shopt

If we don't like parsing ls but we have realpath, we can do:

delete_sym_links () (
    declare -A real
    shopt -s dotglob nullglob

    for name in ~/dot_files/*; do
        [ ! -f "name" ] && continue
        real="$(realpath "$name")"
        real2name["$real"]="$name"
    done

    for name in ~/*; do
        [ ! -h "$name" ] && continue
        real="$(realpath "$name")"
        if [ -n "${real2name["$real"]}" ]; then
            echo 1>&2 "found: $name -> ${real2name["$real"]}"
            echo "rm \"$name\""
        fi
    done
)
jhnc
  • 11,310
  • 1
  • 9
  • 26