
NOTE: I tried very hard to ensure that this is not a duplicate, and that the question has not been asked for bash. As far as I can tell, it is not a duplicate.

Because I'm interested in all ways of doing this, and due to the emphasis on this being a git repo in my OP, I'm not going to require that answers use only SHELL BUILTIN COMMANDS, but will give preference to answers that use only shell builtins.

The script I included in the question is my best attempt to answer the question using purely shell script. If you have a better solution that uses only shell builtins, please answer with it.

Context

I have a large bash project with a complex directory structure. Let's set aside the issue of why I chose bash in the first place, instead of something easier to implement like python.

I use git for versioning the project, and the project should be able to be distributed and installed anywhere in user space and function properly. Due to this, I have these requirements:

Requirements

  1. I would like to be able to move various scripts into different directories without destroying or altering how they function. It would be OK to have to modify how the script is executed in this case (e.g. by using the correct, updated path from other scripts), but the internals of the script should not have to be rewritten after it was moved to a new directory, in order to ensure that it keeps working.

  2. I would like each script to source important variables from a file vars.sh in the project root, so that I don't have to edit these variables in every script if I decide to change the directory structure of the project. As such, each script will need to know where the project root is.

Examples of such variables: a variable containing the absolute path of a config directory, or the absolute path of a tmp directory, etc., within the project. It would be OK to have to rewrite vars.sh itself if the directory structure was changed, of course. But the idea is to only have to rewrite vars.sh (and call the scripts appropriately if they were moved).

Assumptions

  1. I'm using git, and the "project root" can be defined as git's project tree root, i.e. the directory with a .git directory.
  2. Any script shouldn't need to be told how many subdirectories down it is, within the script.

Discussion & Proposed Solution

As far as I can tell, requirement 2, that any script source variables and/or functions from the project root, by itself necessitates an answer to this post's main question. In my experience, a source command using relative pathnames, such as

source "../../vars.sh"

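fails with the error "No such file or directory" whenever the script is run from a working directory other than its own, because bash resolves a relative path given to source against the current working directory rather than the script's location. It also violates my Assumption 2 and Requirement 1.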

Note that a variable DIR created by the usual technique for finding the absolute path of a script's directory, using

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

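resolves to the directory of the file that contains the line, because BASH_SOURCE[0] names that file even when it is sourced. In a script, that is the script's own directory, which changes when the script is moved; in vars.sh, it is the project root, but a script must first locate vars.sh in order to source it. Thus, it is useful, but we still need to find the static absolute path of the project root.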

My idea for a solution is to include a function in each script that runs a while loop, moving up one directory per iteration (starting from DIR as computed above) and testing each directory for a .git directory. When the test succeeds, the loop breaks, and the directory containing .git is taken to be the project root and assigned to a variable. The function never examines a directory outside the user's home directory:

#!/bin/bash

# a script that implements a single function to determine the root directory of
# a project. The root directory is assumed to be the one containing a .git
# directory, and is required to be in a subdirectory of $HOME.

# get the absolute path of the directory containing this script
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

find_root () {
    # find the project's root directory, based on the existence of a .git
    # directory
    local current_dir test_dir parent_dir root_dir
    current_dir=$DIR
    test_dir=".git"

    # make sure we are somewhere in a subdirectory of the user's home directory
    # if we aren't, make a suggestion and exit 
    if [[ ! "$current_dir" =~ ^"${HOME}"/.+ ]]; then
        echo "$(basename "$0"): Detected that we are not within a subdirectory of
        the home directory ${HOME}.  Please install the project containing this
        script somewhere safer, in ${HOME}.  Exiting" >&2
        exit 1
    fi

    # otherwise, continue with the directory examination
    while true; do
        # if the test directory is found, assign the root directory and stop
        # checking directories by breaking out of the loop
        if [ -d "${current_dir}/${test_dir}" ]; then
            root_dir="${current_dir}"
            echo "Found the project root: ${root_dir}"
            break

        # otherwise, continue cd'ing upwards as long as the parent directory
        # isn't the user's home directory, at which point we stop checking and
        # exit the script with an error
        else
            parent_dir="$(dirname "$current_dir")"
            if [[ "${parent_dir}" == "${HOME}" ]]; then
                echo "Couldn't find a directory ${test_dir} in a subdirectory
                of ${HOME} in order to confirm the root directory of ${0}.
                Exiting" >&2
                exit 1

            # if we can cd to the parent directory, continue the search there.
            # remember to set the current directory to the parent directory;
            # simply cd'ing there doesn't have any effect, since this loop
            # evaluates the variable current_dir, not $(pwd)
            elif cd "$parent_dir"; then
                echo "Examining $parent_dir for $test_dir"
                current_dir=$parent_dir
                continue
            else
                echo "Couldn't cd to the parent directory. Exiting." >&2
                exit 1
            fi
        fi
    done

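    # mark ROOT_DIR read-only to prevent any subsequent assignment to it, and
    # export it for child processes. readonly is used instead of declare -r
    # because declare inside a function creates a local variable, which would
    # vanish when find_root returns
    echo "Setting ROOT_DIR to ${root_dir} as read only, and exporting ROOT_DIR"
    readonly ROOT_DIR="$root_dir"
    export ROOT_DIR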

    return
}

#execute the function
find_root
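
Since the question gives preference to builtins, it may be worth noting that the script above still calls dirname and basename, which are external commands in a default bash installation. The following is a minimal sketch of the same upward walk done with parameter expansion only, without cd. The function name find_project_root and its optional marker argument are hypothetical names chosen for illustration, and the sketch assumes the starting directory is passed in as an absolute path. It omits the check against $HOME that the script above performs.

```bash
#!/bin/bash

# find_project_root START_DIR [MARKER]   (hypothetical helper, not from the question)
# Print the nearest directory, starting at START_DIR and moving upwards, that
# contains MARKER (default: .git). Returns 1 if none is found and 2 if
# START_DIR is not an absolute path. Uses only shell builtins.
find_project_root() {
    local dir="$1"
    local marker="${2:-.git}"

    # require an absolute path; a relative one would never shrink to ""
    case "$dir" in
        /*) ;;
        *) return 2 ;;
    esac

    while [ -n "$dir" ]; do
        # -e rather than -d: in submodules and linked worktrees .git is a file
        if [ -e "$dir/$marker" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # strip the last path component: /a/b/c -> /a/b, and /a -> ""
        dir="${dir%/*}"
    done

    # the loop stops before testing the filesystem root, so test it here
    if [ -e "/$marker" ]; then
        printf '/\n'
        return 0
    fi
    return 1
}

# Possible usage in a script, with DIR computed as in the question:
#   ROOT_DIR="$(find_project_root "$DIR")" || exit 1
#   source "${ROOT_DIR}/vars.sh"
```

Printing the result and returning a status, instead of calling exit inside the function, leaves the decision about how to fail to the calling script.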

I chose not to "Answer your own question" because I would like to know your thoughts about:

  1. Pros and cons of this approach, or pitfalls if they exist
  2. Other solutions that fully answer the question and solve the problem, given the requirements
  3. Improvements to this approach

Note that, in my example above, the choice of testing for the .git directory to define the project root is arbitrary. It could easily be any file, e.g. .root_file, with custom permissions or attributes, so long as it can be evaluated by test.

Life5ign
  • How about [`git rev-parse --show-toplevel`](https://stackoverflow.com/a/957978/7976758) or [`git rev-parse --git-dir`](https://stackoverflow.com/a/958125/7976758)? Found in https://stackoverflow.com/search?q=%5Bgit%5D+find+root+directory – phd Sep 13 '20 at 23:38
  • @phd amazing, I didn't know `git` had that feature. That does the job pretty handily. I think this question may need clarification regarding whether the solution should be done with shell builtins or with external tools. – Life5ign Sep 13 '20 at 23:56
  • @Life5ign : Your `vars.sh` could walk upwards until it finds a directory which contains a `.git` subdirectory, and then get the absolute path of this directory (or whatever you need else for your project). – user1934428 Sep 14 '20 at 10:05
  • @user1934428 thanks, and to clarify: `vars.sh` is the file containing variables to be sourced, which exists in the project root. So it will not be walking (cd'ing upwards); rather, the scripts in subdirectories will be doing this, in order to find `vars.sh`. Also, the script I included is already doing what you suggested, after clarifying that `vars.sh` just contains the variables. – Life5ign Sep 14 '20 at 17:09

1 Answer


The absolute path of the working tree of any Git repository can be found with git rev-parse --show-toplevel. This will also be a canonical path: it will have all symlinks resolved, and on Windows, it will be otherwise canonical (e.g., resolving SUBST). It will also work even when the .git directory is located elsewhere (e.g., with GIT_DIR), when in a submodule (where .git may be a file), or when in a worktree.

If you know that your project will always be in a repository, then you can simply do this:

. "$(git rev-parse --show-toplevel)/vars.sh"
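
One caveat is that git rev-parse inspects the current working directory, so the line above finds the project only when the script is run from somewhere inside the repository. A possible refinement, sketched below, passes a starting directory to Git with its -C option and reports failure instead of sourcing a bogus path. The function name git_project_root is a hypothetical helper, not part of Git.

```bash
#!/bin/bash

# git_project_root [START_DIR]   (hypothetical helper name)
# Print the top-level directory of the Git working tree that contains
# START_DIR (default: the current directory). Returns non-zero when
# START_DIR is not inside a working tree or git cannot be run.
git_project_root() {
    git -C "${1:-.}" rev-parse --show-toplevel 2>/dev/null
}

# Possible usage at the top of a script anywhere in the project:
#   script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" || exit 1
#   root="$(git_project_root "$script_dir")" || {
#       echo "not inside a Git working tree" >&2
#       exit 1
#   }
#   . "${root}/vars.sh"
```

This relies on the git executable, so it does not meet the builtins-only preference stated in the question.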
bk2204