
I need to extend a given tool with a Bash script that is supposed to work on both Linux and macOS. The script takes two parameters:

  1. A repository location (file system, ssh, http(s), ...)
  2. A commit-ish, e.g. a branch, tag, or commit hash

I have no influence on the parameters.

The result of the script's run should be that

  • the repository is cloned to a fixed destination (always the same for one repository)
  • the repository's working tree corresponds to the latest state of the commit-ish (e.g. if it is a branch, the tip of that branch)

If the repository does not (yet) exist locally, the procedure is as simple as

git clone "$REPO_SOURCE" "$REPO_DIR"
cd "$REPO_DIR"
git checkout "$REPO_REF"

My question: Suppose a repository is already cloned to /repos/foo. After an obvious git fetch, how do I update that repository to the provided $REPO_REF?

  • If $REPO_REF is a branch, git checkout $REPO_REF && git pull should work
  • If it is a commit hash, no update is needed (just git checkout $REPO_REF?)
  • If it is a tag, the tag might have been moved on the origin. How do I handle this?
  • How to handle other edge cases?

Is there a simple reset-repository-to-this-commit-ish way, so that the repository behaves just as if it had been freshly cloned?

Side notes:

  • The same repository might be used with different commit-ishes, but only sequentially: it is guaranteed that the script is never invoked more than once at the same time
  • Any external changes to the repository may always be discarded without notification
  • While deleting and re-cloning the repository would work, it is impractical due to the size of the repositories, and it is an ugly solution
  • No (Git) changes are made in the clone, so checking out a detached HEAD is okay
muffel

2 Answers


The only totally-foolproof yet convenient way is to have the other Git (the one you might be cloning, but might not) resolve the name for you. Then you have a hash ID and a hash ID is universal.

If the name is a branch or tag name, you can use git ls-remote to achieve that step. If it might be some other formulation (e.g., master~13) you're out of luck here. So, if you need to resolve the name locally:

  • If tag discipline is obeyed, no tag will ever move. This means that if you have an existing clone that has the tag, it has the right tag, and you're OK here, and if you have an existing clone that doesn't have the tag, you can add the tag and resolve it.

  • If tag discipline is not obeyed, you'd have to delete and re-create the tags (yuck), or else re-invent remote tags: copy their refs/tags/* names to your refs/rtags/<remote>/* name-space. See the question "Git - Checkout a remote tag when two remotes have the same tag name".

  • If you have a branch name or something relative to a branch name, turn the branch name into your own remote-tracking name (e.g., replace master~13 with refs/remotes/origin/master~13) and resolve it.

In any case, you now have a hash ID and can use detached HEAD mode.
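As a rough sketch of the git ls-remote step, assuming the parameter is a branch name, a tag name, or a full 40-character hash (it does not cover things like master~13 or shortened hashes): the function names below are made up for illustration, and REPO_SOURCE and REPO_REF are the variables from the question. If a tag and a branch share a name, the tag wins here, which mirrors the order in the gitrevisions documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: resolve a branch or tag name by asking the remote.

# True if the argument is exactly 40 lowercase hex digits (a full SHA-1 hash ID).
is_full_hash() {
  [[ $1 =~ ^[0-9a-f]{40}$ ]]
}

# Read `git ls-remote` output (hash<TAB>refname per line) on stdin and print
# the hash for the given name. For an annotated tag the peeled "^{}" entry
# (the tagged commit) is preferred over the hash of the tag object itself.
pick_hash() {
  awk -v name="$1" '
    $2 == ("refs/tags/" name "^{}") { peeled = $1 }
    $2 == ("refs/tags/" name)       { tag = $1 }
    $2 == ("refs/heads/" name)      { branch = $1 }
    END {
      if (peeled != "")      print peeled
      else if (tag != "")    print tag
      else if (branch != "") print branch
      else exit 1            # name is neither a branch nor a tag on the remote
    }'
}

# Print the hash ID to check out, or fail if the remote does not know the name.
resolve_remote_ref() {
  local source=$1 ref=$2
  if is_full_hash "$ref"; then
    printf '%s\n' "$ref"     # taken as-is, not validated against the remote
  else
    git ls-remote "$source" | pick_hash "$ref"
  fi
}
```

With hash=$(resolve_remote_ref "$REPO_SOURCE" "$REPO_REF") in hand, a git fetch --force --tags origin followed by git checkout --detach "$hash" inside the existing clone should land on that commit, provided the commit is reachable from some fetched branch or tag.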

torek
  • I do really like the `git ls-remote` approach, thanks! Do you see any chance to get it to work with commit hashes as well (i.e. how to detect if the provided parameter is a valid git hash in origin to omit `git ls-remote` in favor of that *raw* value)? – muffel Mar 08 '19 at 09:00
  • Git assumes, at least for now, that anything that's *exactly* 40 characters long *and* all-hex-digits must be a raw hash ID. (I.e., even if you create a branch or tag name that has this form, the string is taken as a hash ID—even if it's not a valid one—at least for `git rev-parse`.) This might change with the switch to SHA-256, but might not. So at least for now you can run `git ls-remote`, collect the output, and match up the string, *unless* it's exactly `^[0-9a-f]{40}$` in (some) regex terms, in which case you just use it as is. – torek Mar 08 '19 at 09:13
  • Thanks for the update. I should have been more precise. Right now the use case depends on shortened (i.e. prefix) hashes as well. Is there any way to validate them with `ls-remote` or another mechanism against the remote as well? – muffel Mar 08 '19 at 11:23
  • @muffel: no. Shortened hashes are problematic: is `face2face` or `deadbeef` a hash, or a branch name? – torek Mar 08 '19 at 16:01
  • Well, I'd say it's the responsibility of the user to provide unambiguous values, and I'd consider it a hash if it's neither a tag nor a branch name – muffel Mar 08 '19 at 20:36
  • [The gitrevisions documentation](https://git-scm.com/docs/gitrevisions) tells you how Git deals with the ambiguity here. Something that *resembles* a hash ID, but shortened, *is* a hash ID *if it matches just one object in the entire repository database of hash ID keys*. The `ls-remote` command doesn't spill out the full set of keys: it only gives you those hash IDs that are represented by reference names (or more precisely, some exposed subset of reference names, but usually all). – torek Mar 08 '19 at 21:13

Using a "standard" git clone you could do this:

# cleanup old cruft
git reset --hard HEAD
git clean -fdx

# detach from current branch (if on any)
git checkout --detach
# delete all local branches
git for-each-ref --format="%(refname:strip=2)" refs/heads | xargs -r git branch -D
# fetch and update all remote refs and tags
git fetch --force --all --tags --prune --prune-tags
# checkout
git checkout "$COMMITISH"

That way you can rely on git checkout to do its job as usual and you don't need to replicate any of its heuristics, shortcuts etc.
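Wrapped up as a single function, the clone-or-reset decision could look roughly like the sketch below. sync_repo is a made-up name, and its three arguments correspond to the question's $REPO_SOURCE, $REPO_DIR and $REPO_REF. The branch deletion uses a read loop instead of xargs -r here, purely so the sketch does not depend on which xargs is installed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: clone on first use, otherwise reset the existing clone
# with the same sequence of commands as above, then check out the commit-ish.
sync_repo() {
  local source=$1 dir=$2 ref=$3
  if [ ! -d "$dir/.git" ]; then
    # First run: plain clone.
    git clone "$source" "$dir" || return
  else
    (
      cd "$dir" || exit
      # Throw away local modifications and untracked or ignored files.
      git reset --hard HEAD &&
      git clean -fdx &&
      # Detach so that every local branch can be deleted.
      git checkout --detach &&
      git for-each-ref --format='%(refname:strip=2)' refs/heads |
        while IFS= read -r branch; do git branch -D "$branch"; done &&
      # Refresh remote-tracking refs and tags, including moved or deleted ones.
      git fetch --force --all --tags --prune --prune-tags
    ) || return
  fi
  # Let git checkout resolve branch names, tags and hashes as usual.
  git -C "$dir" checkout "$ref"
}
```

It would be called as sync_repo "$REPO_SOURCE" "$REPO_DIR" "$REPO_REF". Note that --prune-tags needs a reasonably recent Git (it was added in 2.17).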

A.H.