
I'm trying to find information on whether git's various repository URL protocols (http / https / git / ssh) all have standards for specifying subdirectories.

For example a subdirectory in HTTP(S) is like this:

full: `https://stackoverflow.com/questions/asdf`
base: `https://stackoverflow.com`
subdirectory: `/questions/asdf`

Do the other protocols all also have an equivalent? For example is this "valid"?

full: `https://github.com/Foo/bar/another`
base: `https://github.com`
subdirectory: `Foo/bar/another`

full: `git@github.com:Foo/bar/another`
base: `git@github.com`
subdirectory: `Foo/bar/another`

full: `git://github.com:Foo/bar/another`
base: `git://github.com`
subdirectory: `Foo/bar/another`

(I'm not sure what the real syntax for the examples above is; they are just an illustration of what a good answer to this question should contain.)

Disclaimer

This question is specifically about whether the listed protocols support a subdirectory syntax. It is not about whether these URLs are cloneable via git (there are already several questions on Stack Overflow about that). This question is about what is valid URI syntax.

Also, if you have links to relevant documentation, that'd be much appreciated.

ColinKennedy
  • URL syntax is [documented on Wikipedia](https://en.wikipedia.org/wiki/URL); see also [URI](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier). Git uses an "extended" URL syntax: see, e.g., [the `git fetch` documentation](https://git-scm.com/docs/git-fetch). – torek Apr 25 '22 at 00:44
  • That git fetch documentation is great, thank you for sharing. Just read through it now - it looks like subdirectories aren't mentioned. Can I take it to mean they aren't supported? For example they mention `git://host.xz[:port]/path/to/repo.git/`. `path/to/repo.git/` is a directory leading to a headless git repository. However the page doesn't explain how you might express "a directory within the headless repository". – ColinKennedy Apr 25 '22 at 01:29
  • The `git://` protocol is mostly dead these days, but *all* (non-Git-extended) URLs consist of a scheme (e.g., https://) followed by some other part. The interpretation of the remaining part depends on the scheme. For https://, the next part is the host name, with optional user name and password, and then everything after a subsequent slash is merely passed *to* the other host, which does with it as it sees fit. For ssh we have something very similar: ssh://[user@]host[:port]/rest, and the "rest" part is passed to the host's sshd, which does whatever the hell it wants with that. – torek Apr 25 '22 at 02:49
  • So, for ssh and https, you're depending on the *receiving host*. What happens with the remainder of the URL is entirely up to that host. It's pretty damned common, though, for Unix/Linux-like hosts to take the rest as a conventional path name, rooted with respect to something (the http server public dir for instance, or the user's home dir with ssh). – torek Apr 25 '22 at 02:50
  • If you want to know whether `https://host/long/path/to/dir` works, then, you *must ask the host*. The answer is not determinable without knowing what the host does. GitHub, as a hosting service, takes the next part to be a user or organization, and everything after that is the repository name, which may or may not have slashes in it and may or may not be restricted however GitHub would like. – torek Apr 25 '22 at 02:52
  • "*https://stackoverflow.com/questions/asdf*" Sorry, man, that's meaningless. StackOverflow doesn't map URLs to a filesystem, it maps URLs to its database. There are no directories and subdirectories; there are questions, answers and comments. The bottom line is: URL interpretation is entirely up to the host. Some hosts map URLs to a filesystem but most don't; they map URLs to their databases. – phd Apr 25 '22 at 03:47
  • Thank you very much for your replies. It's clear that the only ground truth I can rely on is the server, so any syntax for "subdirectories" would be invalid. I'll go with a simpler, different route so I don't have to deal with these gotchas. – ColinKennedy Apr 27 '22 at 04:21
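
The split described in the comments above can be sketched in a few lines of POSIX shell. This is only an illustration under stated assumptions: `split_git_url` is a hypothetical helper, not part of Git, and it only separates the `scheme://[user@]host[:port]` part from the remainder (or `[user@]host` from the remainder in the scp-like form). It says nothing about what the remainder means, since that is decided by the receiving host.

```shell
#!/bin/sh
# split_git_url URL
# Hypothetical helper (not part of Git): prints "<base> <rest>" on one line.
# <rest> is opaque; only the receiving host decides how to interpret it.
split_git_url() {
    url=$1
    case $url in
        *://*)
            # URL form: scheme://[user@]host[:port]/rest
            scheme=${url%%://*}
            after=${url#*://}
            authority=${after%%/*}
            case $after in
                */*) rest=/${after#*/} ;;
                *)   rest= ;;
            esac
            printf '%s %s\n' "$scheme://$authority" "$rest"
            ;;
        *:*)
            # scp-like form: [user@]host:rest, recognized only when
            # no slash appears before the first colon.
            before=${url%%:*}
            case $before in
                */*) printf '%s %s\n' "" "$url" ;;   # treated as a local path
                *)   printf '%s %s\n' "$before" "${url#*:}" ;;
            esac
            ;;
        *)
            # No scheme and no colon: a local path, so the base is empty.
            printf '%s %s\n' "" "$url"
            ;;
    esac
}

split_git_url 'https://github.com/Foo/bar/another'
split_git_url 'git@github.com:Foo/bar/another'
```

The two calls print `https://github.com /Foo/bar/another` and `git@github.com Foo/bar/another`. Note that in the `scheme://` forms a colon after the host introduces a port, so `git://github.com:Foo/bar/another` would place `Foo` in the port position rather than in the path.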

1 Answer

You can only clone a repository, not a subfolder of a repository.

You could do a minimal clone and then use `git sparse-checkout` to select the right subfolder:

# Fastest clone possible:
git clone --filter=blob:none --no-checkout https://github.com/my/repo
cd repo

# Disable cone mode in .git/config.worktree
git config core.sparseCheckoutCone false

# Disable any existing sparse-checkout setting (core.sparseCheckout)
git sparse-checkout disable

# Add the expected pattern, to include just a subfolder without top files:
git sparse-checkout set /mySubFolder/

# populate working-tree with only the right files:
git read-tree -mu HEAD

But all of that begins with a URL representing the full repository.

VonC