4

Windows can treat file names in a folder as case-sensitive once this command has been run on that folder:

fsutil.exe file setCaseSensitiveInfo "C:\examplefolderpath" enable

However, the setting doesn't automatically apply to subfolders. There is a way to apply it to subfolders using PowerShell, but they have to exist first. Thus, I'm looking for a way to retrieve a Git repo's folder structure without downloading any of its files. Only after the folder structure is created and I've run the PowerShell command do I want to check out the files.

Is there a convenient way to do this using git commands alone?

If not, is there a way for PowerShell to retrieve the folder structure from a bare Git repository (created with git clone --bare) and set up the folder structure? (For a side project I'm working on, it would also be useful to know how to do this using Go, unless git commands alone can do it. But this isn't as important as knowing how to do it with PowerShell.)

Vopel
  • Git doesn't really "do" folders: instead, when it needs to create some file whose name is `path/to/file` it checks to see if `path/to` exists yet, and if not, makes the OS happy by creating `path/to` so that the OS can create a file named `file` in `path/to`, even though at the level of Git's index/staging-area, Git considers this a *file* named `path/to/file`, complete with forward slashes. Internally, as bk2204 notes, Git's *tree* data structures *do* record individual name components the way you'd like, so if it weren't for Git passing these through its index first, it would be easier. – torek Dec 15 '21 at 02:02
  • Having Git-for-Windows learn to do its own "set case sensitive" at the lowest level at which it *makes* a new directory would probably be a better path forward, though. It would be easy enough to add a config knob for this, and intercept the "make new directory" function. – torek Dec 15 '21 at 02:05
  • (Why is this tagged [tag:go] though?) – torek Dec 15 '21 at 02:09
  • @torek I mention in the last sentence that it'd be useful to have Go code that can do it if git commands can't, though it's less important than knowing how to do it with PowerShell. I debated with myself about adding the go tag, perhaps adding it was the wrong call. – Vopel Dec 15 '21 at 03:37

2 Answers

2

Git doesn't provide a way to check out only the directories and not the files. You have some options, though:

  • Use Git in WSL to create the repository; according to this article, the directories it creates will then automatically be made case sensitive.
  • Avoid running `git checkout` at first and find the directory hierarchy with `git ls-tree -rd HEAD` (or whatever revision you want instead of HEAD), then generate those directories, and only then run `git checkout`. However, note that PowerShell pipes are known to corrupt data passed through them, so this wouldn't be a good idea when working with Git.
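
Since the question also asks about Go, here is a minimal sketch of the second option that avoids the PowerShell pipeline entirely. It assumes the repository was cloned with `git clone --no-checkout`, that `git` and `fsutil.exe` are on the PATH, and that the process has whatever privileges `fsutil` requires. The helper names (`splitNUL`, `listTreeDirs`, `markCaseSensitive`) are made up for illustration, and the repository root itself is not handled here.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// splitNUL splits NUL-terminated `git ls-tree -z` output into path names.
func splitNUL(out []byte) []string {
	var names []string
	for _, field := range bytes.Split(out, []byte{0}) {
		if len(field) > 0 {
			names = append(names, string(field))
		}
	}
	return names
}

// listTreeDirs asks Git for every directory (tree) recorded in rev.
// -z keeps the names unquoted and NUL-terminated, so they are read as raw bytes.
func listTreeDirs(repo, rev string) ([]string, error) {
	cmd := exec.Command("git", "ls-tree", "-r", "-d", "-z", "--name-only", rev)
	cmd.Dir = repo
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("git ls-tree: %w", err)
	}
	return splitNUL(out), nil
}

// markCaseSensitive runs the fsutil command from the question on dir (Windows only).
func markCaseSensitive(dir string) error {
	return exec.Command("fsutil.exe", "file", "setCaseSensitiveInfo", dir, "enable").Run()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: pass the path of a clone made with --no-checkout")
		return
	}
	repo := os.Args[1]
	dirs, err := listTreeDirs(repo, "HEAD")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Create and flag each directory in the order Git reported it.
	for _, d := range dirs {
		full := filepath.Join(repo, filepath.FromSlash(d))
		if err := os.MkdirAll(full, 0o755); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		if err := markCaseSensitive(full); err != nil {
			fmt.Fprintln(os.Stderr, "fsutil:", full, err)
			os.Exit(1)
		}
	}
}
```

Once the program has finished, running `git checkout` inside the repository populates the prepared directories. Reading the `-z` output directly in Go sidesteps the text decoding that the PowerShell pipeline applies to external-program output.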

If you want Git for Windows to support this natively, you could go over to their issue tracker and ask for this to be added as a feature. I don't know how much work it would be, and I'm unable to find documentation for the API required, so it's unclear whether it could be reasonably implemented in Git.

bk2204
  • I do fully expect that and so will Git. It's how Unix has worked and how CMD worked, and the utility of sending raw bytes is well understood. Given the intentional decision to modify the data passing through the pipe, I think _corrupt_ is exactly the word I'm looking for. Just because the decision was intentional does not make it good. – bk2204 Dec 15 '21 at 02:22
  • Indeed, if PowerShell wants to do data conversions, it should have commands that do that: `cmd1 | convert | cmd2` for instance. – torek Dec 15 '21 at 02:42
  • @torek, the linked answer also links to [GitHub issue #1908](https://github.com/PowerShell/PowerShell/issues/1908), which indirectly supports your view. I invite you to show your support / contribute to the discussion there. – mklement0 Dec 15 '21 at 02:58
  • If the pipeline *cannot* present the data it received unaltered, it's corrupting the data. It's that simple. – jthill Dec 15 '21 at 03:10
  • @bk2204 I think this problem only occurs when an external program reads from a PowerShell command's stdin. Reading from git's stdin shouldn't be an issue. – Vopel Dec 15 '21 at 05:29
  • @mklement0 Fiddling with program output in pipelines as if it had some particular format is exactly the corruption we're talking about. It's unfortunate that there is *no* help for PowerShell's intrusiveness and meddling here. Not to put too fine a point on it, what you're saying doesn't help anyone either. Your given is not given. Git outputs what you ask it to output. `git show` of a binary file outputs the binary file. Which PowerShell will corrupt. – jthill Dec 23 '21 at 04:54
  • @mklement0 No, one cannot. You can't even know whether `git show :.gitignore` will work in powershell, or even `git ls-files`, because file names don't all have to be in the same text encoding. – jthill Dec 23 '21 at 05:37
  • @mklement0 [here's the list of MIME charsets](https://www.iana.org/assignments/character-sets/character-sets.xhtml). Email can be a multipart message with each part in a different charset. Powershell will corrupt such messages, because powershell's meddling where it has no business. `git show some.file` shows you the file. It's not Git's business to interpret or rerender it, it's not powershell's. Git doesn't. Powershell does. – jthill Dec 23 '21 at 07:59
  • @jthill, I've cleaned up my previous comments in favor of the summary in the next comment, which hopefully paints the full picture regarding the lack of raw byte support in PowerShell's pipeline. I've also asked for the [about_pipelines](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_pipelines) help topic to be amended, in [GitHub docs issue #8446](https://github.com/MicrosoftDocs/PowerShell-Docs/issues/8446). – mklement0 Dec 23 '21 at 15:57
  • PowerShell, as of v7.2, only ever communicates via _text_ with external programs in its _object_-based pipeline. External-program output is always decoded into .NET strings, line by line. Therefore, you can _not_ use the PowerShell pipeline (notably via `|` and `>`) when _raw byte data_ must be sent to / received from external programs - it will get corrupted. As a workaround, pass command lines to `cmd.exe /c` or `sh -c` that make use of _their_ `|` and `>` operators. See [this answer](https://stackoverflow.com/a/59118502/45375) for more information. – mklement0 Dec 23 '21 at 15:57
  • It's not just arbitrary data, it's even text in any encoding that isn't known in advance. I choose not to stay at hotels where the bellhops inspect and repack my luggage and damage anything they don't understand. – jthill Dec 23 '21 at 16:30
  • @jthill, text in an encoding not known in advance _is raw byte data_, by definition, and therefore covered by the summary above. (If, by contrast, you know what the encoding is, but it happens to differ from the default encoding, set `[Console]::OutputEncoding` accordingly, and it'll work fine). I think we've covered all angles now, and I am opting out of this conversation. – mklement0 Dec 23 '21 at 21:50
0

With thanks to bk2204's answer for pointing me in the right direction... after creating a bare repo with ``, I can create the directory structure with this function:

function Invoke-GitCloneCS {
  [Alias('igccs')]
  param([Parameter(Mandatory)] [string] $url)

  # Terminate function if user isn't admin
  if (${env:=::}) { 'Function must be run as administrator!'; return }

  # Normalize URL so the variable ends in .git (the dot is escaped so it matches literally)
  $url = ($url -replace '\.git$') + '.git'

  # Derive repo name from $url
  $repo = $url.Split('/')[-1] -replace '\.git$'

  git clone $url --no-checkout

  # Navigate to repo directory and apply setCaseSensitiveInfo to it
  # (Set-Location emits no output, so use $PWD rather than -PipelineVariable)
  Set-Location $repo
  fsutil.exe file setCaseSensitiveInfo $PWD.Path enable

  # Retrieve the folder structure
  git ls-tree --name-only -rd HEAD | ForEach-Object {
    # Create each folder listed, and apply setCaseSensitiveInfo to it
    $dir = New-Item -Path $_ -ItemType Directory
    fsutil.exe file setCaseSensitiveInfo $dir.FullName enable
  }

  git checkout
  Set-Location ..
}

Now, that isn't the complete command, I still need to work in all the

Vopel
  • It looks like you never finished your answer. Also, it's unclear how `${env:=::}` could act as a test if the function is running with elevation. – mklement0 Dec 23 '21 at 02:04
  • `Set-Location` produces no output by default, so `-pv D` won't populate variable `$D` - unless you add `-PassThru`. In general, you don't need `-pv` (`-PipelineVariable`) to simply pass output downstream in a pipeline - use [`ForEach-Object`](https://learn.microsoft.com/powershell/module/microsoft.powershell.core/foreach-object) (whose alias is `%`) instead, and refer to each input object via the [automatic `$_` variable](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Automatic_Variables#_); e.g. `'foo' | % { "[$_]" }` – mklement0 Dec 23 '21 at 02:16