57

I'm on US-English OS X 10.6.4 and am trying to store files with Asian characters in their names in a Git repository.

OK, let's create such a file in a Git working tree:

$ touch どうもありがとうミスターロボット.txt

Git shows it in octal-escaped UTF-8 form:

$ git version
git version 1.7.3.1
$ git status
# On branch master
#
# Initial commit
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   "\343\201\250\343\202\231\343\201\206\343\202\202\343\201\202\343\202\212\343\201\213\343\202\231\343\201\250\343\201\206\343\203\237\343\202\271\343\202\277\343\203\274\343\203\255\343\203\233\343\202\231\343\203\203\343\203\210.txt"
nothing added to commit but untracked files present (use "git add" to track)

Unfortunately, I'm not able to add it to the Git repository:

$ git add どうもありがとうミスターロボット.txt
$ git status
# On branch master
#
# Initial commit
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   "\343\201\250\343\202\231\343\201\206\343\202\202\343\201\202\343\202\212\343\201\213\343\202\231\343\201\250\343\201\206\343\203\237\343\202\271\343\202\277\343\203\274\343\203\255\343\203\233\343\202\231\343\203\203\343\203\210.txt"
nothing added to commit but untracked files present (use "git add" to track)

Git simply ignored this file.

Using a wildcard works:

$ git add *.txt
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#   new file:   "\343\201\250\343\202\231\343\201\206\343\202\202\343\201\202\343\202\212\343\201\213\343\202\231\343\201\250\343\201\206\343\203\237\343\202\271\343\202\277\343\203\274\343\203\255\343\203\233\343\202\231\343\203\203\343\203\210.txt"
#

However, I want to invoke the Git command from an application for a specific file name. I can't invent wildcard patterns that match exactly this file and no other.

Is this a known bug in Git, or am I using Git incorrectly?

robert
Mot
  • 3
    I think this is a known bug between Git and OS X: http://thread.gmane.org/gmane.comp.version-control.git/70688 – Vincent Demeester Nov 10 '10 at 12:50
  • I don't think that it is related to composed/decomposed characters like German umlauts. – Mot Nov 10 '10 at 15:54
  • @mklhmnn: Are you sure that none of the characters in your example have both decomposed and precomposed forms? – JeremyP Nov 11 '10 at 11:50
  • 2
    Given that the first two characters in the octal string above are U+3068 HIRAGANA LETTER TO and U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK, I think Vincent is correct. – Hugh Nov 24 '10 at 22:06
  • 1
    GitX seems to handle this even though git itself doesn't. Maybe you can poke around its source code and see what it's doing. https://github.com/pieter/gitx – rwilliams Nov 26 '10 at 02:01
  • @mklhmnn: Have you checked out libgit2? http://libgit2.github.com/ – rwilliams Nov 26 '10 at 02:08
  • 3
    What should I do with libgit2? I'm a Git *user*, not a Git *developer*. – Mot Nov 26 '10 at 13:20
  • Are you sure the problem is with git? When you pass in the parameter *.txt, the wildcard is expanded by the shell (bash) before it is passed to git. The shell probably adds quoting, maybe quoting the name passed would help? Also, try to compare the contents of the .git directory after a successful "git add" and after an unsuccessful "git add". – Gintautas Miliauskas Nov 29 '10 at 17:04
  • @Gintautas: The shell does globbing, but doesn't add anything. Quoting happens only at the shell level in Unix, and normally no application ever receives a quoted string. If Vincent is right, it should work if you use globbing *or* if you use completion, but *not* if you type the NFC name directly. – Philipp Jan 13 '11 at 09:30
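
The decomposed/precomposed distinction raised in these comments can be made visible at the byte level. The following is a minimal sketch, assuming a POSIX shell with `printf` and `od`; the variable names are illustrative, and the octal bytes for the decomposed form are copied from the `git status` output in the question.

```shell
# First character of the file name as reported by git status (decomposed, NFD):
# U+3068 HIRAGANA LETTER TO followed by
# U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK.
nfd=$(printf '\343\201\250\343\202\231')

# The same character in precomposed form (NFC): U+3069 HIRAGANA LETTER DO.
# Assumption: this is what an input method would typically produce.
nfc=$(printf '\343\201\251')

# Both forms render as the same glyph, but the byte sequences differ,
# so a byte-wise path comparison treats them as two different names.
if [ "$nfd" = "$nfc" ]; then
    result=same
else
    result=different
fi
echo "$result"

# Dump the raw bytes of each form (in octal) for inspection.
printf '%s' "$nfd" | od -An -to1
printf '%s' "$nfc" | od -An -to1
```

If the name typed on the command line is in one form and the name on disk is in the other, a tool that compares paths byte by byte would not consider them equal, which would be consistent with the explicit `git add` being ignored while the shell-expanded `*.txt` works.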

1 Answer

92

Git quotes any non-ASCII character in file names by default, not only Asian ones. There's an option to disable this quoting behaviour.

You can disable it using the following command:

git config --global core.quotepath false

Alternatively, add the following snippet to your git config file (usually $HOME/.gitconfig):

[core]
    quotepath = false

After this, git should show your filenames exactly as they are.
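
To see the effect without touching a real repository or the global configuration, here is a small sketch, assuming a reasonably recent Git (one that supports `-C`) and a POSIX shell. The scratch directory and the file name `café.txt` are made up for illustration, and the setting is applied per repository rather than with `--global`.

```shell
# Scratch repository so the experiment does not touch real data.
repo=$(mktemp -d)
git init -q "$repo"

# Create an empty file whose name contains "é" (UTF-8 bytes \303\251).
name=$(printf 'caf\303\251.txt')
: > "$repo/$name"

# With quoting on (the default), non-ASCII bytes are octal-escaped
# and the whole name is wrapped in double quotes.
quoted=$(git -C "$repo" -c core.quotepath=true status --porcelain)

# Turn quoting off for this repository only, then ask again.
git -C "$repo" config core.quotepath false
plain=$(git -C "$repo" status --porcelain)

printf '%s\n' "$quoted" "$plain"

# Clean up the scratch repository.
rm -rf "$repo"
```

The first line printed should contain backslash escapes, and the second should show the name as typed. Note that this only changes how names are displayed; it does not change which bytes Git stores or compares.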

As for your other problem, Git not adding a file with Asian characters in its name, I can only guess that the encoding Git uses differs from the encoding your terminal uses. I hope someone else can jump in and explain that part.

Dave Vogt
  • In my limited testing (using Git 1.7.3.2 on Ubuntu), once I disabled `core.quotepath`, git would display the filenames as expected. Also, even with `core.quotepath` enabled, explicitly adding would update the index as expected, so the original problem may have been a bug that got fixed in newer versions of Git. – Emil Sit Dec 13 '10 at 17:00
  • Do you know whether this quoting can be disabled by setting an environment variable or passing a command line parameter to Git? – Mot Jul 30 '11 at 06:47
  • 1
    My problem is slightly different (an accénted character, not an Asian one), but this advice did not work for me. – John Scipione Apr 09 '13 at 03:32
  • You can also use `-c core.quotepath=false` to disable it for a single `git` invocation. For example `git -c core.quotepath=false show`. – Jeremy Kao Feb 23 '18 at 06:58
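
For the question of disabling quoting from the command line, and for driving Git from an application, the following sketch may help. It assumes a reasonably recent Git and a POSIX shell; the scratch repository and the file name are hypothetical. Besides the `-c` override mentioned above, it uses `-z`, which makes commands such as `git ls-files` print NUL-terminated paths verbatim, independent of `core.quotepath`.

```shell
# Scratch repository with one untracked file whose name contains "é".
repo=$(mktemp -d)
git init -q "$repo"
name=$(printf 'caf\303\251.txt')
: > "$repo/$name"

# Per-invocation override: nothing is written to any config file.
once=$(git -C "$repo" -c core.quotepath=false status --porcelain)

# -z prints paths byte-for-byte, even with quoting switched on;
# tr turns the NUL terminators into newlines so the shell can hold the result.
raw=$(git -C "$repo" -c core.quotepath=true ls-files --others -z | tr '\0' '\n')

# Feed the byte-exact name that Git reported back to git add,
# instead of retyping it (possibly in another normalization form).
git -C "$repo" add -- "$raw"
staged=$(git -C "$repo" ls-files -z | tr '\0' '\n')

printf '%s\n' "$once" "$raw" "$staged"

# Clean up the scratch repository.
rm -rf "$repo"
```

An application that obtains paths this way passes Git exactly the bytes Git itself reported, which should sidestep both the display quoting and any mismatch between typed and stored forms of the name. This is an untested suggestion for the OS X setup in the question, not a confirmed fix.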