I want to find changes in my commits, and the word I'm looking for has an accent: Línea.

git log -p -SLínea returns no result.

How can I escape special characters in the git pickaxe option when the file is ANSI-encoded?

Notes:
* I'm using Git Bash for Windows (Portable Git version 1.9.5)
* The file is ANSI-encoded. I tested with a UTF-8 encoded file and the search works.


Edit: @LeGEC's solution with regular expressions works: git log -S"Li.nea" --pickaxe-regex

Now I wonder whether I can escape special characters, or whether I have to use a regex and replace every special character with a dot (.).

Juan Antonio Tubío
  • Have you tried simply typing the character in your shell? If the shell handles UTF-8 correctly and the text is in UTF-8, this should work. It works for me (looking for `é` using bash/linux). – LeGEC Apr 24 '15 at 12:23
  • Note that there are two ways to encode Línea in UTF-8: using the precomposed character `í`, or using a plain `i` followed by the combining acute accent. – choroba Apr 24 '15 at 12:27
  • I see. Does [this question](http://stackoverflow.com/questions/602912/how-do-you-echo-a-4-digit-unicode-character-in-bash) help ? A quick hack may be to look for `-S"Li.nea"` – LeGEC Apr 24 '15 at 12:37
  • @LeGEC, @choroba I can type 'í' in my shell. The problem seems to be that the file is ANSI-encoded. @LeGEC, I guess you meant `-S"L.nea"`, but the -S option does not support regular expressions – Juan Antonio Tubío Apr 24 '15 at 13:20

1 Answer

A quick hack is to look for "Li.nea" (i followed by any character) which can catch other words, but will roughly work.

This answer worked for me (bash / linux) to type specific Unicode points in the shell (Ctrl+Shift+U, 0, 3, 0, 1).

I tried the following with plain grep:

$ echo -e "Li\xCC\x81nea"
Línea
$ echo -e "Li\xCC\x81nea" | grep "Linea" 
# no match here
$ echo -e "Li\xCC\x81nea" | grep "Línea" # I typed "i" then "Ctrl+Shift+U, 0, 3, 0, 1"
Línea     # match
$ echo -e "Li\xCC\x81nea" | grep "Li.nea"
Línea     # match

For the regex part: git log -S"Li.nea" --pickaxe-regex
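
To see why these searches behave differently, it may help to dump the actual bytes of each spelling. The following is a minimal sketch, not part of the original test: the helper names (hex_of, utf8_precomposed, utf8_decomposed, latin1) are made up for illustration, and it assumes od, tr and sed are available, as they normally are in Git Bash and on Linux.

```shell
#!/bin/sh
# Dump "Línea" as hex in three encodings.
# Octal escapes are used because POSIX printf does not guarantee \xHH.

hex_of() {
    # Print the bytes read from stdin as space-separated hex pairs.
    od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# Hypothetical helpers, one per encoding of the same word.
utf8_precomposed() { printf 'L\303\255nea'; }   # í as U+00ED       -> c3 ad
utf8_decomposed()  { printf 'Li\314\201nea'; }  # i + U+0301        -> 69 cc 81
latin1()           { printf 'L\355nea'; }       # ISO-8859-1 / ANSI -> ed

utf8_precomposed | hex_of; echo   # 4c c3 ad 6e 65 61
utf8_decomposed  | hex_of; echo   # 4c 69 cc 81 6e 65 61
latin1           | hex_of; echo   # 4c ed 6e 65 61
```

Only the decomposed form contains a literal "i", which is why "Li.nea" matches the echo test above, while a file that stores í as a single byte needs a pattern such as "L.nea".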


If the file is ISO-8859-1 encoded, you should probably search for the exact sequence of bytes stored in the file. In ISO-8859-1, í is the single byte 0xED:

$ word=$(echo -e "L\xEDnea") # ISO-8859-1 bytes for Línea
$ echo "$word"
L�nea  # expected: the shell tries to print UTF-8 characters, and a lone 0xED is not valid UTF-8
$ git log -S"$word" # hopefully works?
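
Instead of writing the byte by hand, the search term could be re-encoded from what the shell produces. The sketch below assumes iconv is installed and that the file really is ISO-8859-1 (or Windows-1252, where í is also 0xED); to_latin1 and matches are hypothetical helper names, and the final git command is left commented out because it has not been verified against an ANSI-encoded repository.

```shell
#!/bin/sh
# Re-encode a UTF-8 search term so it matches the bytes stored in the file.

to_latin1() {
    # Convert the UTF-8 text in $1 to ISO-8859-1 bytes (assumes iconv exists).
    printf '%s' "$1" | iconv -f UTF-8 -t ISO-8859-1
}

matches() {
    # Succeed if needle $2 occurs in text $1, compared byte by byte.
    # LC_ALL=C keeps grep from interpreting the input as UTF-8.
    printf '%s\n' "$1" | LC_ALL=C grep -qF -- "$2"
}

needle_utf8=$(printf 'L\303\255nea')        # "Línea" as typed in a UTF-8 shell
needle_latin1=$(to_latin1 "$needle_utf8")   # í becomes the single byte 0xED
sample=$(printf 'L\355nea')                 # stands in for a line of the ANSI file

matches "$sample" "$needle_utf8"   && echo "UTF-8 needle: match"   || echo "UTF-8 needle: no match"
matches "$sample" "$needle_latin1" && echo "Latin-1 needle: match" || echo "Latin-1 needle: no match"

# Assumption, not verified here: the converted bytes can be passed to the pickaxe.
# git log -p -S"$needle_latin1"
```

With the sample line above, only the converted needle matches, which is consistent with the search working on the UTF-8 file but not on the ANSI one.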
LeGEC
  • Thanks, the regex works with `git log -S"L.nea" --pickaxe-regex` (the dot replaces í). Unicode still doesn't work, probably because in ANSI files every character is a single byte, while with Unicode I'm searching for more than one byte. – Juan Antonio Tubío Apr 24 '15 at 13:47