I am using the following command to search for and print non-ASCII characters:

grep --color -R -C 2 -P -n "[\x80-\xFF]" .

The output prints the lines that contain non-ASCII characters. However, it does not show the actual Unicode character.

Is there a way to print the Unicode character?

Output:

./test.yml-35-
./test.yml-36-- name: Flush Handlers
./test.yml:37:  meta: flush_handlers
./test.yml-38-
--
Haris Farooqui

1 Answer

This was answered in "Searching for non-ascii characters". The real issue, as shown in "Filtering invalid utf8", is that the regular expression you are using matches single bytes, while UTF-8 is a multibyte encoding: every non-ASCII character is stored as a sequence of two to four bytes, so the pattern must cover the whole sequence.
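To illustrate the byte-versus-character distinction, here is a minimal Python sketch. The sample line is hypothetical (it is not taken from the question's file); it contains a single no-break space, U+00A0, which UTF-8 encodes as two bytes.

```python
import re

# Hypothetical sample line with one non-ASCII character (U+00A0, no-break space).
line = "meta:\u00a0flush_handlers"
raw = line.encode("utf-8")

# A byte-oriented class sees each byte of the UTF-8 sequence separately.
byte_hits = re.findall(rb"[\x80-\xff]", raw)

# A character-oriented class sees the decoded code point once.
char_hits = re.findall(r"[^\x00-\x7f]", line)

print(byte_hits)                                  # two single-byte matches
print([f"U+{ord(c):04X}" for c in char_hits])     # one character
```

The byte pattern reports two unrelated matches for what is really one character, which is why a per-byte range cannot show the character itself.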

The extensive answer by @Peter O in the latter Q/A appears to be the best one; it uses Perl. grep is the wrong tool for this.
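As an alternative sketch in the same spirit (this is not the Perl solution from the linked answer), the following Python decodes the input as UTF-8 and reports each non-ASCII character with its position, code point, and Unicode name. The function name `find_non_ascii` and the sample data are made up for illustration; invalid byte sequences are assumed to be acceptable as U+FFFD replacements.

```python
import unicodedata


def find_non_ascii(data: bytes):
    """Yield (line, column, char, code point, name) for each non-ASCII character."""
    # Assumption: input is UTF-8; undecodable bytes become U+FFFD.
    text = data.decode("utf-8", errors="replace")
    # Split on "\n" only, so Unicode line separators are reported, not swallowed.
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                name = unicodedata.name(ch, "<unnamed>")
                yield lineno, col, ch, f"U+{ord(ch):04X}", name


# Hypothetical sample; for a real file, pass open(path, "rb").read() instead.
sample = "- name: Flush Handlers\nmeta:\u00a0flush_handlers\n".encode("utf-8")
hits = list(find_non_ascii(sample))
for lineno, col, ch, codepoint, name in hits:
    print(f"{lineno}:{col}: {codepoint} {name} {ch!r}")
```

Printing the code point and name alongside the character helps when the offending character is invisible, such as a no-break space.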

Thomas Dickey