Is this a bug, or am I doing something wrong? I am trying to match Russian swear words in a multiplayer game chat log on CentOS 6.5 with the stock perl 5.10.1:

# echo блядь | perl -ne 'print if /\bбля/'

# echo блядь | perl -ne 'print if /бля/'
блядь

# echo $LANG
en_US.UTF-8

Why doesn't the first command print anything?

Alexander Farber
  • What about `u` Unicode [modifier](http://perldoc.perl.org/perlre.html#Modifiers) `'print if /\bбля/u'` not familiar with perl, just a guess. – Jonny 5 Sep 26 '14 at 07:39
  • Unfortunately `/u` doesn't work: *Bareword found where operator expected at -e line 1, near "/\bхуй/u"* – Alexander Farber Sep 26 '14 at 07:41

1 Answer

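You have to tell Perl that the source code contains UTF-8 (`use utf8`), and that the input (`-CI`) and output (`-CO`) are UTF-8 encoded. Otherwise Perl treats both the pattern and the input as raw bytes, and the bytes that encode Cyrillic letters do not count as word characters, so `\b` finds no word boundary in front of them: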

echo 'помёт' | perl -CIO -ne 'use utf8; print if /\bпомё/'
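
The same idea can be wrapped in a small shell function, which also makes it easy to check that `\b` really anchors at the start of a word. This is only a sketch: `match_prefix` is a hypothetical helper name, `-Mutf8` is used as the command-line equivalent of `use utf8;`, and `-CS` (STDIN, STDOUT and STDERR as UTF-8) stands in for `-CIO`. The second input, `напомёт`, is a made-up string used only to show a non-match.

```shell
# Hypothetical helper: print the lines from stdin in which a word starts with "помё".
# -Mutf8 : the pattern in the one-liner is UTF-8 source code
# -CS    : STDIN, STDOUT and STDERR are UTF-8 encoded
match_prefix() {
    perl -CS -Mutf8 -ne 'print if /\bпомё/'
}

# Word starts with the prefix, so the line is printed.
echo 'помёт' | match_prefix

# Prefix appears only in the middle of a word, so \b does not match and nothing is printed.
echo 'напомёт' | match_prefix
```

If the text comes from a file rather than a pipe, `-CS` does not cover it; adding the `D` flag (for example `-CSD`) makes UTF-8 the default layer for opened files as well.
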
choroba