How do I find all rows of a PostgreSQL table that contain characters in some Unicode range, such as Cyrillic characters?

Henrik N

2 Answers

Figured it out! For Cyrillic:

SELECT * FROM "items" WHERE (title SIMILAR TO '%[\u0410-\u044f]%')

I got the range from http://symbolcodes.tlt.psu.edu/bylanguage/cyrillicchart.html. The range runs from А (hex code 0410) to я (hex code 044F), which are the numbers used in the pattern above.
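As a rough sketch of how the range behaves, the query below checks a few sample strings without needing the real table. It assumes a UTF8-encoded database and standard_conforming_strings = on (the default since PostgreSQL 9.1), so the backslash reaches the regex engine untouched. The sample values and the column alias has_cyrillic are made up for illustration. It uses the ~ regex operator, which matches anywhere in the string, so no % wildcards are needed. The wider range 0400-04FF is the whole Unicode Cyrillic block, which also includes Ё and ё (0401 and 0451); those fall outside 0410-044F.

```sql
-- Hypothetical sample data; swap the VALUES list for the real table.
SELECT title,
       title ~ '[\u0410-\u044F]' AS in_basic_range,  -- А to я only
       title ~ '[\u0400-\u04FF]' AS has_cyrillic     -- whole Cyrillic block
FROM (VALUES ('Hello'), ('Привет'), ('Ёж')) AS t(title);

-- The same wider check applied to the table from the question:
SELECT * FROM items WHERE title ~ '[\u0400-\u04FF]';
```

Regex ranges can depend on encoding and collation, so it is worth trying the pattern on a handful of known rows first.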

Henrik N

If you install the pgpcre extension, you can use this expression:

SELECT * FROM items WHERE title ~ pcre '\p{Cyrillic}';

Peter Eisentraut