I'd like to detect Chinese characters in a Redshift (PostgreSQL-based) database using a SQL query.

A regex-based answer is acceptable, since I can use `regexp_instr`.

I think that this will detect non-English characters: `where regexp_instr(column, '[^[:print:]]') > 0`

Is there a similar pattern that matches only Chinese characters?

Cauder
  • If you have a database with Chinese characters, you can try it to see if it works. – Luuk Jul 22 '20 at 17:51
  • I tried the code in the question and it captures all non-English characters, such as Cyrillic. I'd like to focus specifically on Chinese characters. – Cauder Jul 22 '20 at 17:52
  • Can you please see if this pattern works for you? `regexp_instr(column, '[\u4E00-\u9FA5]') > 0` It works with `regexp_matches` in PostgreSQL. Sorry, I do not have any Redshift access right now. – Mike Organek Jul 22 '20 at 18:04
  • It appears to capture Chinese characters, but not exclusively Chinese characters. I'm also getting ordinary results like `dog` and `cat` in my output. – Cauder Jul 22 '20 at 18:14
  • 嘗試舉一個您想要實現的目標的例子 (try to give an example of what you would like to achieve), and please use [edit](https://stackoverflow.com/posts/63040322/edit) to add this info to your question. – Luuk Jul 22 '20 at 18:22
  • https://stackoverflow.com/questions/47553953/how-do-i-detect-chinese-characters-in-postgresql – ecp Jul 24 '20 at 10:37
  • Looks like a combo of that and this https://stackoverflow.com/questions/1366068/whats-the-complete-range-for-chinese-characters-in-unicode. Mind adding that as an answer so I can mark it correct? – Cauder Jul 24 '20 at 15:10
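
The range suggested in the comments, U+4E00 to U+9FA5, can be sanity-checked outside the database. The following is a minimal sketch in Python, not Redshift SQL: the function name `contains_chinese` and the sample strings are made up for illustration, and Redshift's regex dialect may treat `\u` escapes differently from Python. One possible reason for the `dog` and `cat` matches reported above is that the engine does not interpret `\u` and reads the bracket expression as literal characters and ranges, but that is a guess and has not been verified on Redshift.

```python
import re

# Range taken from the comment thread: U+4E00..U+9FA5, part of the
# CJK Unified Ideographs block. Python expands the \u escapes in this
# string literal before the regex engine sees the pattern.
CJK_PATTERN = re.compile("[\u4e00-\u9fa5]")


def contains_chinese(text: str) -> bool:
    """Return True if text has at least one character in U+4E00..U+9FA5.

    Hypothetical helper for illustration only.
    """
    return CJK_PATTERN.search(text) is not None


if __name__ == "__main__":
    # Sample strings are illustrative, not taken from any real table.
    for sample in ["dog", "cat", "привет", "嘗試", "mixed 中文 text"]:
        print(repr(sample), contains_chinese(sample))
```

When the range is interpreted as real code points, plain ASCII and Cyrillic strings do not match. Note that the same range also covers Han characters used in Japanese, so it cannot tell the two languages apart.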

0 Answers