Using v9.6.2 (didn't try other versions).
First one works. Second one with a Kanji word fails. What gives?
dev=> select 'foo bar' ~ '\ybar\y' v;
v
---
t
(1 row)
dev=> select '積極的 積極的' ~ '\y積極的\y' v;
v
---
f
(1 row)
It'll work if you don't enclose the pattern in \y:
SELECT '積極的 積極的' ~ '積極的' AS v;
v
---
t
(1 row)
regexp_matches will work too:
SELECT regexp_matches('積極的 積極的', '^.*(積極的).*$') AS v;
v
----------
{積極的}
(1 row)
[UPDATE]
Kanji and other CJK characters lie outside ASCII, and not every regex engine treats non-ASCII characters as word characters when it evaluates word boundaries. I suppose the regex engine in PostgreSQL 9.6.2 doesn't support Unicode word boundaries.
Some programming languages, such as Scala (and Java, whose regex engine Scala uses), do support word boundaries around Unicode text:
scala> """\b積極的\b""".r findFirstIn "積極的 積極的"
res4: Option[String] = Some(積極的)
Note that \b, not \y, is used for word boundaries in Scala/Java.
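For reference, here is a rough Java equivalent of the Scala one-liner above. It is only a sketch: the class name UnicodeWordBoundary and the helper findWord are made up for illustration. It sets Pattern.UNICODE_CHARACTER_CLASS explicitly because, as far as I know, whether a bare \b treats non-ASCII letters as word characters depends on the JDK version, and the flag asks for Unicode-aware behaviour either way.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeWordBoundary {
    // Hypothetical helper: returns the first whole-word occurrence of
    // `word` in `text`, or null when there is none.
    public static String findWord(String text, String word) {
        // Pattern.quote keeps the word literal; UNICODE_CHARACTER_CLASS
        // makes \b (and \w) follow Unicode rules instead of ASCII-only ones.
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.UNICODE_CHARACTER_CLASS);
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findWord("積極的 積極的", "積極的")); // 積極的
        System.out.println(findWord("foo bar", "bar"));          // bar
        System.out.println(findWord("foobar", "bar"));           // null
    }
}
```

The last call returns null because "bar" inside "foobar" has no word boundary in front of it, which is the same check \y is meant to perform in PostgreSQL.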