<filter>
    <filter-name>encodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>

1) How can I enable UTF-16 encoding in the Spring filter? I need to take UTF-16 or UCS-2 characters from a text box in a JSP, pass them to a Spring controller, and insert them into MySQL.

2) How can I insert UTF-16 data into MySQL?

I am configuring the JDBC URL like this:

"?useUnicode=yes&characterEncoding=UTF-8"

My column schema is:

`utf16` varchar(150) CHARACTER SET utf16 COLLATE utf16_unicode_ci NOT NULL,
Álvaro González

1 Answer


I see no reason to use CHARACTER SET utf16 or ucs2. What is your rationale for doing so?

CHARACTER SET utf8mb4 is the standard going forward. It is MySQL's name for what is called UTF-8 outside MySQL, which is the encoding your connection string "?useUnicode=yes&characterEncoding=UTF-8" already declares.

If you really need CHARACTER SET utf16 in the table, then it should work as presented: MySQL converts from the declared client encoding to the column's character set. And yes, utf16 and utf8 are different encodings.
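
To make the "different encodings, same text" point concrete, here is a minimal sketch in plain Java. It does not talk to MySQL; it only imitates the kind of conversion described above by decoding UTF-8 bytes and re-encoding them as big-endian UTF-16. The class and method names (`EncodingDemo`, `utf8ToUtf16`) are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Decode bytes that are known to be UTF-8, then re-encode the same text
    // as big-endian UTF-16 without a byte-order mark. This only illustrates
    // a client-to-column conversion; it is not MySQL's implementation.
    static byte[] utf8ToUtf16(byte[] utf8Bytes) {
        String decoded = new String(utf8Bytes, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String text = "\u20AC"; // EURO SIGN, U+20AC
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8); // E2 82 AC
        byte[] utf16 = utf8ToUtf16(utf8);                    // 20 AC
        System.out.println(utf8.length + " bytes as UTF-8, "
                + utf16.length + " bytes as UTF-16");
        // prints "3 bytes as UTF-8, 2 bytes as UTF-16"
    }
}
```

The character is the same on both sides; only the byte representation changes, which is why the connection can stay on UTF-8 while the column is declared utf16.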

Rick James
  • 'I see no reason to use CHARACTER SET utf16 or ucs2' - sometimes you have to work with what you have – msangel Nov 16 '20 at 17:16
  • @msangel - Declare the _client_ to be using such a charset, but declare the _columns_ to be utf8. – Rick James Nov 16 '20 at 22:11
  • terrifying fact: some legacy databases do not support UTF8 – msangel Nov 17 '20 at 14:35
  • According to https://en.wikipedia.org/wiki/UTF-16#History, ucs2 was developed in the late 1980s (originally named "unicode"), while utf16 and utf8 came in the 1990s, so there is almost a 10-year difference – msangel Nov 17 '20 at 15:10
  • @msangel - But you don't have to keep using ucs2. As you migrate to a modern database, convert to utf8. MySQL makes that easy by letting you specify that the _input_ is encoded ucs2 and the _dataset_ is encoded utf8. – Rick James Nov 17 '20 at 17:10
  • This database holds terabytes of important historical data and too many services depend on it. Even if migrating to a more modern one is a good idea, doing so would cost the company a lot of time and money. As a programmer, making such a decision is not my responsibility. I only have a requirement: store some data in some database. And I just have to deal with the fact that this database is legacy. – msangel Nov 17 '20 at 17:55
  • @msangel - I guess I am lost -- Are you trying to _read_ the data? _Update_ the data? _Add_ to the dataset? Or what? – Rick James Nov 17 '20 at 18:00
  • I need to insert data into a legacy database, and that database throws an exception if a text field contains code points outside the ucs2 range. ucs2 is limited to about 65k code points; utf-16 covers about 1112k. ucs2 is not utf-16. – msangel Nov 17 '20 at 18:13
  • @msangel - Can you provide a specific character that illustrates the problem. Maybe I can find a workaround. – Rick James Nov 17 '20 at 23:14
  • A workaround has already been found here: https://stackoverflow.com/questions/64862893/transform-utf8-string-to-ucs-2-with-replace-invalid-characters-in-java – msangel Nov 18 '20 at 05:27
  • That link shows the need for `utf8mb4` in MySQL; `utf8` will not suffice. Was the real problem about ucs2? Or about Emoji? – Rick James Jul 26 '22 at 15:49
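
The ucs2 limit discussed in the comments can be sketched in Java as a check for code points above U+FFFF, with an optional replacement step before the insert. This is only an illustration under stated assumptions: the names `Ucs2Check`, `fitsInUcs2` and `replaceNonUcs2` are hypothetical, and this is not the code from the linked workaround.

```java
public class Ucs2Check {
    // True when every code point lies in the Basic Multilingual Plane
    // (U+0000..U+FFFF), i.e. the text needs no surrogate pairs.
    static boolean fitsInUcs2(String s) {
        return s.codePoints().noneMatch(Character::isSupplementaryCodePoint);
    }

    // Replace each code point above U+FFFF (emoji, for example) with a
    // single placeholder character so the result fits in a ucs2 column.
    static String replaceNonUcs2(String s, char replacement) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints().forEach(cp -> {
            if (Character.isSupplementaryCodePoint(cp)) {
                sb.append(replacement);
            } else {
                sb.append((char) cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "hi \uD83D\uDE00"; // "hi " followed by U+1F600
        System.out.println(fitsInUcs2(input));          // prints "false"
        System.out.println(replaceNonUcs2(input, '?')); // prints "hi ?"
    }
}
```

Whether to reject such input or replace it silently is a design choice; replacement loses data, so validating and reporting the problem to the user may be preferable where that is possible.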