
My system runs on Mandriva Linux, and the RDBMS is MySQL 5. I need the database and its tables to be created in UTF-8.

Here is a fragment of hibernate.cfg.xml -

... 
 <property name="hibernate.hbm2ddl.auto">create-drop</property>   
 <property name="hibernate.dialect">org.hibernate.dialect.MySQLDialect</property>
 <property name="hibernate.connection.characterEncoding">utf8</property> 
...

my.cnf -

# The MySQL server
[mysqld]
...
default-character-set=cp1251
character-set-server=cp1251
collation-server=cp1251_general_ci
init-connect="SET NAMES cp1251"
skip-character-set-client-handshake
...
[mysqldump]
...    
default-character-set=cp1251
...

Some class, for example -

@Entity
@Table(name = "USER")
public class User {
    @Id 
    @Column(name = "USERID")
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;

    @Column(name = "USERNAME")
    private String name;

    @Column(name = "USERPASSWORD")
    private String password;

    @Column(name = "USERIP")
    private String ip;
        // getter's and setter's here
        ...

But when the tables are generated, I see that their encoding is latin1. For example -

SHOW CREATE TABLE USER;

USER  | CREATE TABLE `user` (
  `USERID` int(11) NOT NULL auto_increment,
  `USERIP` varchar(255) default NULL,
  `USERNAME` varchar(255) default NULL,
  `USERPASSWORD` varchar(255) default NULL,
  PRIMARY KEY  (`USERID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |

How can I change the encoding to UTF-8?

I would be most grateful for the information! Thank you!

...

This is strange. I have changed everything to utf8 -

# The MySQL server
    [mysqld]
    ...
    default-character-set=utf8
    character-set-server=utf8
    collation-server=utf8_general_ci
    init-connect="SET NAMES utf8"
    skip-character-set-client-handshake
    ...
    [mysqldump]
    ...    
    default-character-set=utf8
    ...

And now -

SHOW CREATE TABLE USER;

USER  | CREATE TABLE `USER` (
  `USERID` int(11) NOT NULL auto_increment,
  `USERIP` varchar(255) default NULL,
  `USERNAME` varchar(255) default NULL,
  `USERPASSWORD` varchar(255) default NULL,
  PRIMARY KEY  (`USERID`)
) ENGINE=MyISAM DEFAULT CHARSET=cp1251 |
tripleee
  • my.cnf : default-character-set=cp1251 ? change it to utf8 ! – rkosegi Oct 25 '12 at 07:22
    This was useful to me to find out what character set and collation I ended up with after various defaults are applied `SELECT CHARACTER_SET_NAME, COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA='' AND TABLE_NAME='' AND COLUMN_NAME='';` – Emanuel George Hategan Dec 06 '18 at 10:28

9 Answers


You can also create the database with an explicit encoding. The simplest way is to use phpMyAdmin for the database/table creation.

To make the connection itself use UTF-8, specify these parameters in the connection URL of the Hibernate settings:

<!-- Database Settings -->
<property name="connection.driver_class">com.mysql.jdbc.Driver</property>
<!--  for performance reasons changed to MyISAM from org.hibernate.dialect.MySQLInnoDBDialect -->
<property name="dialect">org.openmeetings.app.hibernate.utils.MySQL5MyISAMDialect</property>
<property name="connection.url">jdbc:mysql://localhost/openmeetings?autoReconnect=true&amp;useUnicode=true&amp;createDatabaseIfNotExist=true&amp;characterEncoding=utf-8</property>    

<property name="hibernate.connection.CharSet">utf8</property>
<property name="hibernate.connection.characterEncoding">utf8</property>
<property name="hibernate.connection.useUnicode">true</property>

You don't need to set the default encoding of the whole MySQL server to utf8, unless you are using

<!-- Database Scheme Auto Update -->
<property name="hbm2ddl.auto">update</property>   

In that case you WILL have to set the default encoding of MySQL to utf8, because hbm2ddl creates the tables with the default encoding of the database.

You can still use hbm2ddl.auto and then alter the tables manually to give them a utf8 character set and collation.

If you are not using hbm2ddl.auto, you can simply create the tables with whatever encoding you like. There is no need to set the database to a special encoding.
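As a rough illustration of altering the tables manually, the sketch below generates one `ALTER TABLE ... CONVERT TO CHARACTER SET` statement per table. The class and method names are hypothetical helpers, not part of Hibernate or MySQL, and the table name, charset and collation are simply the values that appear in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds statements that convert existing tables to one character set. */
final class Utf8TableConverter {

    /** Returns one "ALTER TABLE ... CONVERT TO CHARACTER SET ..." statement per table name. */
    static List<String> conversionStatements(List<String> tables, String charset, String collation) {
        List<String> statements = new ArrayList<>();
        for (String table : tables) {
            // Backticks quote the identifier; a backtick inside a name is doubled.
            String quoted = "`" + table.replace("`", "``") + "`";
            statements.add("ALTER TABLE " + quoted
                    + " CONVERT TO CHARACTER SET " + charset
                    + " COLLATE " + collation);
        }
        return statements;
    }

    public static void main(String[] args) {
        // "USER" is the table from the question; adjust the list to your own schema.
        for (String sql : conversionStatements(Arrays.asList("USER"), "utf8", "utf8_general_ci")) {
            System.out.println(sql + ";");
        }
    }
}
```

The printed statements can be pasted into the mysql client. CONVERT TO rewrites the existing column data as well as the table default, so it is worth trying on a copy of the data first.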

Sebastian

Dumbo
seba.wagner
    I should have additionally noted that I am using createDatabaseIfNotExist=true in the URL params, that means that the database will be created in default encoding. That is why you need a system wide encoding in the my.cnf. – seba.wagner Oct 25 '12 at 08:12
  • It's worth noting that using "update" for hbm2ddl.auto isn't recommended for a production environment. See http://stackoverflow.com/questions/221379/hibernate-hbm2ddl-auto-update-in-production – MPV Dec 04 '13 at 12:23
  • What to do with postgreSQL database? – prem30488 Jan 18 '16 at 07:19
  • I had the same problem and Hibernate properties did not help at all since I was using a different connection pooler for creating my db connections. In my case it was the tomcat.jdbc connection pool, so the encoding had to be set in the PoolProperties of org.apache.tomcat.jdbc.pool.DataSource – ibai Jul 12 '16 at 09:56

How to change the encoding to UTF-8?

I used a local dialect class that extended the MySQLDialect and changed the table-type string:

public class LocalMysqlDialect extends MySQLDialect {
    @Override
    public String getTableTypeString() {
        return " DEFAULT CHARSET=utf8";
    }
}

I was actually extending the MySQL5InnoDBDialect type so I was really using:

public class LocalMysqlDialect extends MySQL5InnoDBDialect {
    @Override
    public String getTableTypeString() {
        return " ENGINE=InnoDB DEFAULT CHARSET=utf8";
    }
}
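To see why the override has this effect, it may help to picture the table-type string as a suffix appended directly after the closing parenthesis of the generated CREATE TABLE statement. The sketch below is a toy model of that idea and is not Hibernate code: `TableTypeSource` and `createTable` are hypothetical stand-ins. It suggests why the returned string starts with a space, and why the InnoDB variant repeats the ENGINE clause (the override replaces the whole suffix).

```java
/** Toy model (not Hibernate code) of how a dialect's table-type string ends up in the DDL. */
final class TableTypeStringDemo {

    /** Hypothetical stand-in for the single dialect method overridden above. */
    interface TableTypeSource {
        String getTableTypeString();
    }

    /** Appends the dialect suffix directly after the closing parenthesis of the column list. */
    static String createTable(String name, String columns, TableTypeSource dialect) {
        return "create table " + name + " (" + columns + ")" + dialect.getTableTypeString();
    }

    public static void main(String[] args) {
        // Same suffix as the LocalMysqlDialect that extends MySQL5InnoDBDialect.
        TableTypeSource utf8InnoDb = () -> " ENGINE=InnoDB DEFAULT CHARSET=utf8";
        System.out.println(createTable("USER", "USERID integer not null", utf8InnoDb));
    }
}
```

To use the real dialect, set `hibernate.dialect` to the fully qualified name of your `LocalMysqlDialect` class.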
Gray
  • `hibernate.connection.CharSet`, `hibernate.connection.characterEncoding` and `hibernate.connection.useUnicode` properties didn't seem to work on my side (I'm using Spring Boot 1.3.3) but extending `MySQL5InnoDBDialect` finally did the trick. – Paraita Aug 27 '18 at 07:25

Consider changing the connection url configuration like this:

<property name="hibernate.connection.url">
    jdbc:mysql://localhost/yourdatabase?useUnicode=true&amp;characterEncoding=utf8
</property>

This resolves the issue.
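One detail that is easy to trip over: the `&amp;` separator is only needed because the URL sits inside an XML file. In Java code or a .properties file the separator is a plain `&`. A minimal sketch of assembling the same URL both ways, where `MysqlUrlBuilder` and its methods are hypothetical helper names and the host and database are placeholders:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: assembles a MySQL JDBC URL from connection parameters. */
final class MysqlUrlBuilder {

    /** Joins the parameters with '?' before the first and '&' before each later one. */
    static String jdbcUrl(String host, String database, Map<String, String> params) {
        StringBuilder url = new StringBuilder("jdbc:mysql://").append(host).append('/').append(database);
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator).append(param.getKey()).append('=').append(param.getValue());
            separator = '&';
        }
        return url.toString();
    }

    /** Escapes '&' so the URL can be pasted into hibernate.cfg.xml. */
    static String escapeForXml(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("useUnicode", "true");
        params.put("characterEncoding", "utf8");

        String url = jdbcUrl("localhost", "yourdatabase", params);
        System.out.println(url);               // form for Java code or .properties files
        System.out.println(escapeForXml(url)); // form for hibernate.cfg.xml
    }
}
```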

Duc Tran
    Saved my day! Thanks! – Jagger Apr 06 '17 at 14:32
    I use Spring Boot, but your solution helped me, thanks. I added `characterEncoding=utf8` parameter to the `spring.datasource.url=jdbc:mysql://localhost:3306/mydatabase?useUnicode=true&characterEncoding=utf8&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=UTC` line in `application.properties` file. – Yamashiro Rion Feb 19 '19 at 05:27

First of all, on the Java side you should specify UTF-8 instead of utf8; refer to the table here.

Second, characterEncoding is not the character set your tables will be created in. It is only the charset used for communication with the database, i.e. when reading and writing data.

The MySQL docs say that when a table is created without an explicit charset, the database's charset is used. This means that your database (not the MySQL server) should be created like this:

create database DB_NAME character set utf8;

Afterwards, the tables in this database will be created in utf8 encoding by default. The same goes for collation.
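To check which defaults a database actually ended up with before letting Hibernate generate tables, you can query `INFORMATION_SCHEMA.SCHEMATA` over JDBC. This is only a sketch: `SchemaCharsetCheck` is a hypothetical helper name, and it assumes you already have an open `java.sql.Connection` to the MySQL server (for example via Connector/J).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper: reads the default character set and collation of one database. */
final class SchemaCharsetCheck {

    static final String QUERY =
            "SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME "
          + "FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME = ?";

    /** Returns "charset/collation" for the schema, or null if no such schema is visible. */
    static String defaultCharset(Connection connection, String schema) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(QUERY)) {
            statement.setString(1, schema);
            try (ResultSet row = statement.executeQuery()) {
                return row.next() ? row.getString(1) + "/" + row.getString(2) : null;
            }
        }
    }
}
```

Calling `SchemaCharsetCheck.defaultCharset(connection, "DB_NAME")` returns both values in one string, so a mismatch with the expected utf8 settings is easy to spot.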

That said, you shouldn't rely on Hibernate's hbm2ddl; read here for more details.

Stanislav Bashkyrtsev

I'm using Spring-Data. I've tried activating parameters in the URL:

jdbc:mysql://localhost:3306/DATABASE?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=utf-8

I've also tried the Hibernate properties, but the solution that finally worked for me is the one proposed by @Gray:

@Bean
@Autowired
public LocalContainerEntityManagerFactoryBean entityManagerFactory(DataSource dataSource) {
    HibernateJpaVendorAdapter vendorAdapter = new HibernateJpaVendorAdapter();
    vendorAdapter.setGenerateDdl(dbGenerateDdl); 
    vendorAdapter.setShowSql(dbShowSql);
    if (Arrays.asList(environment.getActiveProfiles()).contains("prod"))
        vendorAdapter.setDatabasePlatform(CustomMysqlDialect.class.getName());

    Properties jpaProperties = new Properties();
    jpaProperties.put("hibernate.connection.CharSet", "utf-8");
    jpaProperties.put("hibernate.connection.useUnicode", "true"); // Properties values should be strings
    jpaProperties.put("hibernate.connection.characterEncoding", "utf-8");

    LocalContainerEntityManagerFactoryBean factory = new LocalContainerEntityManagerFactoryBean();
    factory.setJpaVendorAdapter(vendorAdapter);
    factory.setPackagesToScan("com.example.model");
    factory.setDataSource(dataSource);
    factory.setJpaProperties(jpaProperties);

    return factory;
}

This line saved my day:

vendorAdapter.setDatabasePlatform(CustomMysqlDialect.class.getName());
camposer

For those who use Spring Boot: add the characterEncoding=utf8 parameter to the spring.datasource.url line in your application.properties file:

spring.datasource.url=jdbc:mysql://localhost:3306/mydatabase?useUnicode=true&characterEncoding=utf8&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=UTC
Yamashiro Rion

How about changing the character set and collation of the database?

ALTER DATABASE [database] CHARACTER SET utf8 COLLATE utf8_unicode_ci;

jo8937

With Spring Java config, setting the connection properties in dataSource() should help:

@Bean
public DataSource dataSource() {    
    BasicDataSource dataSource = new BasicDataSource();    
    //your username/pass props
    dataSource.setConnectionProperties("useUnicode=true;characterEncoding=utf8;characterSetResults=UTF-8;");
    return dataSource;
}

Be careful about the ';' at the end of the properties string!
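If the properties string is assembled by hand, the separators are easy to get wrong. A small sketch that builds it from a map in the same `name=value;` form as above, trailing ';' included; `ConnectionPropertiesString` is a hypothetical helper name, not part of Spring or the connection pool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: renders connection properties as "name=value;name=value;". */
final class ConnectionPropertiesString {

    /** Every entry is terminated by ';', so the result also ends with ';'. */
    static String format(Map<String, String> properties) {
        StringBuilder result = new StringBuilder();
        for (Map.Entry<String, String> property : properties.entrySet()) {
            result.append(property.getKey()).append('=').append(property.getValue()).append(';');
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>(); // keeps insertion order
        properties.put("useUnicode", "true");
        properties.put("characterEncoding", "utf8");
        properties.put("characterSetResults", "UTF-8");
        System.out.println(format(properties));
    }
}
```

The result can be passed straight to `dataSource.setConnectionProperties(...)`.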


You can use Hibernate's @Type annotation. Depending on your requirements, you can write a custom type and apply it to the field, like this:

public class PhoneNumberType implements UserType {
    @Override
    public int[] sqlTypes() {
        return new int[]{Types.INTEGER, Types.INTEGER, Types.INTEGER};
    }

    @Override
    public Class returnedClass() {
        return PhoneNumber.class;
    }

    // other methods
}

First, the nullSafeGet method:

@Override
public Object nullSafeGet(ResultSet rs, String[] names,
        SharedSessionContractImplementor session, Object owner)
        throws HibernateException, SQLException {
    int countryCode = rs.getInt(names[0]);
    if (rs.wasNull())
        return null;
    int cityCode = rs.getInt(names[1]);
    int number = rs.getInt(names[2]);
    PhoneNumber employeeNumber = new PhoneNumber(countryCode, cityCode, number);
    return employeeNumber;
}

Next, the nullSafeSet method:

@Override
public void nullSafeSet(PreparedStatement st, Object value,
        int index, SharedSessionContractImplementor session)
        throws HibernateException, SQLException {
    if (Objects.isNull(value)) {
        // All three columns of the type must be set, also for a null value.
        st.setNull(index, Types.INTEGER);
        st.setNull(index + 1, Types.INTEGER);
        st.setNull(index + 2, Types.INTEGER);
    } else {
        PhoneNumber employeeNumber = (PhoneNumber) value;
        st.setInt(index, employeeNumber.getCountryCode());
        st.setInt(index + 1, employeeNumber.getCityCode());
        st.setInt(index + 2, employeeNumber.getNumber());
    }
}

Finally, we can declare our custom PhoneNumberType in our OfficeEmployee entity class:

@Entity
@Table(name = "OfficeEmployee")
public class OfficeEmployee {

    @Columns(columns = { @Column(name = "country_code"),
        @Column(name = "city_code"), @Column(name = "number") })
    @Type(type = "com.baeldung.hibernate.customtypes.PhoneNumberType")
    private PhoneNumber employeeNumber;

    // other fields and methods
}

This might solve your problem, and it works with any database. For more information, see https://www.baeldung.com/hibernate-custom-types. In the same way, you would do the UTF-8 encoding/decoding and the ISO-8859-1 decoding/encoding inside the custom type.

TryChai