Does the PostgreSQL JDBC driver have a way to set the client_encoding used to connect to the database?

I am using Spring (with JPA) and the connection information is in the application.properties file:

spring.datasource.url=jdbc:postgresql://mypgserver.com:5432/mydb?user=myuser&password=mypass&characterEncoding=UTF-8

But reading the JDBC docs at https://jdbc.postgresql.org/documentation/head/connect.html, I didn't find a parameter named characterEncoding.

In fact, there is no parameter for this purpose in the docs.

How can I set the encoding to be used when inputting data to the PG server?

tonyfarney
  • You are reading the wrong doc! You talk about `spring.datasource.url`, which is a Spring property, so you should look at the Spring docs, not the PG docs. Have you looked at [this Q/A](https://stackoverflow.com/questions/38677740/spring-data-jpa-utf-8-encoding-not-working)? – Abdelghani Roussi Oct 19 '20 at 22:35
  • @AbdelghaniRoussi the property contains a JDBC connection. Are you sure I'm reading the wrong doc? Can you send me a link to the right one? – tonyfarney Oct 19 '20 at 23:03
  • The interpretation of a JDBC url is done by the driver, so I think you are reading the correct doc. – Jens Schauder Oct 20 '20 at 06:12
  • @AbdelghaniRoussi That question is about MySQL, not PostgreSQL. Connection properties are driver specific (except the two properties `user` and `password` defined in the JDBC API), so the OP was looking at the correct documentation. – Mark Rotteveel Oct 20 '20 at 07:25

1 Answer

Since Java uses a Unicode encoding (UTF-16) internally, it would be unnatural to use a client_encoding other than UTF8 in the PostgreSQL JDBC driver.

Consequently, the driver forces client_encoding to that value; see org.postgresql.core.v3.ConnectionFactoryImpl.getParametersForStartup:

private List<String[]> getParametersForStartup(String user, String database, Properties info) {
  List<String[]> paramList = new ArrayList<String[]>();
  paramList.add(new String[]{"user", user});
  paramList.add(new String[]{"database", database});
  paramList.add(new String[]{"client_encoding", "UTF8"});
  paramList.add(new String[]{"DateStyle", "ISO"});
  [...]
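
Because the driver sends client_encoding itself at startup, the URL from the question presumably needs no encoding parameter at all. A minimal sketch of the setting, assuming the driver simply ignores (or never needed) characterEncoding; host, database, and credentials are the placeholders from the question:

```properties
# characterEncoding dropped: the driver already sets client_encoding to UTF8 at startup
spring.datasource.url=jdbc:postgresql://mypgserver.com:5432/mydb?user=myuser&password=mypass
```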

In fact, if the client encoding is changed to anything else, the JDBC driver expresses its unhappiness in no uncertain terms:

public void receiveParameterStatus() throws IOException, SQLException {
  // ParameterStatus
  pgStream.receiveInteger4(); // MESSAGE SIZE
  String name = pgStream.receiveString();
  String value = pgStream.receiveString();

  [...]

  if (name.equals("client_encoding")) {
    if (allowEncodingChanges) {
      if (!value.equalsIgnoreCase("UTF8") && !value.equalsIgnoreCase("UTF-8")) {
        LOGGER.log(Level.FINE,
            "pgjdbc expects client_encoding to be UTF8 for proper operation. Actual encoding is {0}",
            value);
      }
      pgStream.setEncoding(Encoding.getDatabaseEncoding(value));
    } else if (!value.equalsIgnoreCase("UTF8") && !value.equalsIgnoreCase("UTF-8")) {
      close(); // we're screwed now; we can't trust any subsequent string.
      throw new PSQLException(GT.tr(
          "The server''s client_encoding parameter was changed to {0}. The JDBC driver requires client_encoding to be UTF8 for correct operation.",
          value), PSQLState.CONNECTION_FAILURE);

    }
  }

You probably have an encoding conversion problem when you read the data into your Java program; try and fix the problem there.
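
To illustrate what such a conversion problem on the Java side can look like, here is a minimal sketch that needs no database. The class name EncodingDemo and the helper decode are made up for this example. It decodes the same UTF-8 bytes once with the correct charset and once with a wrong one, which is the kind of mistake that tends to happen when a file or request body is read with the wrong charset before the strings ever reach JDBC:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: decode raw bytes with an explicitly chosen charset
    // instead of relying on the platform default.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in round-trips cleanly.
        String correct = decode(utf8Bytes, StandardCharsets.UTF_8);

        // Decoding the same bytes as ISO-8859-1 turns the two-byte sequence
        // for "é" into two separate characters.
        String garbled = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(correct.equals(original)); // true
        System.out.println(garbled.equals(original)); // false
        System.out.println(garbled.length());         // 5 instead of 4
    }
}
```

If the strings are already garbled like this before they are handed to a PreparedStatement, the driver will faithfully store the garbled text, so the place to look is wherever the bytes are first turned into a String.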

Laurenz Albe
  • 1
    Java used to use `UCS-2` in the beginning. From 5.0, Java uses `UTF-16` as the internal encoding. – Thiyanesh Feb 24 '21 at 05:33
  • Thank you and please find the reference: `The native character encoding of the Java programming language is UTF-16. A charset in the Java platform therefore defines a mapping between sequences of sixteen-bit UTF-16 code units (that is, sequences of chars) and sequences of bytes.`[Charset](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/nio/charset/Charset.html). – Thiyanesh Feb 24 '21 at 07:01
  • Thanks again for the good faith. Your current answer is perfectly good :-) – Thiyanesh Feb 24 '21 at 07:07