
My data-config.xml looks like this. The file is encoded as UTF-8 with a BOM.

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="JdbcDataSource" 
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://192.168.0.2/dasaran_old" 
              user="root" 
              password=""
              encoding="UTF-8"/>
  <document>
    <entity name="user" 
            query="SELECT CONCAT_WS('_', 1, u.`id`) AS id, u.`id` AS entity_id, 1 AS entity_type, fullname AS title, CONCAT_WS(' ', 'Դպրոց՝ ', s.title, 'դաս.՝', cl.title) AS description FROM das_user u INNER JOIN das_ref_student_to_class_to_school sts ON u.id = sts.student_id INNER JOIN das_school s ON sts.school_id = s.id INNER JOIN das_classes cl ON sts.class_id = cl.id WHERE u.role = 'student'">
    </entity>
  </document>
</dataConfig>

The Unicode data extracted from MySQL is fine, but the Unicode characters written in the query itself are not being inserted into the Solr index as Unicode.

I'm getting documents like this:

<doc>
<str name="description">?????? 65 ???.? 5-1</str>
<int name="entity_id">18126</int>
<int name="entity_type">1</int>
<str name="general">Ռուբեն Վարդանյան Արմենի</str>
<str name="id">[B@1bc6e3ce</str>
<str name="title">Ռուբեն Վարդանյան Արմենի</str>
</doc>
Tigran Tokmajyan

1 Answer


I don't think the encoding of data-config.xml has anything to do with the encoding used on the JDBC connection. You should specify the connection encoding as a URL parameter; see details in this question.

The parameter is:

jdbc:mysql://localhost:3306/administer?characterEncoding=utf8
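Applied to the configuration in the question, this might look like the sketch below. The host and database name are taken from the question. The `useUnicode=true` parameter is an addition of mine that is commonly paired with `characterEncoding` in MySQL Connector/J; treat it as an assumption, not something the question confirms is needed. Note that inside an XML attribute the `&` between URL parameters has to be written as `&amp;`.

```xml
<!-- Sketch only: same dataSource as in the question, with the connection
     encoding passed as JDBC URL parameters. useUnicode=true is an assumed extra. -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://192.168.0.2/dasaran_old?useUnicode=true&amp;characterEncoding=utf8"
            user="root"
            password=""
            encoding="UTF-8"/>
```

After changing the URL, a full re-import is needed before the `description` field can be checked again.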
Persimmonium