
Using Solr 6.2.1

I have a local MySQL database that I want Solr to index for querying.

I created a core called create-test, and after I run ...dataimport?command=full-import, I get the following status:

<response>
<lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
</lst>
<lst name="initArgs">
    <lst name="defaults">
        <str name="config">osm-dih.xml</str>
    </lst>
</lst>
<str name="command">status</str>
<str name="status">idle</str>
<str name="importResponse" />
<lst name="statusMessages">
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">230750</str>
    <str name="Total Documents Processed">0</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2016-11-16 20:08:42</str>
    <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>
    <str name="Committed">2016-11-16 20:08:44</str>
    <str name="Time taken">0:0:1.448</str>
</lst>
</response>

The problem is that 230750 rows were fetched, but no documents were added.

Here is my dataconfig in osm-dih.xml:

<dataConfig>
  <dataSource name="mysql"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/osm"
              user="osm"
              password="Start123"/>
  <document>
    <entity name="way" dataSource="mysql" query="select way_id, way_tags from osm_way">
      <field column="way_id" name="osm_id"/>
      <field column="way_tags" name="way_tags"/>
    </entity>
  </document>
</dataConfig>

Pretty basic entity.

Since the number of rows fetched in the status matches the number of rows in the database, I'm assuming that the database connection and the query are working.

From searching, I see that some people have additional attributes in a schema.xml file, but I didn't see anything like that in the Solr DIH examples or on the official Solr cwiki page for DIH. I am guessing that this differs between Solr versions.

Does anyone know why rows are being fetched but no documents are being added?


Edit 1: I ran the import with verbose debugging, and here is the beginning of the raw debug response:

{
  "responseHeader": {
    "status": 0,
    "QTime": 1495
  },
  "initArgs": [
    "defaults",
    [
      "config",
      "osm-dih.xml"
    ]
  ],
  "command": "full-import",
  "mode": "debug",
  "documents": [],
  "verbose-output": [
    "entity:way",
    [
      "document#1",
      [
        "query",
        "select way_id, way_tags from osm_way",
        "time-taken",
        "0:0:1.382",
        null,
        "----------- row #1-------------",
        "way_tags",
        "{\"name\": \"Mount Royal\", \"lanes\": \"2\", \"highway\": \"tertiary\"}",
        "way_id",
        2627409,
        null,
        "---------------------------------------------"
      ],
      "document#1",
      [
        null,
        "----------- row #1-------------",
        "way_tags",
        "{\"name\": \"Longfellow\", \"lanes\": \"2\", \"highway\": \"residential\", \"surface\": \"asphalt\"}",
        "way_id",
        2627414,
        null,
        "---------------------------------------------"
      ],
...

It looks like every row is being processed as document#1, but the documents array is empty.

David Kaczynski
  • Please show us your schema.xml – Oyeme Nov 17 '16 at 11:29
  • @Oyeme I found the problem, and you were definitely on the right trail with the schema. In the version of Solr that I'm using, the schema is stored in a file called `managed-schema` (no .xml suffix), which was my point of confusion. I was able to add the fields to my schema by editing `managed-schema` or through the admin UI, and then my data was imported successfully. – David Kaczynski Nov 18 '16 at 20:40
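
For later readers, here is a minimal sketch of what the fix described in the comment above might look like. The field names `osm_id` and `way_tags` come from the DIH config in the question. The field types (`long`, `text_general`) and the `indexed`/`stored` settings are assumptions and must match types that are actually defined in the core's schema. This is not necessarily what the asker added.

```xml
<!-- Hypothetical additions to the core's managed-schema file.
     Field names match the name="..." attributes in osm-dih.xml;
     the types below are assumptions, so adjust them to types your schema defines. -->
<field name="osm_id" type="long" indexed="true" stored="true"/>
<field name="way_tags" type="text_general" indexed="true" stored="true"/>
```

After changing the schema, the core most likely needs to be reloaded before full-import is run again. Solr generally discourages hand-editing `managed-schema` while it is running, so adding the fields through the admin UI or the Schema API may be the safer route.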
