I'm using Solr 4 and am confused about updating existing data in an index.

According to the DataImportHandler Wiki:

"delta-import : For incremental imports and change detection run the command `http://<host>:<port>/solr/dataimport?command=delta-import`. It supports the same clean, commit, optimize and debug parameters as full-import command."

I know delta-import will find new data in the database and insert it into the index. My problem is how it handles updates: a record exists in both the index and the database, the database record is changed, and I want to incorporate those changes into the existing record in the index. In other words, I don't want to insert it again.

I've tried this and wound up with two records with different keys in the index. The first contains the original database values found when the index was created; the second contains the database values after the record was changed.

Greetings. I have a SolrJ client for fetching data from a database, and I am using delta-import to fetch the data. If a column is changed in the database (tracked using a timestamp), delta-import indexes the latest value, but the index also keeps duplicate entries for the same record holding the older data. Cleaning the index before importing avoids this, but I want to update the index without cleaning it. Is there a way to update the index with just the changed column, without creating duplicates? I'd appreciate any feedback.
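
One common cause of this symptom is a unique key whose value changes between imports. Solr replaces an existing document only when the incoming document carries the same uniqueKey value; otherwise it adds a new one. The following is a toy Python model of that behaviour, not Solr or DIH code. The class `ToyIndex`, the helper functions, and the row fields `db_id` and `title` are all hypothetical names chosen for illustration.

```python
import uuid


class ToyIndex:
    """Toy stand-in for a Solr core: documents are stored by their uniqueKey value."""

    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = {}

    def add(self, doc):
        # Same key value: the old document is replaced.
        # New key value: an additional document is stored.
        self.docs[doc[self.unique_key]] = doc


def doc_with_generated_key(row):
    # Assumption: a fresh key is generated on every import.
    return {"id": str(uuid.uuid4()), **row}


def doc_with_stable_key(row):
    # Assumption: the key is derived from the row's own identity.
    return {"id": str(row["db_id"]), **row}


# The same database row before and after a change.
row_v1 = {"db_id": 7, "title": "old title"}
row_v2 = {"db_id": 7, "title": "new title"}

generated = ToyIndex()
generated.add(doc_with_generated_key(row_v1))
generated.add(doc_with_generated_key(row_v2))

stable = ToyIndex()
stable.add(doc_with_stable_key(row_v1))
stable.add(doc_with_stable_key(row_v2))

print(len(generated.docs))         # 2: old and new versions both remain
print(len(stable.docs))            # 1: the new version replaced the old one
print(stable.docs["7"]["title"])   # new title
```

With a generated key the changed row lands next to the stale one, which matches the duplicates described above. With a key derived from the row itself, the second import overwrites the first.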

Phuc Thai
Farid CH
  • The document must have a unique key. Is there one in your case? Any id which is unique? What's the <uniqueKey> in the schema.xml? – Abhijit Bashetti Apr 28 '15 at 07:53
  • Yes, I have a unique id. In my schema.xml I have <uniqueKey>id</uniqueKey> – Farid CH Apr 28 '15 at 08:08
  • The problem is that when a column is changed in the database, delta-import (using a timestamp) gives me two records with different unique ids, because the unique id is automatically generated – Farid CH Apr 28 '15 at 08:16
  • Then you have chosen a bad primary key. The one you have picked is a technical one. You need a key that reflects the identity of a record. It may not be the unique key in your RDBMS, but it is the thing you use to find the record. – cheffe Apr 28 '15 at 08:53
  • Have a look here http://stackoverflow.com/questions/1945752/solr-dih-delta-import-with-compound-primary-keys – cheffe Apr 28 '15 at 08:53
  • I have multiple tables with identical schema (table_a, table_b, table_c, ...) and I am trying to create one big index with the data from each of these tables in one data-config.xml file, so I cannot use a primary key from my RDBMS as the unique id. – Farid CH Apr 28 '15 at 09:03
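
One way to reconcile the last two comments is a compound key: combine the table name with the row's primary key, so the value is unique across tables yet stays the same when the row is updated. The Python sketch below only illustrates the idea. The function `composite_key`, the `-` separator, and the sample rows are hypothetical, and in a DIH setup the same string would presumably be built in each entity's SQL query rather than in Python.

```python
def composite_key(table, primary_key):
    """Build a key that is unique across tables and stable across updates."""
    return f"{table}-{primary_key}"


# Hypothetical rows: (table name, primary key within that table, title).
first_import = [
    ("table_a", 1, "a1 original"),
    ("table_b", 1, "b1 original"),  # same primary key, different table
]
delta_import = [
    ("table_a", 1, "a1 changed"),   # the table_a row after an update
]

# A dict keyed by the compound key stands in for the index.
index = {}
for table, pk, title in first_import + delta_import:
    index[composite_key(table, pk)] = {"table": table, "pk": pk, "title": title}

print(sorted(index))                # ['table_a-1', 'table_b-1']
print(index["table_a-1"]["title"])  # a1 changed
```

Rows from different tables no longer collide even though both have primary key 1, and the updated row replaces its earlier version instead of appearing twice.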

0 Answers