
I have the following steps:

  1. Update record in database
  2. Add record to solr using json
  3. Commit record in database

I insert the record using an update/json call with ?commit=true, but this step takes a long time. Is there a better way to keep these in sync? The record needs to be stored in Solr, but I don't mind if it isn't available for search immediately.
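For reference, a minimal sketch of the per-record approach described above, assuming a local Solr core named mycore and the Python requests library (both are assumptions, not details from the question). Every add carries commit=true, which is what makes step 2 slow:

    import requests

    # Hypothetical core name and URL; adjust to your Solr setup.
    SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update/json"

    def index_record(record: dict) -> None:
        # Hard commit on every single add -- this is the slow path.
        resp = requests.post(
            SOLR_UPDATE_URL,
            params={"commit": "true"},
            json=[record],  # the JSON update handler accepts a list of documents
        )
        resp.raise_for_status()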

broersa

3 Answers


Commits are expensive. Do not commit after every add. You can commit once every X requests (where X depends on your latency requirements and number of writes per second) or issue a separate commit every X minutes (with /update?commit=true).
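As a sketch of that suggestion (again assuming Python with the requests library and a hypothetical core named mycore), you can buffer adds and send one commit per batch instead of one per document:

    import requests

    SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update/json"  # hypothetical
    BATCH_SIZE = 500  # tune X to your latency requirements and write rate

    _buffer = []

    def add_record(record: dict) -> None:
        # Buffer locally; nothing is sent (or committed) per record.
        _buffer.append(record)
        if len(_buffer) >= BATCH_SIZE:
            flush()

    def flush() -> None:
        # One request for the whole batch, with a single commit at the end.
        global _buffer
        if not _buffer:
            return
        resp = requests.post(SOLR_UPDATE_URL, params={"commit": "true"}, json=_buffer)
        resp.raise_for_status()
        _buffer = []

A periodic timer that calls flush() would cover the "commit every X minutes" variant.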

arun

There are two aspects:

  • keeping your database and Solr in sync
  • making it fast

To keep them in sync reliably, you'll need to do some form of a two-phase commit.

To do it fast, you should do it in batches, as arun suggests in the other answer and as suggested in the SolrJ documentation. This is especially true if you don't need the documents available for search immediately.

You could also try soft commits, which are less expensive than hard commits. See "commit" and "optimize" in the Solr documentation. The URL would then end with /update?softCommit=true. There's a nice discussion of soft and hard commits in this article: Understanding Transaction Logs, Soft Commit and Commit in SolrCloud.
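As an illustration (same hypothetical requests-based setup and core name as above), only the commit parameter changes for a soft commit:

    import requests

    SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update/json"  # hypothetical

    def index_record_soft(record: dict) -> None:
        # softCommit makes the document searchable without flushing new index
        # segments to disk; a hard commit still has to happen eventually
        # (for example via autoCommit in solrconfig.xml).
        resp = requests.post(
            SOLR_UPDATE_URL,
            params={"softCommit": "true"},
            json=[record],
        )
        resp.raise_for_status()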

Jakub Kotowski

I solved the issue by using ?commitWithin=15000. This persists the data but does not merge it into the index right away; Solr commits it within 15 seconds, which is enough to avoid blocking my process. Loading 100,000 records went from days to a few hours.
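In the same hypothetical Python/requests setup as the sketches above, that looks like:

    import requests

    SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update/json"  # hypothetical

    def index_record(record: dict) -> None:
        # commitWithin asks Solr to commit this document within 15 seconds,
        # so the add returns immediately instead of waiting on a commit.
        resp = requests.post(
            SOLR_UPDATE_URL,
            params={"commitWithin": "15000"},  # milliseconds
            json=[record],
        )
        resp.raise_for_status()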

broersa