11

We use Sunspot Solr for indexing and searching in our Ruby on Rails application.

We wanted to reindex a few objects, but someone accidentally ran Product.reindex from the Rails console. As a result, indexing of all products started from scratch, and our catalogue appeared empty while the reindex was running.

Since we have a vast amount of data, the reindexing has taken three days so far. When I checked on its progress this morning, it looked like one corrupt data entry had caused the reindexing to stop before it completed.

I cannot restart the entire Product.reindex operation, as it takes far too long. Is there a way to run reindexing only on selected products? I want to select a range of products that aren't indexed and then run indexing on just those. How can I add a single product to the index without having to run a complete reindex of the entire data set?

Stanley
  • When you say "How can I add a single product to the index without...", do you mean a single column/field or a subset of documents? – user1452132 Jun 29 '12 at 19:50

2 Answers

15

Sunspot indexes an object in its save callback, so you could save each object, but that might trigger other callbacks too. A more precise way is to index the objects directly:

Sunspot.index [post1, post2]
Sunspot.commit

or, to index and commit in a single call:

Sunspot.index! [post1, post2]

You can also pass in an association, since it behaves like an array:

Sunspot.index! post1.comments
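
For a large selection, it may help to index in slices and keep going when one record is corrupt. The following is only a sketch: `index_in_batches`, its `batch_size` default, and the `indexer` parameter are hypothetical names introduced here, not part of Sunspot. It relies only on `Sunspot.index` and `Sunspot.commit` as used above.

```ruby
# Hypothetical helper (not part of Sunspot): index records in slices and
# collect the ones that fail instead of aborting the whole run.
def index_in_batches(records, batch_size: 500, indexer: nil)
  # Sunspot is looked up only when no indexer is passed in, so the helper
  # can also be exercised with a stand-in object.
  indexer ||= Sunspot
  failed = []

  records.each_slice(batch_size) do |batch|
    begin
      indexer.index(batch)
    rescue StandardError
      # Retry one by one so a single bad record does not sink the batch.
      batch.each do |record|
        begin
          indexer.index(record)
        rescue StandardError
          failed << record
        end
      end
    end
  end

  indexer.commit
  failed
end

# Possible use in a Rails console (the created_at filter is an assumption):
#   failed = index_in_batches(Product.where("created_at >= ?", Date.new(2012, 1, 1)).to_a)
```

The returned array holds the records that could not be indexed, so they can be inspected separately. For very large selections, loading everything with `to_a` may be too heavy, and ActiveRecord's `find_in_batches` could feed the slices instead.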
s01ipsist
7

I found the answer at https://github.com/sunspot/sunspot#reindexing-objects

Whenever an object is saved, it is automatically reindexed as part of the save callbacks. So all that was needed was to add all the objects that needed reindexing to an array and then loop through the array, calling save on each object. This successfully updated the required objects in the index.

Stanley
  • How did you know which ones had not been indexed yet? – kidbrax Jul 04 '12 at 15:13
  • We did a few manual spot-checks. We knew that the reindex crashed sometime after doing products from 2011, so we manually checked some of our products from 2012. Then we did queries in Rails console to construct an array containing these products and saved them again, triggering the callbacks. – Stanley Jul 04 '12 at 16:06
  • If reindexing is taking this long it's possible that you're doing it naively, without taking into consideration any associations you're using in the search definitions. This is how the built-in rake task works, and it's very slow. The reindex command can take ActiveRecord includes though, allowing for far greater efficiency. I took a full index down from 15 mins to 15 seconds. Try this syntax: ```Book.solr_reindex(:batch_size => 1000, :include => [:author, {:chapters => :paragraphs}])``` Also see if you're needlessly allowing partial word searches, which really bulk up the index. – A Fader Darkly Feb 20 '15 at 10:52
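
On the question raised in the comments about knowing which records are not indexed yet, one alternative to manual spot-checks is to compare database ids with the ids Solr returns. This is a sketch under assumptions: `unindexed_ids` is a hypothetical helper, and the console wiring in the comments has not been verified against this application.

```ruby
require "set"

# Hypothetical helper: given every primary key in the database and the keys
# already present in the index, return the keys that are still missing.
def unindexed_ids(db_ids, indexed_ids)
  # Compare as strings, since search hits may report keys as strings.
  indexed = indexed_ids.map(&:to_s).to_set
  db_ids.reject { |id| indexed.include?(id.to_s) }
end

# Possible wiring in a Rails console (assumptions; a large index would need
# to be paged through rather than read from a single page):
#   db_ids      = Product.pluck(:id)
#   indexed_ids = Product.search { paginate page: 1, per_page: 1000 }.hits.map(&:primary_key)
#   Sunspot.index!(Product.where(id: unindexed_ids(db_ids, indexed_ids)).to_a)
```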