I know I can set `search_auto_update = False` on the model in question, but I don't want to turn off indexing entirely.

https://docs.wagtail.org/en/v2.13.5/topics/search/indexing.html#disabling-auto-update-signal-handlers-for-a-model

I have a command which bulk syncs data from an API. Currently, every save also triggers an index update, which is inefficient and slow. It would make a lot more sense to disable indexing during the sync and then bulk-index the items at the end.

Is this possible? I tried setting `search_auto_update` as a key/value on the model instance before saving, but it didn't seem to do anything (it looks like it needs to be an attribute on the class rather than a value on a model instance).

Nick
  • Do you turn off Refresh and set replica to 1 during indexation? – LeBigCat Jul 23 '22 at 08:33
  • @LeBigCat - not entirely sure what you mean here... In our case, there is no replication on the Elastic index; it's just a single node. – Nick Jul 24 '22 at 01:16

1 Answer

The search indexing on save is done via signals, so I think this SO answer on how to temporarily disable signals should work. In short, use FactoryBoy's `mute_signals` decorator.
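Muting a whole signal also silences every other receiver attached to it. A narrower variant of the same idea is to disconnect a single receiver for a single sender while the sync runs and reconnect it afterwards. The following is only a minimal sketch: `receiver_disconnected` is a hypothetical helper, not part of Django, Wagtail or FactoryBoy, and it assumes the signal object exposes `connect()` and `disconnect()` the way `django.dispatch.Signal` does. The Wagtail names in the commented usage assume the indexing handler is connected per model, as the `signal_handlers` module appears to do; `MyModel` and `run_sync` are placeholders.

```python
from contextlib import contextmanager


@contextmanager
def receiver_disconnected(signal, receiver, sender):
    """Temporarily disconnect one receiver from a Django-style signal.

    Hypothetical helper: `signal` only needs connect()/disconnect()
    methods that accept a receiver and a `sender` keyword, as
    django.dispatch.Signal does.
    """
    # disconnect() reports whether a matching receiver was actually removed.
    was_connected = signal.disconnect(receiver, sender=sender)
    try:
        yield was_connected
    finally:
        # Reconnect only what we removed, even if the body raised.
        if was_connected:
            signal.connect(receiver, sender=sender)


# Assumed usage inside the sync command (names below are assumptions):
#
#   from django.db.models.signals import post_save
#   from wagtail.search.signal_handlers import post_save_signal_handler
#
#   with receiver_disconnected(post_save, post_save_signal_handler, MyModel):
#       run_sync()
```

Other `post_save` receivers keep firing inside the `with` block; only the one receiver passed in is detached, and only for that sender.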

cnk
  • Oh - nice! So you can just mute specific signals... handy! `@mute_signals(signals.pre_save, signals.post_save)` – Nick Jul 24 '22 at 01:17
  • Hmm - having thought about this a bit more, won't this disable *all* post_save signals? I don't see any specific signal for search indexing... https://github.com/wagtail/wagtail/blob/v3.0.1/wagtail/search/signal_handlers.py – Nick Jul 27 '22 at 14:39
  • Yes, it looks like you would need to disable all post_save hooks. So if you need some, but not all, signals to fire, this would not work. My suggestion would be to mute all of them during the bulk sync, and then have a final step that runs the `update_index` management command PLUS commands to do any other processing that the other post_save signals would have done. – cnk Jul 27 '22 at 18:38
  • I've ended up basically doing this, yes... I've gone with `search_auto_update = False` in the model to stop it updating on save. During sync I track the IDs which are created/updated and then I have manually replicated some of the update_index command to bulk index those items in chunks... I might look into leveraging the `update_index` command, but that would do *everything*, not just the "changed" items? AFAIK Wagtail has no "tracking" table to mark items as "stale"? – Nick Jul 28 '22 at 11:03
  • Correct. `update_index` would reindex everything. I haven't had a need for updating just changed objects but I would love to be able to reindex just one content type (the one I create as an import from elsewhere). – cnk Jul 28 '22 at 23:48
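
The "track the changed IDs, then index them in chunks" approach described in the comments can be sketched roughly as below. This is not Nick's actual code: `chunked` and `index_changed` are hypothetical helpers, and the fetching and indexing steps are passed in as callables so that the sketch makes no claims about the search backend. The Wagtail calls in the commented usage are an assumption based on what the `update_index` command appears to do, and `MyModel` is a placeholder.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def index_changed(changed_ids, fetch, add_items, chunk_size=1000):
    """Index only the tracked IDs, one chunk at a time.

    `fetch` turns a list of IDs into objects and `add_items` sends one
    batch to the search index. Returns the number of objects indexed.
    """
    total = 0
    # sorted(set(...)) drops IDs that were touched more than once.
    for id_chunk in chunked(sorted(set(changed_ids)), chunk_size):
        objects = list(fetch(id_chunk))
        add_items(objects)
        total += len(objects)
    return total


# Assumed wiring for Wagtail (an assumption, not confirmed by the thread):
#
#   from wagtail.search.backends import get_search_backend
#
#   index = get_search_backend().get_index_for_model(MyModel)
#   index_changed(
#       changed_ids,
#       fetch=lambda ids: MyModel.objects.filter(pk__in=ids),
#       add_items=lambda objs: index.add_items(MyModel, objs),
#   )
```

Keeping the chunk size bounded limits both the size of each `pk__in` query and the size of each bulk request sent to the search backend.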