
I have a problem deleting documents from Amazon CloudSearch.

When I send a document for deletion, I receive this response:

{"status": "success", "adds": 0, "deletes": 5}

However, the video then stays in the index with all of its fields reset to their default values instead of being deleted.

The documentation does not make clear whether this is the normal behaviour or a bug.

Has anyone else experienced this?

Lothar

2 Answers


This surprised me too, but it appears to be normal behavior. The 'deleted' documents aren't supposed to be searchable anymore since their fields are all null, so in most cases they shouldn't cause any problems.

The problem I have with this is that they can be returned if you search for something like "-zomgwtfbbq", since they don't contain the term "zomgwtfbbq".

It is also confusing since it makes your dashboard show one count (the "searchable" documents) but if you run a test search for -zomgwtfbbq (what I have been using as a proxy for "get all documents"), you get a different number. Took me a while to figure out why.

Despite what they say about setting the version to max uint32 "permanently removing" the document, it will still be there. The problem is that they consider these documents unsearchable, but they're not.

alexroussos
  • Yes, with negative searches it is easier to retrieve these documents. We did make a workaround but the documentation should be more clear about this behaviour. – Lothar Dec 12 '13 at 10:48
  • @Lothar What was your workaround? I was thinking of adding a field named 'searchable', which I would always set to true when submitting the document, and then, behind the scenes, I would append searchable=true to all searches. But it also feels like an edge case for a user to search for _only_ the negative search term so it's probably not worth designing around. – alexroussos Dec 17 '13 at 15:52
  • For a workaround, see http://stackoverflow.com/questions/14566522/how-can-i-retrieve-all-searchable-not-deleted-documents-in-amazon-cloudsearch – larham1 Jan 13 '14 at 05:32
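
To make the behaviour discussed in this answer and its comments concrete, here is a toy in-memory model. It is not the CloudSearch API and makes no claim about how CloudSearch is implemented; it only assumes what the thread reports, namely that a 'deleted' document keeps its ID while its fields are blanked. The document IDs, the titles, and the `negative_search` helper are made up for illustration, and the `searchable` flag follows the idea floated in the comment above.

```python
from typing import List

# Toy index, NOT CloudSearch. A "deleted" document keeps its id but has
# every field blanked, which is the behaviour reported in the question.
index = {
    "video-1": {"title": "cats on skateboards", "searchable": True},
    "video-2": {"title": "zomgwtfbbq compilation", "searchable": True},
    "video-3": {"title": None, "searchable": None},  # blanked by a delete
}


def negative_search(term: str, require_searchable: bool = False) -> List[str]:
    """Return ids of documents whose title does NOT contain `term`.

    With require_searchable=True, documents whose hypothetical
    `searchable` flag is not True are filtered out as well.
    """
    hits = []
    for doc_id, fields in index.items():
        title = fields["title"] or ""  # a blanked title behaves like ""
        if term in title:
            continue  # the document contains the term, so "-term" excludes it
        if require_searchable and fields["searchable"] is not True:
            continue  # blanked documents lost the flag along with everything else
        hits.append(doc_id)
    return sorted(hits)


print(negative_search("zomgwtfbbq"))
print(negative_search("zomgwtfbbq", require_searchable=True))
```

In this model the plain negative search returns the blanked document because an empty title trivially does not contain the term, while the extra flag excludes it. Whether the same filter is worth adding to real queries depends on how often users issue negative-only searches.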

Are you specifying the version number when you delete the document?

When deleting documents, note that deleting version max(uint32_t) will permanently remove the document from your domain. Because it is not possible to specify a higher version number, there is no way to add a later version of the document.

http://docs.aws.amazon.com/cloudsearch/latest/developerguide/versioning.html
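
For reference, a minimal sketch of building a delete batch that carries an explicit version. It assumes the JSON document batch format of the version-based (2011-02-01) API, where each delete operation has `type`, `id`, and `version` keys; please check that against the versioning page linked above. The `build_delete_batch` helper and the document ID are hypothetical, and the sketch only builds the payload without sending it anywhere.

```python
import json

MAX_UINT32 = 2**32 - 1  # 4294967295, the highest version that can be specified


def build_delete_batch(doc_ids, version):
    """Build a JSON batch of delete operations (assumed 2011-02-01 format)."""
    if not 0 <= version <= MAX_UINT32:
        raise ValueError("version must fit in an unsigned 32-bit integer")
    operations = [
        {"type": "delete", "id": doc_id, "version": version}
        for doc_id in doc_ids
    ]
    return json.dumps(operations)


# Deleting at the maximum version means no later version can ever be added.
print(build_delete_batch(["video-3"], MAX_UINT32))
```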

E.J. Brennan
  • This answer seems to imply that setting the version number to the max int is advisable and will somehow help with the issue in the original post. However, using max int does not stop deleted IDs from being returned for certain searches. Perhaps the only case where max int would be useful is to maintain a discipline that an ID should not be reinstated after deletion (and beware of predicting the future). – larham1 Jan 13 '14 at 04:28