
I have access to a MongoDB collection in which each document's _id is a string, a URL to be exact.

I want to retrieve the next document in a collection based on the previous _id. I looked around and saw that this is possible when _id is an ObjectID: Finding The Next Document in MongoDb.

My problem is that the database doesn't use ObjectIDs for _id, and there is no other field (e.g. a timestamp) that could be used as an alternative. So how would I retrieve the next document?

Edit: Added collection example

{   "_id": "random.com",
    "name": "Random"
},
{   "_id": "example.com",
    "name": "Example"
},
{   "_id": "stack.com",
    "name": "Stack"
}

If I have the _id "random.com", how do I retrieve the next document, in this case the one with _id "example.com"? I'm using pymongo.

Stanko
  • Pretty much not clear with your question. But if you want to find out the fields of one collection from another, then either you can just directly add the sub document of the collection or you can just give the reference of other collection by using `DBRef`. – Sachin Bahal Feb 27 '17 at 13:34
  • @SachinBahal I've added an example of what I try to accomplish. – Stanko Feb 27 '17 at 13:40
  • why the type of the field is a problem? still you can use `db.coll.find({_id: {"$gt": "example.com"}}).sort({ _id: 1 }).limit(1)` this is how to get the next document based on the _id index. For your case index is alphabetical and you don't get the next inserted. – thanasisp Feb 27 '17 at 14:04
  • @thanasisp I can't find a way to make that work in pymongo. – Stanko Feb 27 '17 at 14:28
  • In what sense is the document with _id 'example.com' the "next" document after the 'random.com' one? They don't follow in alphabetical order, for example. Are you relying on the order they were inserted into the collection perhaps, or the order in which they're stored on disk? – Vince Bowdren Mar 02 '17 at 15:32
  • @VinceBowdren I'm making an application where I do something with the first N documents. For example, I get the name from the first document and then I manually click a button to classify it. Then the next document shows up and I classify again, and so on. When I close the application I'd like to continue from where I finished, so I keep a variable, an integer that says how many documents I've done. So I actually need the document after the N documents that I've already done. – Stanko Mar 02 '17 at 17:37
  • That's not quite what I meant: what I mean is, which document is the "first"? For there to be a 'first', a 'next', and so on, you must be relying on some kind of ordering of the documents - what is that ordering? – Vince Bowdren Mar 02 '17 at 17:39
  • @VinceBowdren Ah ok, they are ordered by insertion. – Stanko Mar 06 '17 at 08:58
  • @Stanko: remember that [natural order is not a reliable guide to order of insertion](http://stackoverflow.com/a/11599283/174843). By default, the documents' order on disk does reflect their order of insertion, but this can be changed by subsequent updates leading to documents being moved, by insertions into gaps left after documents have been moved, and, most drastically, by replication across nodes in a replica set. If you actually want _insertion order_, you can _not rely on natural order_, and you should instead use a field you define yourself such as an insertion timestamp. – Vince Bowdren Mar 06 '17 at 09:33

1 Answer

cursor = db.coll.find({"_id": { "$gt": "the_url"}}).sort("_id").limit(1)
for doc in cursor:
    print(doc['_id'])

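See the sort documentation for how to define the order. Because _id is a string here, the _id index is ordered alphabetically, so this query returns the document with the next _id in alphabetical order.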
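The query above can be wrapped in a small helper. This is only a sketch: the function name `next_by_id` is hypothetical, and `coll` is assumed to be a pymongo collection (for example `db.coll`). It relies on `find_one` accepting the same `sort` option as `find`.

```python
from typing import Any, Optional


def next_by_id(coll: Any, current_id: str) -> Optional[dict]:
    """Return the document whose _id sorts right after current_id, or None.

    `coll` is assumed to be a pymongo Collection; the helper name is
    hypothetical and only illustrates the $gt + sort technique.
    """
    # $gt keeps only _id values that compare greater than current_id;
    # sorting ascending on _id makes the first match the closest one.
    return coll.find_one({"_id": {"$gt": current_id}}, sort=[("_id", 1)])
```

With the sample documents from the question, `next_by_id(coll, "random.com")` would return the "stack.com" document rather than "example.com", because strings are compared alphabetically and not by insertion order.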

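You can also sort by natural order, which usually approximates insertion order, using sort("$natural", 1). Note that the $gt filter on _id still compares strings alphabetically, so the query below returns the first document in natural order among those whose _id sorts after the given URL, which is not necessarily the next one inserted.

cursor = db.coll.find({"_id": {"$gt": "the_url"}}).sort("$natural", 1).limit(1)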
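If, as the asker describes in the comments, you keep a count of how many documents you have already processed, one possible approach is to skip that many documents in natural order. This is a sketch under assumptions: `nth_in_natural_order` and `done` are hypothetical names, `coll` is assumed to be a pymongo collection, and natural order is assumed to match insertion order, which the comments above warn is not guaranteed.

```python
from typing import Any, Optional


def nth_in_natural_order(coll: Any, done: int) -> Optional[dict]:
    """Return the document at zero-based position `done` in natural order.

    `coll` is assumed to be a pymongo Collection and `done` the number of
    documents already processed; both names are hypothetical.
    """
    # "$natural" with direction 1 asks for documents in their natural order;
    # skip(done) jumps past the processed ones, limit(1) keeps only the next.
    cursor = coll.find().sort("$natural", 1).skip(done).limit(1)
    return next(iter(cursor), None)
```

Keep in mind that skip still walks over the skipped documents on the server, so this gets slower as the count grows. A field you define yourself, such as an insertion timestamp, gives a more reliable ordering.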
thanasisp