10

I have tried the following command in pymongo:

records= db.collection_name.find({"gender":"female"}).batch_size(5)

but after a few iterations it gives:

pymongo.errors.CursorNotFound: Cursor not found, cursor id: 61593385827.

Also, if I try timeout=False in the same command, i.e.

records= db.collection_name.find({"gender":"female"},timeout=False).batch_size(5) 

it gives

TypeError: __init__() got an unexpected keyword argument 'timeout'
Remi Guan
Mrunmayee
  • Possible duplicate of [MongoDB - Error: getMore command failed: Cursor not found](https://stackoverflow.com/questions/44248108/mongodb-error-getmore-command-failed-cursor-not-found) – vovchisko Jun 07 '18 at 18:56

4 Answers

10

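In PyMongo 3.0 and later, the timeout parameter was replaced by no_cursor_timeout, which is why timeout=False raises a TypeError. Try setting no_cursor_timeout=True in the query, like so: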

records= db.collection_name.find({"gender":"female"}, no_cursor_timeout=True).batch_size(5)
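A cursor opened with no_cursor_timeout=True is not cleaned up by the server's idle-cursor timeout, so it is worth closing it explicitly once you are done. Below is a minimal sketch of one way to do that. The helper name iter_no_timeout is hypothetical (it is not part of PyMongo); it only assumes that the collection object offers find(..., no_cursor_timeout=True), and that the returned cursor offers batch_size() and close().

```python
from contextlib import closing


def iter_no_timeout(collection, query, batch_size=5):
    """Yield documents matching `query`, always closing the cursor.

    `collection` is expected to be a pymongo Collection. This helper
    is a hypothetical wrapper, not a PyMongo API.
    """
    cursor = collection.find(query, no_cursor_timeout=True).batch_size(batch_size)
    # closing() calls cursor.close() when the block exits, including when
    # the loop body raises or the generator is closed early.
    with closing(cursor):
        for doc in cursor:
            yield doc
```

It could then be used as `for doc in iter_no_timeout(db.collection_name, {"gender": "female"}): ...`.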

Andrew
7

Setting timeout=False is a very bad practice. A better way to get rid of the cursor timeout exception is to estimate how many documents your loop can process within 10 minutes and choose a conservative batch size. That way, the MongoDB client (in this case, PyMongo) has to query the server again each time the documents in the previous batch are used up. This keeps the cursor active on the server, and you are still covered by the 10-minute timeout protection.

Here is how you set batch size for a cursor:

for doc in coll.find().batch_size(30):
    do_time_consuming_things()
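To make the "estimate a conservative batch size" step concrete, here is a rough sketch. The function name, the 50% safety factor, and the idea of measuring the average processing time per document yourself are all assumptions for illustration; only the 10-minute (600-second) timeout comes from the answer above.

```python
import math


def conservative_batch_size(seconds_per_doc, cursor_timeout_seconds=600, safety_factor=0.5):
    """Return a batch size that should be processed well within the timeout.

    seconds_per_doc: measured average processing time for one document.
    safety_factor: fraction of the timeout we allow one batch to use
                   (0.5 is an arbitrary, cautious choice).
    """
    if seconds_per_doc <= 0:
        raise ValueError("seconds_per_doc must be positive")
    budget = cursor_timeout_seconds * safety_factor
    # Never go below 1 document per batch.
    return max(1, math.floor(budget / seconds_per_doc))
```

For example, at roughly 2 seconds per document this gives `conservative_batch_size(2.0) == 150`, which could be passed as `coll.find().batch_size(150)`.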
Derek Chia
1

Please show more of your code. I suspect your cursor has simply expired.

As described in the MongoDB manual:

By default, the server will automatically close the cursor after 10 minutes of inactivity or if client has exhausted the cursor.

This means that after you have created the cursor records and exhausted it by iterating over it once, for example with

mylist = [ i for i in records]

your records cursor no longer exists.
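If the results are needed more than once, one option is to materialize them a single time and reuse the list. The sketch below uses a plain Python iterator as a stand-in for the cursor (an assumption, so that it runs without a server); load_once is a hypothetical helper name.

```python
def load_once(records):
    """Materialize a single-pass iterable so the results can be reused."""
    return list(records)


# Stand-in for a cursor: like a cursor, an iterator can be consumed only once.
records = iter([{"gender": "female"}, {"gender": "female"}])

mylist = load_once(records)   # consumes the iterator
leftover = list(records)      # nothing is left on a second pass

# mylist can be iterated as many times as needed.
first_pass = [doc["gender"] for doc in mylist]
second_pass = [doc["gender"] for doc in mylist]
```

Note that this trades memory for convenience, so it only suits result sets that fit in RAM.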

See also this and this questions

Community
lanenok
1

As of 2021, the answer is:

  • old: no_cursor_timeout=True works

  • new: no_cursor_timeout=True alone no longer works; instead you should

    • explicitly create a session, and make all operations (get db, get collection, find, etc.) use that session
    • then periodically refresh the session (to keep it alive so that it does not expire)
      • example code
import logging
from datetime import datetime
import pymongo

mongoClient = pymongo.MongoClient('mongodb://127.0.0.1:27017/your_db_name')

# Refresh the session once every 8 minutes.
#   Note: this should be less than 30 minutes, the default MongoDB session timeout:
#       https://docs.mongodb.com/v5.0/reference/method/cursor.noCursorTimeout/
RefreshSessionPerSeconds = 8 * 60

def mergeHistorResultToNewCollection():

    mongoSession = mongoClient.start_session() # <pymongo.client_session.ClientSession object at 0x1081c5c70>
    mongoSessionId = mongoSession.session_id # {'id': Binary(b'\xbf\xd8\xd...1\xbb', 4)}

    mongoDb = mongoSession.client["your_db_name"] # Database(MongoClient(host=['127.0.0.1:27017'], document_class=dict, tz_aware=False, connect=True), 'your_db_name')
    mongoCollectionOld = mongoDb["collecion_old"]
    mongoCollectionNew = mongoDb['collecion_new']

    # historyAllResultCursor = mongoCollectionOld.find(session=mongoSession)
    historyAllResultCursor = mongoCollectionOld.find(no_cursor_timeout=True, session=mongoSession)

    lastUpdateTime = datetime.now() # datetime.datetime(2021, 8, 30, 10, 57, 14, 579328)
    for curIdx, oldHistoryResult in enumerate(historyAllResultCursor):
        curTime = datetime.now() # datetime.datetime(2021, 8, 30, 10, 57, 25, 110374)
        elapsedTime = curTime - lastUpdateTime # datetime.timedelta(seconds=10, microseconds=531046)
        elapsedTimeSeconds = elapsedTime.total_seconds() # 10.531046
        isShouldUpdateSession = elapsedTimeSeconds > RefreshSessionPerSeconds
        # if (curIdx % RefreshSessionPerNum) == 0:
        if isShouldUpdateSession:
            lastUpdateTime = curTime
            cmdResp = mongoDb.command("refreshSessions", [mongoSessionId], session=mongoSession)
            logging.info("Called refreshSessions command, resp=%s", cmdResp)
        
        # do what you want

        existedNewResult = mongoCollectionNew.find_one({"shortLink": "http://xxx"}, session=mongoSession)

    # mongoSession.close()
    mongoSession.end_session()

For details, please refer to the answer in another post.
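The time-based refresh logic in the loop above can also be factored into a small reusable generator. This is only a sketch: iter_with_refresh and its parameters are hypothetical names, and the clock argument exists purely so the timing can be tested without waiting.

```python
import time


def iter_with_refresh(items, refresh, interval_seconds=8 * 60, clock=time.monotonic):
    """Yield each item, calling refresh() whenever interval_seconds have elapsed.

    items:   any iterable, e.g. a cursor opened within a session.
    refresh: zero-argument callable that keeps the session alive.
    clock:   function returning the current time in seconds.
    """
    last_refresh = clock()
    for item in items:
        now = clock()
        if now - last_refresh > interval_seconds:
            refresh()
            last_refresh = now
        yield item
```

With the names from the example above, the refresh callable could be `lambda: mongoDb.command("refreshSessions", [mongoSessionId], session=mongoSession)`, and the loop becomes `for oldHistoryResult in iter_with_refresh(historyAllResultCursor, refresh): ...`.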

crifan