Here is the query in PyMongo:

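import mong  # just my library for initializing
collection_1 = mong.init(collect="col_1")
collection_2 = mong.init(collect="col_2")

for name in collection_2.find({"field1": {"$exists": 0}}):
    try:
        to_query = name['something']
        # look up the matching document in collection_1 and take its _id
        actual_id = collection_1.find_one({"something": to_query})['_id']
        crap_id = name['_id']
        # store that _id on the collection_2 document currently being visited
        collection_2.update({"_id": crap_id}, {"$set": {"new_name": actual_id}}, upsert=True)
    except Exception:
        # no match found (or the update failed): log the document and move on
        open('couldn_find_id.txt', 'a').write(str(name) + '\n')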

All this does is take a field from each document in one collection, find the _id of the document with the same field value in the other collection, and store that _id on the first document. It works for about 1000-5000 iterations, but periodically fails with the error below, and then I have to restart the script.

Traceback (most recent call last):
  File "my_query.py", line 6, in <module>
    for name in collection_2.find({"field1":{"$exists":0}}):
  File "/home/user/python_mods/pymongo/pymongo/cursor.py", line 814, in next
    if len(self.__data) or self._refresh():
  File "/home/user/python_mods/pymongo/pymongo/cursor.py", line 776, in _refresh
    limit, self.__id))
  File "/home/user/python_mods/pymongo/pymongo/cursor.py", line 720, in __send_message
    self.__uuid_subtype)
  File "/home/user/python_mods/pymongo/pymongo/helpers.py", line 98, in _unpack_response
    cursor_id)
pymongo.errors.OperationFailure: cursor id '7578200897189065658' not valid at server
^C
bye

Does anyone have any idea what this failure is, and how I can catch it so that my script continues even when it occurs?

Thanks

alecxe
jwillis0720

1 Answer

The cause of the problem is described in PyMongo's FAQ:

Cursors in MongoDB can timeout on the server if they’ve been open for a long time without any operations being performed on them. This can lead to an OperationFailure exception being raised when attempting to iterate the cursor.
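As for continuing after the failure: the traceback shows the OperationFailure being raised by the for statement itself, outside the try block in the loop body, which is why the existing except never sees it. A rough sketch of one way to recover is to wrap the iteration and reopen the cursor after the last document seen. The helper name iterate_with_resume and its parameters are made up for illustration; it assumes the query does not already constrain _id and that processing documents in _id order is acceptable.

```python
def iterate_with_resume(collection, query, retry_on, max_restarts=5):
    """Yield documents in _id order, reopening the cursor if it dies.

    Hypothetical helper: `retry_on` is the exception type to recover from,
    e.g. pymongo.errors.OperationFailure. Assumes `query` has no "_id" key.
    """
    last_id = None
    restarts = 0
    while True:
        resume_query = dict(query)
        if last_id is not None:
            # skip everything that was already handed to the caller
            resume_query["_id"] = {"$gt": last_id}
        cursor = collection.find(resume_query).sort("_id", 1)
        try:
            for doc in cursor:
                last_id = doc["_id"]
                yield doc
            return  # cursor exhausted normally
        except retry_on:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up instead of looping forever
        finally:
            cursor.close()
```

With the question's code, the loop header would become something like `for name in iterate_with_resume(collection_2, {"field1": {"$exists": 0}}, pymongo.errors.OperationFailure):`, leaving the loop body unchanged.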

The cursor times out because of the timeout argument of collection.find(), which defaults to True:

timeout (optional): if True (the default), any returned cursor is closed by the server after 10 minutes of inactivity. If set to False, the returned cursor will never time out on the server. Care should be taken to ensure that cursors with timeout turned off are properly closed.

Passing timeout=False to find() should fix the problem:

for name in collection_2.find({"field1":{"$exists":0}}, timeout=False):

But be sure to close the cursor yourself once you are done with it, since the server will no longer do so after a period of inactivity.
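A minimal sketch of what closing the cursor properly could look like, using try/finally so that close() runs even if the loop body raises. The helper name for_each_document and its parameters are invented for this example; extra keyword arguments are simply forwarded to find(). Note that timeout=False is the PyMongo 2.x spelling; PyMongo 3 and later replaced it with no_cursor_timeout=True.

```python
def for_each_document(collection, query, handle, **find_kwargs):
    """Call handle(doc) for every match, always closing the cursor.

    Hypothetical helper: `find_kwargs` is passed straight to find(),
    e.g. timeout=False (PyMongo 2.x) or no_cursor_timeout=True (3+).
    """
    cursor = collection.find(query, **find_kwargs)
    try:
        for doc in cursor:
            handle(doc)
    finally:
        # without a server-side timeout, nothing else will free the cursor
        cursor.close()
```

For the loop in the question, this might be called as `for_each_document(collection_2, {"field1": {"$exists": 0}}, process, timeout=False)`, where process is a function holding the body of the original loop.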

Community
alecxe