1

I am connecting MongoDB with solr,

Following this document for integration: https://blog.toadworld.com/2017/02/03/indexing-mongodb-data-in-apache-solr

DB.Collection: solr.wlslog

D:\path to solr\bin>

mongo-connector --unique-key=id -n solr.wlslog -m localhost:27017 -t http://localhost:8983/solr/wlslog -d solr_doc_manager

I am getting below response and error:

2020-06-15 12:15:45,744 [ALWAYS] mongo_connector.connector:50 - Starting mongo-connector version: 3.1.1
2020-06-15 12:15:45,744 [ALWAYS] mongo_connector.connector:50 - Python version: 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)]
2020-06-15 12:15:45,745 [ALWAYS] mongo_connector.connector:50 - Platform: Windows-10-10.0.18362-SP0
2020-06-15 12:15:45,745 [ALWAYS] mongo_connector.connector:50 - pymongo version: 3.10.1
2020-06-15 12:15:45,755 [ALWAYS] mongo_connector.connector:50 - Source MongoDB version: 4.2.2
2020-06-15 12:15:45,755 [ALWAYS] mongo_connector.connector:50 - Target DocManager: mongo_connector.doc_managers.solr_doc_manager version: 0.1.0
2020-06-15 12:15:45,787 [CRITICAL] mongo_connector.oplog_manager:713 - Exception during collection dump
Traceback (most recent call last):
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\doc_managers\solr_doc_manager.py", line 292, in
batch = list(next(cleaned) for i in range(self.chunk_size))
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\oplog_manager.py", line 668, in do_dump
upsert_all(dm)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\oplog_manager.py", line 651, in upsert_all
dm.bulk_upsert(docs_to_dump(from_coll), mapped_ns, long_ts)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\util.py", line 33, in wrapped
return f(*args, **kwargs)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\doc_managers\solr_doc_manager.py", line 292, in bulk_upsert
batch = list(next(cleaned) for i in range(self.chunk_size))
RuntimeError: generator raised StopIteration
2020-06-15 12:15:45,801 [ERROR] mongo_connector.oplog_manager:723 - OplogThread: Failed during dump collection cannot recover! Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, replicaset='rs0'), 'local'), 'oplog.rs')
2020-06-15 12:15:46,782 [ERROR] mongo_connector.connector:408 - MongoConnector: OplogThread <OplogThread(Thread-2, started 4936)> unexpectedly stopped! Shutting down

I searched over in GitHub issues of mongo-connector but not getting any solutions:

Github-issue-870

Github-issue-898

turivishal
  • 34,368
  • 7
  • 36
  • 59

2 Answers2

1

Finally the issue is resolved :)

My system OS is windows and i have installed mongodb in C:\Program Files\MongoDB\ (system's drive),

Before this mongo-connector connection, i have initiated replica set for mongodb using below command as per this blog:

mongod --port 27017 --dbpath ../data/db --replSet rs0

Problem:

The problem inside the --dbpath ../data/db directory, this directory was located in C:\Program Files\MongoDB\Server\4.2\data\db this directory have all permissions but parent directory C:\Program Files have not all permission because its system's directory and protected directory.

Actual Problem Was: (exception during collection dump)

2020-06-15 12:15:45,787 [CRITICAL] mongo_connector.oplog_manager:713 - Exception during collection dump

Solution:

I have just changed my --dbpath to another path that directory is outside of system's protected directory as below:

mongod --port 27017 --dbpath C:/data/db --replSet rs0

After that i have executed below command for connection, as i posted in my question:

mongo-connector --unique-key=id -n solr.wlslog -m localhost:27017 -t http://localhost:8983/solr/wlslog -d solr_doc_manager

Success mongo connector log result:

2020-06-17 12:08:52,292 [ALWAYS] mongo_connector.connector:50 - Starting mongo-connector version: 3.1.1
2020-06-17 12:08:52,292 [ALWAYS] mongo_connector.connector:50 - Python version: 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)]
2020-06-17 12:08:52,293 [ALWAYS] mongo_connector.connector:50 - Platform: Windows-10-10.0.18362-SP0
2020-06-17 12:08:52,293 [ALWAYS] mongo_connector.connector:50 - pymongo version: 3.10.1
2020-06-17 12:08:52,310 [ALWAYS] mongo_connector.connector:50 - Source MongoDB version: 4.2.2
2020-06-17 12:08:52,311 [ALWAYS] mongo_connector.connector:50 - Target DocManager: mongo_connector.doc_managers.solr_doc_manager version: 0.1.0

Hope this answer helpful for everyone :)

Community
  • 1
  • 1
turivishal
  • 34,368
  • 7
  • 36
  • 59
1

in my case, this didn't solve the problem. I'm using python 3.8, so for me it was actually due to https://docs.python.org/3/whatsnew/3.7.html#changes-in-python-behavior

PEP 479 is enabled for all code in Python 3.7, meaning that StopIteration exceptions raised directly or indirectly in coroutines and generators are transformed into RuntimeError exceptions. (Contributed by Yury Selivanov in bpo-32670.)

reading How yield catches StopIteration exception? led me to think initially it was related to the yield doc statements but actually the problen was 2 lines calling next() in both line 292 and a few lines later in solr_doc_manager.py:

batch = list(next(cleaned) for i in range(self.chunk_size))

changed to:

        batch = []
        for i in range(self.chunk_size):
            for x in cleaned:
                batch.append(x)
jazzdup
  • 89
  • 3