11

I am working on a project where we have millions of entries stored in a MongoDB database, and I want to index all of this data using Solr.

After extensive searching, I found that there is no proper "Data Import Handler" for MongoDB.

Can anyone tell me the proper approaches for indexing MongoDB data with Solr?

I want to use all the features of Solr, and I want the setup to scale in real time. I saw one or two approaches in other posts, but I am not sure how well they work in real time.

Many Thanks

kich

3 Answers

7

10gen has introduced Mongo Connector. You can use this tool to integrate MongoDB with Solr.

Blog post: Introducing Mongo Connector

GitHub page: mongo-connector

Parvin Gasimzade
6

I have created a plugin that lets you load data from MongoDB using the Solr Data Import Handler.

Check it out at:

https://github.com/james75/SolrMongoImporter

user1607179
5

I wrote a response to a similar question, except that it was about importing data from MySQL into Solr. The example code is in PHP, but it should give you the general idea. All you need to do is set up an iterator to step through your MongoDB documents, convert the data to Solr data types, and then save it to your Solr index.

If you want it to be real-time, you could add some custom code to the save mechanism (assuming this can be done with MongoDB) that saves directly to the Solr index, and then run a commit script via cron to commit the data every 15 minutes.
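To make the iterate, convert, save loop a bit more concrete, here is a minimal Python sketch (not the PHP code from the linked answer). It assumes your Solr unique key field is `id`, that flattening one level of nested documents is good enough for your schema, and that you post batches to Solr's JSON update endpoint. The function names (`to_solr_doc`, `batches`, `post_to_solr`, `index_all`) are made up for illustration.

```python
import json
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List
from urllib import request


def to_solr_doc(mongo_doc: Dict[str, Any]) -> Dict[str, Any]:
    """Map a MongoDB-style document to a flat Solr document."""
    solr_doc: Dict[str, Any] = {}
    for key, value in mongo_doc.items():
        if key == "_id":
            # Assumption: the Solr schema uses "id" as its unique key.
            solr_doc["id"] = str(value)
        elif isinstance(value, dict):
            # Flatten one level: {"author": {"name": "X"}} -> author_name
            for sub_key, sub_value in value.items():
                solr_doc[f"{key}_{sub_key}"] = sub_value
        else:
            solr_doc[key] = value
    return solr_doc


def batches(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield lists of at most `size` items without loading everything."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def post_to_solr(update_url: str, docs: List[Dict[str, Any]]) -> None:
    """POST a JSON array of documents to a Solr update URL."""
    req = request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        resp.read()


def index_all(
    cursor: Iterable[Dict[str, Any]],
    send: Callable[[List[Dict[str, Any]]], None],
    batch_size: int = 1000,
) -> int:
    """Convert every document from `cursor` and hand it to `send` in batches."""
    total = 0
    for chunk in batches((to_solr_doc(d) for d in cursor), batch_size):
        send(chunk)
        total += len(chunk)
    return total
```

With pymongo you would pass something like `collection.find()` as `cursor`, and a `send` that calls `post_to_solr` with a URL of the form `http://localhost:8983/solr/<core>/update` (host, port, and core name are placeholders). Taking `send` as a parameter keeps the loop easy to try out without a running Solr instance.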

Mike Purcell
  • Thank you for your reply. Another interesting question came up when I was talking with my roommate: what is the best way to index the data in MongoDB? Should we use MongoDB indexes or Solr indexes, and which will be more efficient? We would like to have faceted search and the other Solr features. What is your opinion on this? – kich Feb 19 '12 at 06:38
  • MongoDB is a NoSQL solution (afaik), which means it's great for storing data such as book descriptions. MongoDB is a persistent store, whereas Solr (which wraps Lucene) is a search engine. I would use them both: MongoDB for persistent storage and Solr for text searching. – Mike Purcell Feb 19 '12 at 06:42
  • Thank you. I won't be using full-text search, because the data I have in the database is more like metadata (e.g., a keyword search on a book title returns the book description, image, author name, and stores offering the book), but I need faceted search for presenting the information on the web page. Do you think I should use MongoDB for indexing and Solr for faceted search, or Solr for both indexing and faceted search? Thank you – kich Feb 19 '12 at 06:57
  • When you say "database", you mean MongoDB? Yes, Solr does facets, but you will have to write your app to "drill down" and "drill up" as users click into and out of facet searches. – Mike Purcell Feb 19 '12 at 07:06
  • Yes, I mean MongoDB. Thanks for all your answers, they are very helpful :) – kich Feb 19 '12 at 07:14
  • Don't forget a mechanism to notify the indexer of added, updated, or deleted documents. A special "table" that stores only the updates will do. – Jesvin Jose Feb 19 '12 at 08:14
  • @aitchnyu: Exactly. I did this in Symfony by overriding the save() methods. – Mike Purcell Feb 19 '12 at 18:04
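
The "table that stores only updates" idea from the comments could look roughly like the Python sketch below. It is only an illustration: `ChangeLog` is an in-memory stand-in for whatever collection or table you would really use, the names are hypothetical, and it assumes the Solr unique key is `id`. The returned payloads follow the shapes accepted by Solr's JSON update handler (an array of documents to add, and `{"delete": [ids]}` for delete-by-id).

```python
from typing import Any, Dict, List, Optional, Tuple

# (operation, document id, document body)
Change = Tuple[str, str, Dict[str, Any]]


class ChangeLog:
    """In-memory stand-in for a 'table' that stores only pending changes."""

    def __init__(self) -> None:
        self._pending: List[Change] = []

    def record(self, op: str, doc_id: str,
               doc: Optional[Dict[str, Any]] = None) -> None:
        """Call this from your save/delete hooks."""
        if op not in ("upsert", "delete"):
            raise ValueError(f"unknown operation: {op}")
        self._pending.append((op, doc_id, doc or {}))

    def drain(self) -> List[Change]:
        """Return all pending changes and clear the log."""
        changes, self._pending = self._pending, []
        return changes


def build_solr_updates(
    changes: List[Change],
) -> Tuple[List[Dict[str, Any]], Dict[str, List[str]]]:
    """Collapse changes to the latest one per id and build Solr payloads."""
    latest: Dict[str, Tuple[str, Dict[str, Any]]] = {}
    for op, doc_id, doc in changes:
        # Remove and re-insert so ordering follows the most recent change.
        latest.pop(doc_id, None)
        latest[doc_id] = (op, doc)

    adds = [dict(doc, id=doc_id)
            for doc_id, (op, doc) in latest.items() if op == "upsert"]
    deletes = [doc_id
               for doc_id, (op, _) in latest.items() if op == "delete"]
    return adds, {"delete": deletes}
```

A periodic job (the cron script mentioned in the answer, for example) would call `drain()`, pass the result to `build_solr_updates`, and post both payloads to Solr. Collapsing to the latest change per id avoids re-adding a document that was deleted later in the same window.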