
I am new to Solr and MongoDB.

I am trying to index data from MongoDB into Solr using DataImportHandler, but I could not find the exact steps that I need to follow.

Could you please help me with the exact steps to index MongoDB data into Solr using DataImportHandler?

Solr version: 4.6.0

MongoDB version: 2.2.7

chetna agarwal

3 Answers

25

This is a late answer, but I thought people might find it useful.

Below are the steps for importing data from MongoDB into Solr 4.7.0 using DataImportHandler.

Step 1:

Assume that your MongoDB instance has the following database and collection:

Database Name: Test
Collection Name: sample

The sample collection has the following documents:

db.sample.find()
{ "_id" : ObjectId("54c0c6666ee638a21198793b"), "Name" : "Rahul", "EmpNumber" : 452123 }
{ "_id" : ObjectId("54c0c7486ee638a21198793c"), "Name" : "Manohar", "EmpNumber" : 784521 }

Step 2:

Create a lib folder in your solrhome folder (the one that contains the bin and collection1 folders).

Add the jar files below to the lib folder. The solr-mongo-importer jar has to be downloaded separately.

- solr-dataimporthandler-4.7.0.jar
- solr-mongo-importer-1.0.0.jar
- mongo-java-driver-2.10.1.jar (this is the MongoDB Java driver)

Step 3:

Declare the Solr fields in schema.xml (assuming that id is already defined by default).

Add the fields below to schema.xml inside the <fields> </fields> tag.

 <field name="Name" type="text_general" indexed="true" stored="true"/>
 <field name="EmployeeNumber" type="int" indexed="true" stored="true"/>

Step 4:

Declare the data-config file in solrconfig.xml by adding the code below inside the <config> </config> tag.

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

Step 5:

Create a data-config.xml file in collection1\conf\ (which by default holds solrconfig.xml and schema.xml).

data-config.xml

<?xml version="1.0"?>
<dataConfig>
  <dataSource name="MyMongo" type="MongoDataSource" database="Test" />
  <document name="import">
    <!-- If query="" then everything is imported. -->
    <entity processor="MongoEntityProcessor"
            query="{Name:'Rahul'}"
            collection="sample"
            datasource="MyMongo"
            transformer="MongoMapperTransformer" name="sample_entity">

      <!-- If the mongoField name and the field declared in schema.xml are the same, there is no need to declare the mapping below.
           If they differ, you have to map the mongoField to the field in schema.xml
           (e.g. mongoField="EmpNumber" to name="EmployeeNumber"). -->

      <field column="_id" name="id"/>
      <field column="EmpNumber" name="EmployeeNumber" mongoField="EmpNumber"/>
    </entity>
  </document>
</dataConfig>
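
A mismatch between the `name` attributes in data-config.xml and the fields declared in schema.xml is an easy mistake to make in Steps 3 and 5. Below is a minimal sketch in Python (standard library only) that lists mapped Solr fields missing from the schema. The function names are my own, and it only looks at plain `<field>` declarations, so anything covered by a `<dynamicField>` would be reported as missing.

```python
import xml.etree.ElementTree as ET


def mapped_solr_fields(data_config_xml):
    """Return the Solr field names that data-config.xml maps columns to."""
    root = ET.fromstring(data_config_xml)
    names = []
    for field in root.iter("field"):
        # When no explicit name is given, fall back to the column name.
        names.append(field.get("name") or field.get("column"))
    return names


def declared_schema_fields(schema_xml):
    """Return the set of names declared with <field> in schema.xml."""
    root = ET.fromstring(schema_xml)
    return {field.get("name") for field in root.iter("field")}


def missing_from_schema(data_config_xml, schema_xml):
    """List mapped Solr fields that schema.xml does not declare."""
    declared = declared_schema_fields(schema_xml)
    return [n for n in mapped_solr_fields(data_config_xml) if n not in declared]


if __name__ == "__main__":
    # Hypothetical, trimmed-down versions of the two files from this answer.
    sample_config = (
        '<dataConfig><document name="import"><entity name="sample_entity">'
        '<field column="_id" name="id"/>'
        '<field column="EmpNumber" name="EmployeeNumber" mongoField="EmpNumber"/>'
        "</entity></document></dataConfig>"
    )
    sample_schema = (
        "<schema><fields>"
        '<field name="id" type="string"/>'
        '<field name="Name" type="text_general"/>'
        "</fields></schema>"
    )
    print(missing_from_schema(sample_config, sample_schema))
```

To check real files, read collection1\conf\data-config.xml and schema.xml into strings and pass them to `missing_from_schema`; an empty list means every mapped field is declared.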

Step 6:

Assuming Solr (I used port 8080) and MongoDB are running, open http://localhost:8080/solr/dataimport?command=full-import in your browser to import the data from MongoDB into Solr.

The fields _id, Name and EmpNumber (MongoDB) are imported as id, Name and EmployeeNumber (Solr).

You can see the result at http://localhost:8080/solr/query?q=*
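
If you would rather script Step 6 than use the browser, here is a small sketch in Python (standard library only). It only builds the request URLs and interprets a status reply; the helper names are my own. It assumes the handler is registered at `/dataimport` as in Step 4, and that the `status` command reports `"status": "idle"` once no import is running, which is how DataImportHandler normally answers with `wt=json`.

```python
import json
from urllib.parse import urlencode


def dataimport_url(base_url, command, **params):
    """Build a DataImportHandler URL such as .../dataimport?command=full-import."""
    query = {"command": command, "wt": "json"}
    query.update(params)
    return base_url.rstrip("/") + "/dataimport?" + urlencode(query)


def import_finished(status_json):
    """Return True when a status reply says the handler is idle (assumed format)."""
    return json.loads(status_json).get("status") == "idle"


if __name__ == "__main__":
    base = "http://localhost:8080/solr"  # port 8080 as used in this answer
    print(dataimport_url(base, "full-import"))
    print(dataimport_url(base, "status"))
```

The resulting URLs can be opened in a browser or fetched with `urllib.request.urlopen`; poll the status URL and pass the response body to `import_finished` to know when the import is done.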

Manjunath H
  • 1. If you want to change your query based on a parameter, you can specify `query="{Parameter}"` on the entity that uses `processor="MongoEntityProcessor"`. – Manjunath H Jun 10 '15 at 10:45
  • 2. For version 5.1, you can use `solr-5.1.0\server\solr\configsets\basic_configs\conf` as `SolrHome\collection\conf`, along with `solr-5.1.0\server\solr\solr.xml` as `SolrHome\solr.xml`. – Manjunath H Jun 10 '15 at 10:56
  • I'm new to Solr and I really need to index data from MongoDB into Solr to run search queries. I tried mongo-connector; it doesn't show any error, but the terminal is stuck at "Logging to mongo-connector.log". So I followed your steps to the letter. The DataImportHandler is configured, but it doesn't retrieve the data from MongoDB. I'm guessing it has something to do with data-config.xml, maybe a driver that needs to be mentioned? Could you please help me? – Jean Jun 29 '16 at 09:52
3

You can try using SolrMongoImporter. It requires you to import 2 libraries into your Solr project and create a data-config.xml.

You will probably need to import the following libraries in your solrconfig.xml if you don't already have them:

  <lib dir="../../../contrib/dataimporthandler/lib" regex=".*\.jar" />
  <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
  • 1
    For me, SolrMongoImporter had some performance issues on a bigger MongoDB. So I tried mongoSolrImporter, which was 10x faster because of multithreading. -> https://github.com/5missions/mongoSolrImporter – The Bndr Aug 01 '16 at 13:44
0

If you have followed everything above and are still facing the issue, check whether you have the same jar in multiple locations.

For example:

  • {solr-home}/dist/solr-dataimporthandler-8.10.0.jar
  • {solr-home}/server/libs/solr-dataimporthandler-8.10.0.jar
  • {solr-home}/server/solr/core/libs/solr-dataimporthandler-8.10.0.jar

If so, remove the jar files from everywhere but the location you have configured in the solrconfig.xml file.
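
To find such duplicates without browsing the folders by hand, here is a rough sketch in Python (standard library only). The function name and the default prefix are my own choices; pass your own Solr home as `root_dir`.

```python
import os
from collections import defaultdict


def find_duplicate_jars(root_dir, prefix="solr-dataimporthandler"):
    """Map each matching jar file name to its paths when it appears more than once."""
    locations = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if filename.startswith(prefix) and filename.endswith(".jar"):
                locations[filename].append(os.path.join(dirpath, filename))
    # Keep only the jars that were found in more than one directory.
    return {name: sorted(paths) for name, paths in locations.items() if len(paths) > 1}
```

Calling `find_duplicate_jars` on your Solr home returns an empty dict when each DataImportHandler jar exists in only one place. The sketch only reports the paths; which copy to keep still depends on the location configured in solrconfig.xml.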