
I've recently started exploring SolrCloud and am trying to index documents using the CloudSolrServer client. The issue I'm seeing is that if I don't fire an explicit commit on the CloudSolrServer object, the documents don't get indexed. Here's my code snippet:

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;

// connect via ZooKeeper and target the collection
CloudSolrServer server = new CloudSolrServer("localhost:2181");
server.setDefaultCollection("collection1");

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "http://test.com/akn/test6.html");
doc.addField("Source2", "aknsource");
doc.addField("url", "http://test.com/akn/test6.html");
doc.addField("title", "SolrCloud rocks");
doc.addField("text", "This is a sample text");
UpdateResponse resp = server.add(doc);
//UpdateResponse res = server.commit();   // explicit commit left disabled

I have 2 shards with 1 replica each and a single ZooKeeper instance.

Once I run this test code, I can see the request hitting the nodes. Here's the output from the log:


INFO  - 2013-09-26 03:19:04.981; 
org.apache.solr.update.processor.LogUpdateProcessor; [collection1] 
webapp=/solr path=/update params={distrib.from= 
http://ec2-1-2-3-4.us-west-1.compute.amazonaws.com:8983/solr/collection1/&update.distrib=TOLEADER&wt=javabin&version=2} 
{add=[http://test.com/akn/test6.html (1447223565945405440)]} 0 42 
INFO  - 2013-09-26 03:19:19.943; 
org.apache.solr.update.DirectUpdateHandler2; start 
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} 
INFO  - 2013-09-26 03:19:20.249; org.apache.solr.core.SolrDeletionPolicy; 
SolrDeletionPolicy.onCommit: commits: num=2 

commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/mnt/ebs2/TestSolr44/solr/collection1/data/index 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@36ddc581; 
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_7,generation=7} 

commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/mnt/ebs2/TestSolr44/solr/collection1/data/index 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@36ddc581; 
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_8,generation=8} 
INFO  - 2013-09-26 03:19:20.250; org.apache.solr.core.SolrDeletionPolicy; 
newest commit generation = 8 
INFO  - 2013-09-26 03:19:20.252; org.apache.solr.search.SolrIndexSearcher; 
Opening Searcher@c324b85 realtime 
INFO  - 2013-09-26 03:19:20.254; 
org.apache.solr.update.DirectUpdateHandler2; end_commit_flush 

From the log, it looks like the commit went through successfully. But if I query the servers, none of the entries show up.

Now, if I uncomment


UpdateResponse res = server.commit(); 

I do see the data indexed. Here's the log:


INFO  - 2013-09-26 03:41:24.433; 
org.apache.solr.update.processor.LogUpdateProcessor; [collection1] 
webapp=/solr path=/update params={wt=javabin&version=2} {add=[ 
http://test.com/akn/test6.html (1447224970494083072)]} 0 12 
INFO  - 2013-09-26 03:41:24.490; 
org.apache.solr.update.DirectUpdateHandler2; start 
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} 
INFO  - 2013-09-26 03:41:24.788; org.apache.solr.core.SolrDeletionPolicy; 
SolrDeletionPolicy.onCommit: commits: num=2 

commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/mnt/ebs2/TestSolr44/solr/collection1/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@36ddc581; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_8,generation=8}

commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/mnt/ebs2/TestSolr44/solr/collection1/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@36ddc581; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_9,generation=9}
INFO  - 2013-09-26 03:41:24.788; org.apache.solr.core.SolrDeletionPolicy; newest commit generation = 9
INFO  - 2013-09-26 03:41:24.792; org.apache.solr.search.SolrIndexSearcher; Opening Searcher@138ba593 main
INFO  - 2013-09-26 03:41:24.794; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO  - 2013-09-26 03:41:24.794; org.apache.solr.core.QuerySenderListener; QuerySenderListener sending requests to Searcher@138ba593 main{StandardDirectoryReader(segments_9:21:nrt _0(4.4):C1 _1(4.4):C1 _3(4.4):C1 _4(4.4):C1 _5(4.4):C1 _7(4.4):C1)}
INFO  - 2013-09-26 03:41:24.795; org.apache.solr.core.QuerySenderListener; QuerySenderListener done.
INFO  - 2013-09-26 03:41:24.798; org.apache.solr.core.SolrCore; [collection1] Registered new searcher Searcher@138ba593 main{StandardDirectoryReader(segments_9:21:nrt _0(4.4):C1 _1(4.4):C1 _3(4.4):C1 _4(4.4):C1 _5(4.4):C1 _7(4.4):C1)}
INFO  - 2013-09-26 03:41:24.798; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/update params={waitSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false} {commit=} 0 308

Here's the commit configuration:

<autoCommit>
  <maxTime>30000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

Not sure what I'm missing here; any pointers will be appreciated.

Thanks

Shamik

3 Answers


The first log you've included shows only a commit with openSearcher=false. That ensures the data is properly flushed, but it doesn't make it searchable. This commit is most likely coming from the autoCommit section of your config.

The config snippet you pasted also shows autoSoftCommit with a maxTime of one second, but the log doesn't show any soft commits. Without seeing the entire config it's not possible to say whether that setting is actually active; it is probably commented out.
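If you just want to verify that a soft commit makes the documents visible without editing solrconfig.xml, you could fire one explicitly from the client. This is only a sketch, assuming SolrJ 4.x and the same collection as in the question; the three-argument commit(waitFlush, waitSearcher, softCommit) overload requests a soft commit:

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

CloudSolrServer server = new CloudSolrServer("localhost:2181");
server.setDefaultCollection("collection1");

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "http://test.com/akn/test6.html");
doc.addField("title", "SolrCloud rocks");
server.add(doc);

// waitFlush=true, waitSearcher=true, softCommit=true: opens a new
// searcher so the document becomes visible without forcing a hard commit
server.commit(true, true, true);

If the document shows up in queries after this but not with your current config, that points to autoSoftCommit not being active.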

elyograg

CloudSolrServer uses a load-balanced HttpSolrServer (LBHttpSolrServer), which the docs specifically say not to use to issue "write" commands:

Do NOT use this class for indexing in master/slave scenarios since documents must be sent to the correct master; no inter-node routing is done.

Why don't you just create a regular HttpSolrServer pointing towards your ZooKeeper (or the shard containing the ZooKeeper) and use it to insert documents? The ZooKeeper/master replica should take care of sending them down the line to the other shards.
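Something along these lines, as a sketch assuming SolrJ 4.x; the URL is the node that appears in your log and would need to be replaced with one of your own hosts:

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Point this at one of your Solr nodes (URL taken from the log above,
// replace with your own host/collection).
HttpSolrServer server = new HttpSolrServer(
    "http://ec2-1-2-3-4.us-west-1.compute.amazonaws.com:8983/solr/collection1");

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "http://test.com/akn/test6.html");
doc.addField("title", "SolrCloud rocks");
server.add(doc);
server.commit();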

Shivan Dragon

You need to configure the autoSoftCommit and hard commit (autoCommit) parameters in the solrconfig.xml file. One more thing you can do is use commitWithin via CloudSolrServer, as it's more flexible and efficient than a hard commit. You should configure the hard commit interval to around 4-5 minutes (or whatever suits your requirements). See the link below for more details, and the sketch after it:

http://wiki.apache.org/solr/CommitWithin
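For example, a sketch assuming SolrJ 4.x (the 10-second commitWithin window is just an illustrative value, not a recommendation):

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

CloudSolrServer server = new CloudSolrServer("localhost:2181");
server.setDefaultCollection("collection1");

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "http://test.com/akn/test6.html");
doc.addField("title", "SolrCloud rocks");

// commitWithin: ask Solr to commit this document within 10 seconds,
// instead of issuing an explicit hard commit from the client.
server.add(doc, 10000);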