We're using Apache Solr (3.1.0) to index a lot of articles written for multiple sites. We have a master/slave setup (replication config at the bottom), where server 1 indexes the articles and server 2 replicates the index. The slave should poll the master every 60 seconds, but instead we see anywhere from 10 to 75 consecutive /replication calls nearly every time it polls.

Each Solr core (${solr.core.name} in the slave config) represents a different site. The /replication calls I see most are tied to the biggest site. One of the cores only got 1 call per minute, but I was able to reproduce the burst of calls there after calling update?commit=true a few times, which leads me to think it's related to the number of commits the master performs.

So my question is, how do I stop the Solr slave from replicating the index dozens of times and force it to replicate just once per minute? I've tried playing with the commitReserveDuration parameter in the master config, but I don't really see any difference.

master replication config:

 <requestHandler name="/replication" class="solr.ReplicationHandler" >
   <lst name="master">
     <str name="replicateAfter">commit</str>
     <str name="replicateAfter">startup</str>
   </lst>
 </requestHandler>

slave replication config:

 <requestHandler name="/replication" class="solr.ReplicationHandler" >
   <lst name="slave">
     <str name="masterUrl">http://${solr.master.server}/search/${solr.core.name}/replication</str>
     <str name="pollInterval">00:00:60</str>
   </lst>
 </requestHandler>
cheffe
  • I would try to disable pollInterval (specify no pollInterval) and execute replication by api call triggered by a cron job. Does it help? https://wiki.apache.org/solr/SolrReplication?action=AttachFile&do=get&target=replication.png – Matthias M Mar 05 '16 at 18:06
  • Thanks for the reply. I tried this and calling `/replication?command=fetchindex` once triggers a lot of `/replication` calls on the master... I don't see any difference between this and keeping the pollInterval in the config. To be honest, this could be perfectly normal behaviour, but I just can't find any docs describing it. – Ivo van der Veeken Mar 07 '16 at 10:38
  • That was just an idea to track the problem. Sorry, I can't help you further. – Matthias M Mar 07 '16 at 11:38
  • @Gurpreet Singh Where can I check this? I haven't seen the number of commits anywhere yet. – Ivo van der Veeken Mar 09 '16 at 04:29
  • I am facing the same issue in Solr 6.6. Replication happens before the commit on every poll while a full import is running (or it may be the auto commit, I'm not sure), and `select?q=*:*` returns different data from the master and the slaves until the import finishes. It works fine when I disable polling. Any clue would be appreciated. – Ruhi Singh Mar 22 '18 at 06:30
  • I left this job a few months after I asked this question and I never found the answer. Can't help you here, sorry. – Ivo van der Veeken Mar 23 '18 at 18:05
  • Is just disabling polling an option? Since you're saying it works fine then. – Ivo van der Veeken Mar 23 '18 at 18:06

1 Answer

In the master config you set replicateAfter to commit, so if your code issues commits very frequently, each one triggers replication. I would suggest changing it to optimize instead of commit; this should solve your problem. The documentation on the replicateAfter settings gives more details.
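As a minimal sketch of that change, assuming the rest of the master handler stays exactly as posted in the question, only the first replicateAfter value is swapped from commit to optimize:

```xml
<!-- Master config from the question, with "commit" replaced by "optimize".
     Sketch only: the slave should then see a new index version after an
     optimize or a restart, not after every commit. -->
<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="master">
    <str name="replicateAfter">optimize</str>
    <str name="replicateAfter">startup</str>
  </lst>
</requestHandler>
```

The trade-off is that the slave stays on the old index until the master runs an optimize, so how fresh the slave is depends on how often you optimize.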

Adarsh H D Dev
  • Thanks for your comment. When I change commit to optimize, the slave appears to be out of sync with the master for 5+ minutes, which is way too long. Thanks though, I'll try to find a way to call the optimization from the code and see if that works. – Ivo van der Veeken Mar 21 '16 at 12:02