
We are facing high Solr query times for fingerprint matching. Our setup is as follows:

  1. echonest/echoprint-server running on a single node (Solr 1.0) on an Amazon EC2 m3.2xlarge instance with 30 GB RAM and 8 cores.
  2. 2.5 million tracks ingested (segment count 19,933,333); the Solr 1.0 index size is around 91 GB.
  3. Applied the HashQueryComponent.java optimization: https://github.com/playax/echoprint-server/commit/706d26362bbe9141203b2b6e7846684e7a417616#diff-f9e19e870c128c0d64915f304cf43677
  4. Also captured timing stats for the eval method; some iterations of the loop over the index reader's sequential sub-readers took more than 1 second to iterate over all the terms (a rough instrumentation sketch follows this list).
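
For reference, this is roughly how we timed the per-sub-reader term iteration. It is a standalone sketch, not echoprint's actual eval code: it assumes the Lucene 2.9/3.x-era API bundled with this Solr build, and the class name `SubReaderTermTiming` and the fingerprint field name `"fp"` are placeholders that need to match the real schema.

```java
import java.io.File;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.store.FSDirectory;

public class SubReaderTermTiming {
    public static void main(String[] args) throws Exception {
        // args[0]: path to the Solr index directory (e.g. .../data/index)
        IndexReader top = IndexReader.open(FSDirectory.open(new File(args[0])), true);
        try {
            IndexReader[] subReaders = top.getSequentialSubReaders();
            if (subReaders == null) {
                // Single-segment index: the top-level reader is the only "sub" reader.
                subReaders = new IndexReader[] { top };
            }
            for (int i = 0; i < subReaders.length; i++) {
                long start = System.nanoTime();
                long termCount = 0;
                // Walk every term of the (assumed) fingerprint field "fp" in this
                // segment, mirroring the per-sub-reader loop we timed in eval().
                TermEnum terms = subReaders[i].terms(new Term("fp", ""));
                try {
                    do {
                        Term t = terms.term();
                        if (t == null || !"fp".equals(t.field())) {
                            break; // moved past the fp field (or the segment has none)
                        }
                        termCount++;
                    } while (terms.next());
                } finally {
                    terms.close();
                }
                long elapsedMs = (System.nanoTime() - start) / 1000000L;
                System.out.println("subReader[" + i + "] maxDoc=" + subReaders[i].maxDoc()
                        + " fpTerms=" + termCount + " elapsedMs=" + elapsedMs);
            }
        } finally {
            top.close();
        }
    }
}
```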

Any suggestions or pointers in the right directions will be very helpful.

  • Did you try more RAM? You may see better performance with enough RAM to hold the whole index (~91 GB). – Daniel Cukier Jan 04 '16 at 14:02
  • One more finding: I can see a "queryfiltercache" hit (via the Solr admin) the second time I query the same song, but all the logic inside HashQueryComponent.java still runs and the query time is still around 2 seconds. Shouldn't it return directly from the cache? Any idea? I will try with more RAM and share the results soon. – nikhil Jan 04 '16 at 14:46
  • More free RAM on the machine lets the OS disk cache hold the whole index and can speed up queries. The cache inside Solr is usually useless, since every fingerprint query is unique. – Daniel Cukier Jan 04 '16 at 19:34
  • Tried with more RAM (120 GB) and loaded the entire index into the OS cache via cat indexpath/* >> /dev/null, but saw no improvement in query time. Also tested after integrating Lucene's MMapDirectory into Solr 1.0; query time is still between 2 and 3 seconds. – nikhil Jan 05 '16 at 17:16
  • I noticed in the Solr logs that a fingerprint match response shows hits=19402656 status=0 QTime=3147. Is it common to have such a high number of hits? – nikhil Jan 05 '16 at 18:01
  • Yes, this is common. How are disk I/O and CPU during queries? I just noticed that you are not using the latest optimization Playax did in echoprint. Try the latest version of this branch: https://github.com/playax/echoprint-server – Daniel Cukier Jan 05 '16 at 22:00
  • I am sending curl-based fingerprint match requests sequentially, one at a time in a loop via a shell script. CPU usage frequently touches 95-96% on 2 cores, showing around 160% on average (16-core box with 120 GB RAM). I monitored disk I/O with iotop and iostat; %iowait is around 0.69%. – nikhil Jan 06 '16 at 08:03
  • I have noticed that inside the eval method of HashQueryComponent.java, the first iteration of the index reader loop takes most of the computation time (>1 sec). I have uploaded the Solr log of one track match for reference: [link](https://drive.google.com/file/d/0B9bN6w5n4fptSU9QLV9FbFF2QXM/view?usp=sharing). Any idea why this is happening? Also, we had tried setting up "github.com/playax/echoprint-server" some time back, but during sanity testing we found that some tracks that matched with Solr 1.0 did not match there. Not sure why? – nikhil Jan 06 '16 at 08:15
  • I don't know why some tracks that matched in Solr 1.0 did not match - we are running this version here perfectly, and it is 3 times faster than the one you are using. – Daniel Cukier Jan 06 '16 at 11:50
  • We have a working Solr 3.5 setup now. Just wanted to know your config on the points below: 1. Number of segments per song indexed in Solr? We have 8-9 segments per song. 2. Duration of each segment? Ours is 60 seconds with 30 seconds overlap. 3. Duration of the search query input? Ours is 30 seconds. 4. Average index size on a single Solr machine? Ours is 92 GB for ~2.5 million songs. 5. Average request completion time on a single machine? Ours takes around 4-5 seconds. 6. Your average request load and throughput on a single Solr machine? For us, 60 concurrent requests for 15 min gave a throughput of 5.2. – nikhil Jan 13 '16 at 13:05
  • How many cores does your machine have? For 1, 2 and 3 we have the same config. Our database is 700k tracks and the average response time is 1 s. Since your database is about 4x ours and the query algorithm is O(n) in track segments, I think ~4 s per query latency is about right; for anything better than that you will need to improve the echoprint server architecture. Throughput will depend on the concurrency level of the query server and the number of cores. With a 32-core server, we could get ~12-15 req/s throughput. – Daniel Cukier Jan 14 '16 at 16:43
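
(A rough back-of-the-envelope check of that estimate, taking the figures in this thread at face value: 2,500,000 / 700,000 ≈ 3.6, so with an O(n) scan and a ~1 s baseline the expected single-query latency is roughly 3.6 × 1 s ≈ 3.6 s, which lines up with the 4-5 s observed on the 2.5M-track index.)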

0 Answers