I am using the rsolr gem to integrate Solr search with my RoR app. For each search I have to specify the rows parameter, which is the number of results to retrieve. To retrieve all results matching a query, I set rows to a very high value, as mentioned in this post.

But doing that makes processing very slow, and I get the following error in the Rails logs:

[2014-01-11 15:51:08] ERROR WEBrick::HTTPStatus::RequestURITooLarge
[2014-01-11 15:51:08] ERROR TypeError: can't convert nil into an exact number
    /home/nish/.rvm/gems/ruby-1.9.2-p320@voylla/gems/activesupport-3.1.10/lib/active_support/core_ext/time/calculations.rb:266:in `-'
    /home/nish/.rvm/gems/ruby-1.9.2-p320@voylla/gems/activesupport-3.1.10/lib/active_support/core_ext/time/calculations.rb:266:in `minus_with_duration'
    /home/nish/.rvm/gems/ruby-1.9.2-p320@voylla/gems/activesupport-3.1.10/lib/active_support/core_ext/time/calculations.rb:277:in `minus_with_coercion'
    /home/nish/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/webrick/accesslog.rb:42:in `setup_params'
    /home/nish/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/webrick/httpserver.rb:164:in `access_log'
    /home/nish/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/webrick/httpserver.rb:87:in `run'
    /home/nish/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/webrick/server.rb:183:in `block in start_thread'

How can I fix this issue? Thanks

nish

2 Answers

From the Solr FAQ:

This is impractical in most cases. People typically only want to do this when they know they are dealing with an index whose size guarantees the result sets will always be small enough that they can feasibly be transmitted in a manageable amount -- but if that's the case just specify what you consider a "manageable amount" as your rows param and get the best of both worlds (all the results when your assumption is right, and a sanity cap on the result size if it turns out your assumptions are wrong)
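
The "sanity cap" idea can be sketched roughly as below. This is only an illustration, not something taken from the FAQ: `MAX_ROWS`, `capped_select_params` and `truncated?` are hypothetical names, and the value 1000 is an arbitrary assumption you would replace with your own "manageable amount".

```ruby
# Hypothetical cap; choose whatever "manageable amount" fits your index.
MAX_ROWS = 1_000

# Build params for a Solr select request, clamping rows to 0..MAX_ROWS.
def capped_select_params(query, requested_rows = MAX_ROWS)
  rows = [[requested_rows.to_i, 0].max, MAX_ROWS].min
  { :q => query, :rows => rows }
end

# True when Solr reports more matches than the (first) page returned,
# i.e. the assumption about the result size turned out to be wrong.
def truncated?(response)
  response['response']['numFound'] > response['response']['docs'].size
end

# Usage with rsolr (not executed here; the URL is an assumption):
#   solr     = RSolr.connect :url => 'http://localhost:8983/solr'
#   response = solr.get 'select', :params => capped_select_params('*:*')
#   warn 'result set was capped' if truncated?(response)
```

When `truncated?` returns true you know the cap kicked in and can decide whether to page through the rest or narrow the query.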

Arun

Your error is related to the RoR side (the WEBrick server), not Solr. It's telling you the problem: the requested URI is too large. WEBrick is not a production-caliber web server, and v1.9.3 appears to limit the HTTP request URI length to 2083 bytes (per this other SO question).
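
To see how quickly a GET request can run into such a limit, here is a rough check. It is a sketch under assumptions: the 2083 figure is simply the one quoted above, `request_uri_for` and `too_long_for_webrick?` are hypothetical helpers, and as far as I can tell WEBrick applies its limit to the whole request line (method and HTTP version included), so treat the result as approximate.

```ruby
require 'uri'

# Limit quoted in the answer above; treated here as an assumption.
WEBRICK_URI_LIMIT = 2083

# Build the path plus query string that a GET request would carry.
def request_uri_for(path, params)
  "#{path}?#{URI.encode_www_form(params)}"
end

# Approximate check: does the request URI alone exceed the limit?
def too_long_for_webrick?(path, params)
  request_uri_for(path, params).bytesize > WEBRICK_URI_LIMIT
end

# A short search query stays well below the limit...
small = { :q => 'ring' }
# ...while passing hundreds of ids in the query string does not.
large = { :ids => (1..500).to_a }

puts too_long_for_webrick?('/search', small)
puts too_long_for_webrick?('/search', large)
```

If the second case looks like your request, sending fewer parameters (or fewer ids) per request avoids the error regardless of which server you use.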

The short-term fix? Use a web server that doesn't limit your requested URI length to something so short.

However, that's just one part of the fix. With the approach you're using, execution time will grow linearly or worse with the number of results. The size of the documents being retrieved affects performance too, not just their count.
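
One way to keep each request bounded is to walk the result set in fixed-size pages using Solr's start and rows parameters. The sketch below is an assumption-laden illustration: `each_solr_doc` is a hypothetical helper, the page size of 100 is arbitrary, and the fetcher is passed in as a lambda so the paging logic does not depend on a running Solr instance.

```ruby
# Yields every matching document, fetching page_size documents at a time.
# fetch_page is any callable taking (start, rows) and returning a parsed
# Solr response hash, e.g. what rsolr's `solr.get 'select', ...` returns.
def each_solr_doc(fetch_page, page_size = 100)
  return enum_for(:each_solr_doc, fetch_page, page_size) unless block_given?

  start = 0
  loop do
    response = fetch_page.call(start, page_size)
    docs = response['response']['docs']
    docs.each { |doc| yield doc }
    start += docs.size
    # Stop on an empty page or once every reported match has been seen.
    break if docs.empty? || start >= response['response']['numFound']
  end
end

# Usage with rsolr (not executed here; the URL is an assumption):
#   solr  = RSolr.connect :url => 'http://localhost:8983/solr'
#   fetch = lambda do |start, rows|
#     solr.get 'select', :params => { :q => '*:*', :start => start, :rows => rows }
#   end
#   each_solr_doc(fetch, 100) { |doc| puts doc['id'] }
```

This keeps memory use and response size per request roughly constant, although the total work still grows with the number of results.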

Can you share your requirements that led to an implementation where all results are returned with each query?

jro