I have a Ruby 1.9 / Rails 3.0.7 application that uses Lucid/Solr to index large amounts of text data (3 GB or so). The data is stored in a MongoDB database and consists mainly of emails.

One issue I'm having is with the initial indexing: when I set up the application, I need to index all of the data so I can search it. This process will be repeated quite often, so I have to figure out how to index the entire MongoDB database into Solr quickly and efficiently. According to the Solr docs, one of the main ways to speed up indexing is to use multiple cores. I ran the index on a single-core VM and it took about 1 hour to index the data I have. When I moved it to a 4-core VM, it also took about 1 hour. I didn't notice any discernible difference between the two.

This leads me to suspect that Ruby 1.9 may not be capable of using multiple cores properly. I'm using an Ubuntu 10.10 Linux VM.

I've read some posts mentioning that Ruby 1.9 handles multiple cores differently from 1.8, but I admit this is not an area I know much about.

Does anyone know whether Ruby 1.9 is indeed capable of taking advantage of multiple cores when indexing large amounts of data in Solr?

Dan L

1 Answer

According to this question and this one, Ruby 1.9 can run on all the cores, as long as the running thread releases something called the Giant VM Lock.

Since this probably depends on the gems (and thus the C extensions) you're using, I would suggest doing some testing to check that it's actually using all the cores. If it isn't, consider moving to JRuby, which should use all the cores out of the box.
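A minimal sketch of such a test, assuming MRI on a platform where `fork` is available (Linux, for example). `busy_work` is a hypothetical CPU-bound stand-in for the indexing step, not real Solr or MongoDB code, and the worker and iteration counts are arbitrary. It times the same work run serially, in threads, and in forked processes:

```ruby
require 'benchmark'

# Hypothetical CPU-bound stand-in for the per-document indexing work.
def busy_work(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

# Run the work in `workers` threads and collect each thread's result.
def run_in_threads(workers, iterations)
  Array.new(workers) { Thread.new { busy_work(iterations) } }.map(&:value)
end

# Run the work in `workers` forked child processes and wait for all of them.
# Returns the number of children that were started.
def run_in_processes(workers, iterations)
  pids = Array.new(workers) { fork { busy_work(iterations) } }
  pids.each { |pid| Process.wait(pid) }
  pids.size
end

WORKERS    = 4          # assumption: match this to the number of cores in the VM
ITERATIONS = 3_000_000  # assumption: large enough to take a measurable time

serial   = Benchmark.realtime { WORKERS.times { busy_work(ITERATIONS) } }
threaded = Benchmark.realtime { run_in_threads(WORKERS, ITERATIONS) }
forked   = Benchmark.realtime { run_in_processes(WORKERS, ITERATIONS) }

puts format('serial:   %.2fs', serial)
puts format('threaded: %.2fs', threaded)
puts format('forked:   %.2fs', forked)
```

If the threaded time stays close to the serial time while the forked time drops on the 4-core VM, the threads are probably being serialized by the lock, and splitting the indexing across processes (or trying JRuby, where `fork` is generally not available but threads can run in parallel) would be worth exploring. Watching per-core usage in `top` while it runs gives a second check.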

I know that this is not a definitive answer, but I hope it helps you find a solution.

Augusto
  • No, but I just realized that it goes by both names :D - [Giant VM Lock](http://www.google.co.uk/search?q=giant+vm+lock) on Google: you can see posts on InfoQ and Artima calling it that. I think we can agree that it's indeed giant :D – Augusto Sep 15 '11 at 12:54