52

We need to integrate a search engine into our product catalog management software. The catalog is expected to have more than 4-5 million records, with relational data spread over several tables. Our dev platform is ASP.NET 3.5, and we have done some preliminary work with Lucene and found it to be good. However, we have just learned about Solr and are looking for practical tips comparing Lucene and Solr in terms of implementation, timeline, regular maintenance, performance, and features. Any guidance or pointers would be really helpful. Thanks.

Vikram

6 Answers

41

Lucene:

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search.

Solr:

Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, a web administration interface and ...

Essentially, Lucene is embedded in Solr. Lucene is purely a full-text search library, meant to be embedded into projects to give them full-text search capabilities. Solr has many more features and administration capabilities: it lets you search structured data without writing any custom code, load data from CSV files, parse user input tolerantly, run faceted searches, highlight matched text in results, and retrieve search results in a variety of formats (XML, JSON, ...). Check the Solr features page and see whether any feature is relevant to your project.
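
To make the "no custom code" point concrete, here is a minimal sketch of what a faceted, highlighted search looks like from the client side: it is just an HTTP GET with a few query parameters. The host, the port, the core name `catalog`, and the field names are assumptions for illustration; `q`, `wt`, `rows`, `facet`, `facet.field`, `hl` and `hl.fl` are standard Solr request parameters. The sketch only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, facet_fields=(), highlight_fields=(), rows=10):
    """Build a Solr /select URL that asks for JSON, facets and highlighting."""
    params = [("q", query), ("wt", "json"), ("rows", str(rows))]
    if facet_fields:
        params.append(("facet", "true"))
        # facet.field is repeated once per field to facet on.
        params.extend(("facet.field", field) for field in facet_fields)
    if highlight_fields:
        params.append(("hl", "true"))
        # hl.fl takes a comma-separated list of fields to highlight.
        params.append(("hl.fl", ",".join(highlight_fields)))
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical core URL and field names, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr/catalog",
    "name:laptop",
    facet_fields=["brand", "category"],
    highlight_fields=["name"],
    rows=20,
)
print(url)
```

Any HTTP client, including one written in .NET, can issue the same request, which is why the Java/.NET split matters less on the query side.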

dcruz
  • I have created my indexes using Lucene. Can those indexes still be used by Solr for search queries? – Vikram Sep 12 '09 at 06:55
  • 1
    As in most cases, it depends. It isn't automatic: you have to make sure that Solr has the same field mappings as the Lucene indexes. For further information, check: http://www.nabble.com/Using-Lucene-index-in-Solr-td4983079.html – dcruz Sep 12 '09 at 09:20
  • @dcruz, by any chance do you have any experience with DataImportHandler in Solr, which can automatically import data from a database based on some config files? Does it work as smoothly as it sounds, or are there any hidden gotchas? – Vikram Sep 14 '09 at 17:36
  • Sorry =( I worked with Solr two years ago and I don't really remember the implementation details. – dcruz Sep 15 '09 at 10:15
  • Using Solr is using Lucene the right way, as Solr is Lucene best practices made by the guys that made Lucene. – Alexander Jardim Jun 30 '13 at 23:24
19

I have to agree with Andrew Clegg. I think when a lot of Java developer types look at Lucene vs Solr, Lucene looks friendlier because it's just a library (POJJ: Plain Old Java Jar!) like any other, and it looks straightforward to embed, versus the complexity of standing Solr up as a separate process that communicates over complex HTTP.

However, I think that for almost all search use cases, Solr is the right approach, because most of the complexity in search is not in the initial integration but in the fuzzy areas of tuning searches, scaling to meet demand, and maintaining your indexes, which cross over from the developer-centric world into the systems world. Solr handles all of those needs nicely.

Eric Pugh
6

Like dcruz says, Solr uses Lucene anyway, so it's not a valid comparison.

Lucene is a toolkit for building search apps; Solr is a search app built with Lucene.

IMO you'd be crazy not to use Solr, as it provides you with a lot of 'plumbing' that you'd have to write yourself otherwise -- like a configurable Data Import Handler to suck data out of your RDBMS or XML repositories.

Plus it gives you a web admin interface and other bells and whistles.
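
To give a feel for the plumbing involved, here is a rough hand-written sketch of the job an import handler takes off your hands: joining relational tables and flattening them into one search document per product. This is not the Data Import Handler itself (which is driven by configuration rather than code), and the schema, table names, and sample rows are all hypothetical. It uses an in-memory SQLite database so that it runs on its own.

```python
import json
import sqlite3


def build_documents(conn):
    """Flatten products, their category and their tags into one dict per product."""
    docs = {}
    rows = conn.execute(
        "SELECT p.id, p.name, c.name FROM products p "
        "JOIN categories c ON c.id = p.category_id ORDER BY p.id"
    )
    for product_id, name, category in rows:
        docs[product_id] = {
            "id": str(product_id),
            "name": name,
            "category": category,
            "tags": [],  # multi-valued field, filled from the child table below
        }
    tag_rows = conn.execute(
        "SELECT product_id, tag FROM product_tags ORDER BY product_id, tag"
    )
    for product_id, tag in tag_rows:
        docs[product_id]["tags"].append(tag)
    return list(docs.values())


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
CREATE TABLE product_tags (product_id INTEGER, tag TEXT);
INSERT INTO categories VALUES (1, 'Laptops');
INSERT INTO products VALUES (10, 'UltraBook 13', 1), (11, 'WorkStation 17', 1);
INSERT INTO product_tags VALUES (10, 'lightweight'), (10, 'ssd'), (11, 'gpu');
""")

documents = build_documents(conn)
payload = json.dumps(documents)  # the kind of flat document list an indexer would send
print(payload)
```

With plain Lucene you would also have to write the indexing side, scheduling, and incremental updates yourself.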

Andrew Clegg
  • I've used both (in ASP.NET); Solr is easy to set up and maintain. Using Lucene.NET will require a lot more effort. On the other hand, if you need something that Solr doesn't offer out of the box (if you don't know Java). – robasta Jun 17 '11 at 08:15
3

One thing to consider is how difficult it will be to set up your application when you mix these two environments (Java/.NET). If you use the Lucene.NET libraries, you can limit the external dependencies you need to install, which streamlines deployment.

Another thing to consider is whether you need the extras that Solr offers. A(nother) web admin interface is probably great, but it extends your risk envelope. Laying down Java and another service means more patch management. If you stick with .NET only, your patch strategy can be the standard Windows Update model.

Of course, rolling your own implementation using Lucene.NET will have development and maintenance costs of its own, but in my experience it has been straightforward and easy to work with.

Ira Miller
1

We are in exactly the same situation as you are. Unfortunately I was not directly involved in the evaluation process, but in the end we're going to use Solr integrated with Lucene.

The main advantage is the variety of formats, as dcruz described. You can query your Solr-Consumer and get back your search results as XML data, which can be easily parsed and displayed on the webpage.
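
As a small illustration of how little parsing that takes, here is a sketch that reads a Solr-style XML response with the Python standard library. The sample response is hand-written for this example (the documents and field names are made up), but it follows the usual shape of Solr's XML output: a `result` element with a `numFound` attribute, containing `doc` elements whose children carry a `name` attribute.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of a Solr XML response (hypothetical data).
SAMPLE_RESPONSE = """<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">3</int></lst>
  <result name="response" numFound="2" start="0">
    <doc><str name="id">10</str><str name="name">UltraBook 13</str></doc>
    <doc><str name="id">11</str><str name="name">WorkStation 17</str></doc>
  </result>
</response>"""


def parse_results(xml_text):
    """Return (total hit count, list of field dicts) from a Solr XML response."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    total = int(result.get("numFound"))
    # Each child of <doc> is one field; its name attribute is the field name.
    docs = [
        {field.get("name"): field.text for field in doc}
        for doc in result.findall("doc")
    ]
    return total, docs


total, docs = parse_results(SAMPLE_RESPONSE)
for doc in docs:
    print(doc["id"], doc["name"])
```

The same few lines translate directly to an XML parser on the .NET side.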

Juri
1

Let me shift your focus a bit: are you prepared to change the architecture of your product? Both Lucene and Solr are implemented in Java, so you will end up running yet another web container to host Solr (and hence lose platform purity, so to speak). While Lucene was ported to .NET (the Lucene.NET project), Solr was not, as far as I know. If you happen to use SQL Server (which is likely, considering your platform), you might consider SQL Server Full-Text Search instead. It offers much of the same functionality (though it is not as feature-rich as Lucene/Solr) and is usually much easier to incorporate into an existing application. Besides that, you benefit from simplified maintenance (it comes with your database) and from staying within a single platform.

AlexS
  • 6
    SQL Server FTS is *way* behind Lucene and Solr – Mauricio Scheffer Sep 17 '09 at 17:30
  • 2
    I was not saying that it is on par. But using SQL Server FTS will let you deliver the solution faster and more easily, and you will stay within the boundaries of the platform. A while ago we faced the same choice: either staying with SQL Server FTS or starting to use Solr. We ended up with Solr, and that's why I can compare both the features and the effort required to get them into your app. But everyone makes their own decision anyway. – AlexS Sep 18 '09 at 07:43
  • @Alex, did you use DataImportHandler to configure data importing into Solr from SQL Server? – Vikram Sep 21 '09 at 05:53
  • @Alex, thanks for your advice. We have implemented SQL FTS for a quick turnaround, so that we have something better than plain SQL queries. However, we are also working on Solr in parallel as a long-term solution. – Vikram Oct 05 '09 at 06:46