
This article offered me a huge amount of information:
Implement Lucene on Existing .NET / SQL Server stack with multiple webservers

I'd like to follow on from this by asking about implementing a Lucene Directory that would persist the indexes to the database (in my case SQL Server). If anyone has a SWAG on the effort involved, that would be helpful.
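For a rough sense of scope: a directory abstraction of this kind mostly has to list files, report their length, write them, read arbitrary byte ranges, and delete them. The following is only a minimal sketch of that storage layer, not Lucene's or Lucene.NET's actual `Directory` API. The class name `BlobDirectory`, the table `index_file`, and the chunk size are all hypothetical, SQLite stands in for SQL Server, and locking (which a real implementation would also need) is left out.

```python
import sqlite3

CHUNK_SIZE = 8192  # assumed chunk size; a real implementation would tune this


class BlobDirectory:
    """Hypothetical sketch: stores named index files as chunked blobs in SQL."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS index_file ("
            " name TEXT NOT NULL, chunk_no INTEGER NOT NULL, data BLOB NOT NULL,"
            " PRIMARY KEY (name, chunk_no))"
        )

    def list_all(self):
        rows = self.conn.execute(
            "SELECT DISTINCT name FROM index_file ORDER BY name")
        return [r[0] for r in rows]

    def write_file(self, name, payload):
        # Split into fixed-size chunks; an empty file still gets one empty chunk
        # so that it shows up in list_all().
        chunks = [payload[i:i + CHUNK_SIZE]
                  for i in range(0, len(payload), CHUNK_SIZE)] or [b""]
        # Replace the file in one transaction so readers never see half a file.
        with self.conn:
            self.conn.execute("DELETE FROM index_file WHERE name = ?", (name,))
            for chunk_no, chunk in enumerate(chunks):
                self.conn.execute(
                    "INSERT INTO index_file (name, chunk_no, data)"
                    " VALUES (?, ?, ?)", (name, chunk_no, chunk))

    def file_length(self, name):
        row = self.conn.execute(
            "SELECT COALESCE(SUM(LENGTH(data)), 0) FROM index_file"
            " WHERE name = ?", (name,)).fetchone()
        return row[0]

    def read(self, name, offset, length):
        """Random-access read: fetch only the chunks overlapping the range."""
        if length <= 0:
            return b""
        first = offset // CHUNK_SIZE
        last = (offset + length - 1) // CHUNK_SIZE
        rows = self.conn.execute(
            "SELECT data FROM index_file"
            " WHERE name = ? AND chunk_no BETWEEN ? AND ? ORDER BY chunk_no",
            (name, first, last))
        joined = b"".join(bytes(r[0]) for r in rows)
        skip = offset - first * CHUNK_SIZE
        return joined[skip:skip + length]

    def delete_file(self, name):
        with self.conn:
            self.conn.execute("DELETE FROM index_file WHERE name = ?", (name,))
```

Even in this reduced form, every random-access read becomes a SQL round trip, which is the cost to weigh when estimating whether the approach is worthwhile.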

I can see that the Java world has this (e.g. Compass), and I'm really hoping the Stack Overflow folks might have considered this too. Any feedback would be appreciated.

My rookie thinking is that persisting indexes to the DB would be a way to solve the 'distribution' problem. Instead of implementing messaging (not possible for my software because of deployment restrictions) or scheduling (acceptable, but product folks always get jumpy about deciding how 'current' indexed data has to be), each server node would rely on IndexReader reopen() to efficiently refresh its index snapshot.
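To make the idea concrete, here is a minimal sketch of the "check cheaply, refresh only on change" pattern that a reopen-style call relies on. It does not use Lucene at all: the `index_version` table, the `SnapshotHolder` class, and the `load_snapshot` callback are hypothetical names, and SQLite again stands in for SQL Server.

```python
import sqlite3


def init_schema(conn):
    # A single-row table holding a counter that the indexer increments.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS index_version ("
        " id INTEGER PRIMARY KEY CHECK (id = 1), version INTEGER NOT NULL)")
    conn.execute(
        "INSERT OR IGNORE INTO index_version (id, version) VALUES (1, 0)")
    conn.commit()


def bump_version(conn):
    """Called by the indexing node after it commits index changes."""
    with conn:
        conn.execute(
            "UPDATE index_version SET version = version + 1 WHERE id = 1")


class SnapshotHolder:
    """Hypothetical per-node holder that reloads only when the version moved."""

    def __init__(self, conn, load_snapshot):
        self.conn = conn
        self.load_snapshot = load_snapshot  # callable(version) -> snapshot
        self.version = None
        self.snapshot = None

    def current(self):
        (latest,) = self.conn.execute(
            "SELECT version FROM index_version WHERE id = 1").fetchone()
        if latest != self.version:
            # Only pay the reload cost when the index actually changed.
            self.snapshot = self.load_snapshot(latest)
            self.version = latest
        return self.snapshot
```

Each web node would call `current()` before searching. The version check is a single-row query, so the expensive step runs only after the indexer has published a change.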

Does this work if DB concurrency/load is not the heart of the problem being solved? Our use is focused on facilitating different kinds of data analysis on fields, which in turn facilitates different forms of matching.

Our deployment architecture and restrictions do not really allow us to insist on dedicated servers à la Solr, so we have ruled out that form of distribution.

user206830
  • This doesn't answer your question directly, but it seems that someone implemented a SQL Server directory in Java (using JDBC). Perhaps you can look at the source code to estimate how long it would take you to write. Or you could use Solr, as this article suggests: http://www.chrisumbel.com/article/lucene_solr_sql_server – agent-j May 31 '12 at 22:17
  • What would be the benefit of using Lucene.NET then? Why don't you simply use SQL Server FullText? – Simon Mourier Aug 08 '12 at 12:02

3 Answers


How many index changes do you expect? When do you want to read the index in (on application startup)? Putting the index into the database and "downloading" it on index creation might consume too many resources.

Not sure about your deployment restrictions, but can you have a shared file space for your machines (e.g. SMB/NFS share or similar, or even a SAN-based solution)?

Matthias Wuttke

I would be a bit afraid of performance issues with the indexes in the DB. Have a look at Elasticsearch. It is the successor to Compass. It requires Java, but it has a very neat REST interface that your .NET solution can call. Elasticsearch supports distribution and replication across several nodes, and you can run it on the webserver nodes.
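As a small illustration of that REST interface, the sketch below builds (but does not send) a search request against the `_search` endpoint with a `match` query. The host, the index name `products`, and the field `title` are made-up placeholders. A .NET application would send the same JSON with its own HTTP client.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, text):
    """Build a POST to <base_url>/<index>/_search with a simple match query."""
    body = json.dumps({"query": {"match": {field: text}}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical node address and index; nothing is sent until urlopen() is called.
request = build_search_request(
    "http://localhost:9200", "products", "title", "lucene")
```

Passing `request` to `urllib.request.urlopen` would execute the search against a running node and return the hits as JSON.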

j.hedin

This solution will kill the performance of the index, since it has to be retrieved from the DB. I would highly recommend moving to a newer/better alternative: Solr (using SolrNet, for example) or Elasticsearch (using NEST).

Solr is a high-level interface/manager for Lucene indexes, with simplified configuration and with clustering, replication, etc. solved for you. The nice thing is that if you have some experience with Lucene, this will not be such a big step.

Elasticsearch takes a different approach, but it is not hard to learn.

KinSlayerUY