I have been asked to either deploy or develop an enterprise (intranet) search engine that can index all web pages on a couple of internal servers and provide a search portal that displays all related content, like what Google does but for an intranet.

Any advice on how to develop or deploy this quickly? I have heard of Microsoft's FAST product, but I am not sure whether it is meant for this purpose.

thanks in advance, George

Jim W
George2
  • The intranet part of it is called Enterprise Search for Business Productivity: http://www.microsoft.com/enterprisesearch/en/us/business-productivity.aspx – Dan Gøran Lunde Aug 15 '09 at 18:16

5 Answers

Depending on the level of polish you need, the Nutch project would be an almost turn-key solution for you. http://lucene.apache.org/nutch/

Kevin Peterson
  • What do you mean by "level of polish you need"? – George2 Jul 27 '09 at 07:06
  • You'll probably need to write your own front end. I'm guessing, but judging from related tools (Solr), the interface is probably going to look like something only an engineer could use. – Kevin Peterson Jul 27 '09 at 07:35
  • Thanks pb! This is just what I want. If I need to customize the ranking or some other part of the relevance matching, does Nutch provide any APIs? Is it easy to extend? My requirement is to develop language- and industry-specific search, so I need special keyword extraction, ranking, etc. Any advice? – George2 Jul 27 '09 at 08:08

The Google Search Appliance is a hardware solution that you might be interested in checking out.

A software-based approach could be the Lucene search engine.

lomaxx
  • Cool, and do both of them have built-in relevance and ranking algorithms? – George2 Jul 27 '09 at 07:07
  • I don't think Lucene is that sophisticated. It's just a very good keyword searcher. (Not knocking it, I've used it on more than one project.) – Rex M Jul 31 '09 at 03:09

A free Microsoft solution is Microsoft Search Server Express. It works similarly to the search in SharePoint.

Paul van Brenk
  • It looks like Search Server Express can only crawl content from SharePoint and has to run on top of SharePoint? – George2 Jul 27 '09 at 07:01
  • It can index content on file servers, Web sites, Windows SharePoint Services, Microsoft Office SharePoint Server, Exchange Server public folders, and Lotus Notes repositories, and it is a standalone install. – Paul van Brenk Jul 27 '09 at 07:12
  • Thanks pb! This is just what I want. If I need to customize the ranking or some other part of the relevance matching, are there any APIs? – George2 Jul 27 '09 at 07:25
  • Don't know. Only used the OOB functionality. – Paul van Brenk Jul 27 '09 at 17:12

George,

It sounds like you're in a big hurry.

You'd better start setting expectations about re-work, re-work, re-work.

I highly recommend that you spend time now on the following:

  • establish the requirements, possibly as basic, middle and blue-sky

  • determine which search engines, front ends, crawlers, etc. (either open-source or vendor-provided) can really meet your requirements

  • determine the available support for those tools, and the likelihood of getting timely and workable answers or work-arounds (open source, at least, doesn't come with a support contract)

  • don't try to do it all at once. Do the smallest data set first, regardless of how far up in management your sponsor is. That way you won't have spent months testing only to discover a fatal large-scale flaw in the system or in your plan

  • communicate with your team and sponsors by creating a roadmap to your various levels of requirements, with check-points

  • As far as pre-planning for even a small-to-medium corporate search project goes, I highly recommend Martin White's 'Making Search Work'.

http://www.amazon.com/Making-Search-Work-Implementing-Enterprise/dp/1573873055/ref=sr_1_1?ie=UTF8&qid=1249009370&sr=8-1

I think you'll find that ranking and relevance are among the iffiest parts of getting a good search solution delivered. Engines probably provide similar functionality, but the details of how to do it will differ and, more importantly, the success you have with forcing relevance will only partly be a function of the search engine you pick. Put another way, if your text is not in harmony with the search engine's algorithm, you'll spend a lot of time trying to understand the various tuning parameters and their combinatorics. (I'm only familiar with two engines so far, so others are welcome to contradict this.)
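
To make the point about tuning parameters a bit more concrete, here is a minimal sketch in Python. It is a toy TF-IDF keyword scorer, not how FAST, Lucene, Nutch, or any other engine actually ranks. The names `tokenize`, `search`, and `title_boost`, and the sample `DOCS`, are all made up for illustration. It only shows how one knob (a weight on title matches) can flip the order of results depending on what your text looks like.

```python
import math
import re
from collections import Counter

# Hypothetical sample documents, each with a title field and a body field.
DOCS = {
    "hr-policy": {
        "title": "Vacation policy",
        "body": "How to request vacation days and holiday leave.",
    },
    "lunch-menu": {
        "title": "Cafeteria menu",
        "body": "Vacation vacation vacation specials: the vacation themed lunch week.",
    },
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, docs, title_boost=1.0):
    """Return doc ids ordered by a toy TF-IDF score.

    title_boost is an assumed tuning parameter: each title match counts
    title_boost times as much as a body match.
    """
    n = len(docs)
    # Per-document, per-field term counts.
    counts = {
        doc_id: {field: Counter(tokenize(text)) for field, text in fields.items()}
        for doc_id, fields in docs.items()
    }
    scores = {}
    for term in tokenize(query):
        # Document frequency: number of docs containing the term in any field.
        df = sum(
            1 for fields in counts.values()
            if any(term in field_counts for field_counts in fields.values())
        )
        if df == 0:
            continue
        idf = math.log(1 + n / df)
        for doc_id, fields in counts.items():
            tf = title_boost * fields["title"][term] + fields["body"][term]
            if tf:
                scores[doc_id] = scores.get(doc_id, 0.0) + tf * idf
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


print(search("vacation", DOCS, title_boost=1.0))
print(search("vacation", DOCS, title_boost=5.0))
```

With the boost at 1.0 the keyword-stuffed menu page outranks the policy page, and raising the boost reverses that. A real engine has many such parameters interacting at once, which is where the time goes.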

It's a great learning experience. Good luck.

user141107

FAST is a great enterprise search product. It usually ranks at the top of the consulting firms' evaluations. It does require a moderate amount of technical setup and support, though.

Google is another solid product, but it is very expensive. It requires less technical support, but it also gives you less control over the search results.

DMurph11