
We are moving from ColdFusion 9 with Verity to ColdFusion 10 with Solr in an application that searches PDF files that have metadata attached to them. In testing, the two engines return very different results. We eventually found that Solr searches the file contents very differently from Verity. Is there something we can "tweak" to make Solr searching more efficient, or to get the two engines to search the same way? We sometimes get very different results even when searching on a single word, not just multiple words.

Edit: I found that most of the PDFs I was using were really old; after doing a batch save in Acrobat to resave them as another version, I got much better results. Also, the default operator in Verity was AND while in Solr it is OR, so I changed that in the jetty config file under my collection's folder. This all helped, but I'm still getting little differences here and there. I hope this helps anybody else who sees this post. I'm not sure what else I can do to really "tune" it.

  • I've been disappointed with the way CF's default implementation of Solr handles PDFs too. It may be possible to improve it by editing the schema/config xml files, but I ended up using CFPDF to extract the text and metadata from the files and store them in the database. I then used Solr to index that data, with better results. – CfSimplicity Nov 24 '13 at 09:39
  • Thanks for the info! The last issue I'm having is that Verity will search for "Coldfusion Solr" and look for those two words in a document. Solr by default will look for any documents that have those two words in them, but they don't have to be next to each other. – user2305795 Nov 25 '13 at 16:16
  • I'm not sure how `` is utilizing the underlying Solr engine. You might try using `%22` (the URL-encoded value) in place of a double quote `"`. – David Faber Jan 05 '15 at 22:27
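
To see how Solr itself treats the two points raised above (phrase quoting and the AND/OR default operator) independently of ColdFusion, one option is to query the collection's select handler directly. The sketch below is in Python rather than CFML and only builds the URL. The host, port, and collection name are hypothetical placeholders, and it assumes the collection exposes Solr's standard `/select` handler with the `q` and `q.op` parameters.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust host, port, and collection name to your install.
SOLR_SELECT_URL = "http://localhost:8983/solr/my_collection/select"


def build_query_url(terms, phrase=False, default_operator="AND"):
    """Return a Solr select URL for the given search terms.

    phrase=True wraps the terms in double quotes so Solr treats them as
    one phrase; urlencode turns each double quote into %22.
    default_operator is sent as q.op, which overrides the collection's
    configured default operator for this request only.
    """
    q = f'"{terms}"' if phrase else terms
    params = {"q": q, "q.op": default_operator}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Phrase search: the two words must appear next to each other.
    print(build_query_url("Coldfusion Solr", phrase=True))
    # Unquoted search with OR: either word is enough to match.
    print(build_query_url("Coldfusion Solr", default_operator="OR"))
```

Opening the two URLs in a browser and comparing the hit counts should show whether a remaining difference comes from Solr's query handling or from how the PDFs were indexed.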
