
I'm using Grails 3.1.5 and need to use Solr to extract keywords from users' uploaded documents to check whether they are valid and, if possible, to check the keyword's position in the picture. I couldn't find a working Solr plugin for Grails 3+, so I think I will have to do it manually with Groovy/Java. Is there a way to do what I'm asking for? Solr wasn't my choice but the client's. Is there another way to do this? (I would still prefer to use Solr.)
Thank you

Burt Beckwith
hereForLearing
  • You need to add the Solr dependency to your project. Search for it on http://search.maven.org. Then, add the dependency to your build.gradle. https://docs.gradle.org/current/userguide/artifact_dependencies_tutorial.html#N105EC – Emmanuel Rosa May 09 '16 at 22:39
  • Thank you for your answer, but is what I'm asking for doable with Solr and Grails? – hereForLearing May 12 '16 at 13:39
  • Do you know what Solr does? – Emmanuel Rosa May 12 '16 at 13:57
  • I think it's mainly used for making searches faster by pre-indexing the possible results, but I think it can be used for reading picture content as well. Am I wrong? – hereForLearing May 12 '16 at 14:02
  • Yes, Solr is a search engine, and simply put it works like this: When content is added to Solr, it hands the content (document, image, etc) over to Apache Tika. Tika then chooses an appropriate parser, extracts fields from the content, and passes them back to Solr in a format it understands. Solr then uses the extracted fields (author, create date, etc) and adds them to its index. When Tika is fed an image, it parses its **metadata**. It will not do anything with the raster image itself. See http://tika.apache.org/1.12/formats.html#Image_formats – Emmanuel Rosa May 12 '16 at 14:20
  • Oh, so it can't do what I want? I mean searching for keywords in the picture content to verify whether it is valid. What can I use with Grails to do that? – hereForLearing May 12 '16 at 14:22
  • I'm not clear on what you want, but if it's to read the image (photo) data itself, then no, Solr cannot do that. It can only read metadata: fields added to the image such as where it was taken, who took the photo, etc. To grab text from images, you can use Tess4J. WARNING: OCR is a real pain to get working right, if it works at all. – Emmanuel Rosa May 12 '16 at 14:28
  • OK, thank you sir. Is there any plugin for that? Do you have a better idea other than using OCR? I need to search for keywords in the picture or check the picture's form (check whether the document's form resembles a sample that I have). – hereForLearing May 12 '16 at 14:34
  • I've also found this: http://blog.thedigitalgroup.com/vijaym/using-solr-and-tikaocr-to-search-text-inside-an-image/ What do you think? – hereForLearing May 12 '16 at 14:38
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/111779/discussion-between-saywow-and-emmanuel-rosa). – hereForLearing May 12 '16 at 14:38
  • Have a read here http://stackoverflow.com/questions/21773189/using-solr-to-calculate-similarity-bitcount-between-two-ulongs that points to algorithms that may be useful. – cheffe May 13 '16 at 06:25
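
Once text has been pulled out of an upload by some OCR step (Tess4J is the library suggested in the comments), the keyword check itself could be sketched roughly as below. This is only an illustration under assumptions: the class `KeywordCheck` and its methods `findKeywords` and `isValid` are hypothetical names, the OCR output is assumed to arrive as a plain `String`, and the positions reported are character offsets in that text, not pixel coordinates in the picture. It uses plain Java only, so it can be called from Groovy in a Grails 3 service.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: checks already-extracted text for required keywords.
class KeywordCheck {

    // For each keyword, collect the character offsets at which it occurs
    // in the text (case-insensitive). A keyword that is absent maps to an
    // empty list.
    static Map<String, List<Integer>> findKeywords(String text, List<String> keywords) {
        String haystack = text.toLowerCase(Locale.ROOT);
        Map<String, List<Integer>> hits = new LinkedHashMap<>();
        for (String keyword : keywords) {
            String needle = keyword.toLowerCase(Locale.ROOT);
            List<Integer> offsets = new ArrayList<>();
            if (!needle.isEmpty()) {
                int from = haystack.indexOf(needle);
                while (from >= 0) {
                    offsets.add(from);
                    from = haystack.indexOf(needle, from + 1);
                }
            }
            hits.put(keyword, offsets);
        }
        return hits;
    }

    // Assumption: a document counts as "valid" only if every required
    // keyword occurs at least once.
    static boolean isValid(String text, List<String> requiredKeywords) {
        for (List<Integer> offsets : findKeywords(text, requiredKeywords).values()) {
            if (offsets.isEmpty()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for OCR output.
        String extracted = "Invoice No 42 INVOICE";
        List<String> required = Arrays.asList("invoice", "total");
        System.out.println(findKeywords(extracted, required));
        System.out.println(isValid(extracted, required));
    }
}
```

Plain substring matching is used here to keep the sketch short. OCR output is often noisy, so a real check would probably need some tolerance for misread characters.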
