
I have used the Lucene library as the indexing and keyword-extraction tool to index my documents. However, I don't want to use the search queries that Lucene provides, because I want to develop my own separate search algorithm/program.

Is there any way to extract only the keyword-document ID pairs in a readable form without using Luke (for example, as a text file that can later be read by my own program)?

OR

Is there any way to encrypt the Lucene index while keeping it searchable, using searchable encryption? If so, how would it work?

OR

Is there any other indexing library that can output the keyword-document ID pairs in a program-readable form?

andrewJames
supdev
  • I can think of two options which may be of interest: (1) Using an analyzer to [generate index tokens](https://stackoverflow.com/questions/59723144/using-lucene-analyzer-without-indexing-is-my-approach-reasonable), without actually building the index. (2) Using the Lucene [SimpleTextCodec](https://stackoverflow.com/a/63100965/12567365) which builds a human-readable index structure. – andrewJames Jun 14 '21 at 13:16
  • Regarding encryption: I am not familiar with any options which may exist for encryption of indexed data. As a general rule when using Stack Overflow, please try to ask only one question per question. Don't forget to take the [tour] and read [ask]. Welcome! – andrewJames Jun 14 '21 at 13:18
  • Hi @andrewjames, thank you very much for the answers. My bad for asking such a long question. Thank you for the suggestions anyway. They have given me some insight into how the keywords are extracted. Thanks again for sharing! – supdev Jun 30 '21 at 17:04
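
For illustration only, here is a minimal sketch of the kind of keyword-to-document-ID text file the question describes, and of how a separate program might read it back. It deliberately does not use Lucene: the `PostingsDump` class, its naive tokenizer, and the `term<TAB>id,id,...` line format are all assumptions made up for this example, not Lucene's on-disk format or the output of `SimpleTextCodec`. In a real setup the terms and document IDs would presumably come from the Lucene index or analyzer mentioned in the comments above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: builds and serializes keyword -> doc ID postings.
class PostingsDump {

    // Build a sorted term -> sorted doc IDs map; a doc ID is the document's position in the list.
    static Map<String, SortedSet<Integer>> build(List<String> docs) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            // Naive tokenizer (assumption): lowercase, split on anything that is not a letter or digit.
            String[] tokens = docs.get(docId).toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
            for (String token : tokens) {
                if (!token.isEmpty()) {
                    index.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
                }
            }
        }
        return index;
    }

    // Serialize as one line per keyword: the term, a tab, then comma-separated doc IDs.
    static List<String> toLines(Map<String, SortedSet<Integer>> index) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, SortedSet<Integer>> entry : index.entrySet()) {
            StringBuilder sb = new StringBuilder(entry.getKey()).append('\t');
            String sep = "";
            for (int id : entry.getValue()) {
                sb.append(sep).append(id);
                sep = ",";
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    // Parse the same line format back into a term -> doc IDs map.
    static Map<String, SortedSet<Integer>> parse(List<String> lines) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            SortedSet<Integer> ids = new TreeSet<>();
            for (String id : parts[1].split(",")) {
                ids.add(Integer.parseInt(id));
            }
            index.put(parts[0], ids);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("Lucene builds an index", "An index maps terms to documents");
        // Print the keyword -> doc ID lines; these could equally be written to a text file.
        toLines(build(docs)).forEach(System.out::println);
    }
}
```

A tab-separated line per term keeps the file trivial to read from any language, at the cost of dropping positions and frequencies; whether that is enough depends on what the custom search algorithm needs.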

0 Answers