I am testing Apache Lucene for text-based search in our project. Unfortunately, I am running into a missing-class error at startup. I tried adding the lucene-queries dependency, but that didn't help. What am I doing wrong?

Error log:

Caused by: java.lang.NoClassDefFoundError: org/apache/lucene/search/similarities/DefaultSimilarity
    at org.hibernate.search.spi.SearchIntegratorBuilder.createCleanFactoryState(SearchIntegratorBuilder.java:287)
    at org.hibernate.search.spi.SearchIntegratorBuilder.buildNewSearchFactory(SearchIntegratorBuilder.java:186)
    at org.hibernate.search.spi.SearchIntegratorBuilder.buildSearchIntegrator(SearchIntegratorBuilder.java:117)
    at org.hibernate.search.hcore.impl.HibernateSearchSessionFactoryObserver.sessionFactoryCreated(HibernateSearchSessionFactoryObserver.java:66)
    at org.hibernate.internal.SessionFactoryObserverChain.sessionFactoryCreated(SessionFactoryObserverChain.java:52)
    at org.hibernate.internal.SessionFactoryImpl.<init>(SessionFactoryImpl.java:588)
    at org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1859)
    at org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1930)
    at org.springframework.orm.hibernate4.LocalSessionFactoryBuilder.buildSessionFactory(LocalSessionFactoryBuilder.java:372)
    at org.springframework.orm.hibernate4.LocalSessionFactoryBean.buildSessionFactory(LocalSessionFactoryBean.java:454)
    at org.springframework.orm.hibernate4.LocalSessionFactoryBean.afterPropertiesSet(LocalSessionFactoryBean.java:439)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1633)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1570)
    ... 98 more
Caused by: java.lang.ClassNotFoundException: org.apache.lucene.search.similarities.DefaultSimilarity
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1858)
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1701)
    ... 111 more

pom.xml:

  <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-core -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
            <version>6.4.2</version>
        </dependency>

The code I am trying:

    @Override
    public void saveIndexes() {
        // Index every file in the text directory with Apache Lucene
        Path indexPath = Paths.get("/home/akshay/index/");
        IndexWriterConfig config = new IndexWriterConfig(new SimpleAnalyzer());
        // try-with-resources closes the writer and the directory even if indexing fails
        try (Directory directory = FSDirectory.open(indexPath);
             IndexWriter indexWriter = new IndexWriter(directory, config)) {
            indexWriter.deleteAll(); // start from an empty index
            File[] files = new File("/home/akshay/textfiles/").listFiles();
            if (files == null) {
                return; // the path is not a readable directory
            }
            for (File file : files) {
                if (!file.isFile()) {
                    continue; // skip subdirectories
                }
                org.apache.lucene.document.Document doc = new org.apache.lucene.document.Document();
                doc.add(new TextField("path", file.getName(), Field.Store.YES));
                // Read the whole file into memory, line by line
                StringBuilder contents = new StringBuilder();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(new FileInputStream(file)))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        contents.append(line).append("\n");
                    }
                }
                doc.add(new TextField("contents", contents.toString(), Field.Store.YES));
                indexWriter.addDocument(doc);
                System.out.println("indexed " + file.getCanonicalPath());
            }
        } catch (Exception e) {
            // TODO: handle exception
            e.printStackTrace();
        }
    }

    @Override
    public void searchLucene(String text) {
        // Search the indexed "contents" field with Apache Lucene
        Path indexPath = Paths.get("/home/akshay/index/");
        // try-with-resources closes the reader and the directory after the search
        try (Directory directory = FSDirectory.open(indexPath);
             IndexReader indexReader = DirectoryReader.open(directory)) {
            IndexSearcher indexSearcher = new IndexSearcher(indexReader);
            // Note: the index is built with SimpleAnalyzer, which tokenizes differently
            QueryParser queryParser = new QueryParser("contents", new StandardAnalyzer());
            Query query = queryParser.parse(text);
            TopDocs topDocs = indexSearcher.search(query, 10);
            System.out.println("totalHits " + topDocs.totalHits);
            for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
                org.apache.lucene.document.Document document = indexSearcher.doc(scoreDoc.doc);
                System.out.println("path " + document.get("path"));
                System.out.println("content " + document.get("contents"));
            }
        } catch (Exception e) {
            // TODO: handle exception
            e.printStackTrace();
        }
    }

Any ideas? Thank you. :-)

We are Borg
  • What are you running when you get this error? Tomcat? Are you sure that lucene-core is on the classpath? – Mysterion Mar 22 '17 at 14:39

1 Answer

The class DefaultSimilarity was already deprecated in 5.4.1, and it no longer exists in 6.4.2. Its deprecation message reads:

Use ClassicSimilarity for equivilent behavior, or consider switching to BM25Similarity which will become the new default in Lucene 6.0

See also:

  • LUCENE-6789: IndexSearcher's default Similarity is changed to BM25Similarity. Use ClassicSimilarity to get the old vector space DefaultSimilarity. (Robert Muir)

Either downgrade your lucene-core dependency to 5.5.4, or use ClassicSimilarity or BM25Similarity in your code instead.
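
In the stack trace above, the missing class is requested by Hibernate Search (SearchIntegratorBuilder), so it can help to confirm which of these classes the running application can actually see, and from which jar. Below is a minimal sketch using only the JDK; `ClasspathProbe`, `isPresent` and `locationOf` are hypothetical helper names for illustration and are not part of Lucene or Hibernate Search.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

/** Hypothetical diagnostic helper; not part of Lucene or Hibernate Search. */
final class ClasspathProbe {

    private ClasspathProbe() {}

    /** Returns true if the named class is visible to this class's class loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory the class was loaded from, if that is known. */
    static Optional<URL> locationOf(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // The code source can be null, for example for core JDK classes
            return source == null ? Optional.empty() : Optional.ofNullable(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "org.apache.lucene.search.similarities.DefaultSimilarity", // absent in 6.4.2
            "org.apache.lucene.search.similarities.ClassicSimilarity",
            "org.apache.lucene.search.similarities.BM25Similarity"
        };
        for (String name : candidates) {
            System.out.println(name + " present=" + isPresent(name)
                    + " from=" + locationOf(name).map(URL::toString).orElse("n/a"));
        }
    }
}
```

Run from inside the web application (for example from a startup listener), the reported jar name should show which lucene-core version actually ended up on the classpath after the dependency change.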

Adonis
  • Thank you for your answer. When I switch to 5.4.1, I get Caused by: java.lang.NoSuchFieldError: LUCENE_3_1. Do you need the complete error log? I have changed nothing in the code. Thank you. – We are Borg Mar 23 '17 at 08:00
  • The stacktrace is likely to be helpful – Adonis Mar 23 '17 at 08:01
  • I had a Hibernate-orm library that was conflicting with Lucene. I found that out with mvn dependency:tree and removed it, as it was no longer used in the project. Can you tell me how I can instruct a library to use the jar downloaded by another dependency? Thanks. – We are Borg Mar 23 '17 at 08:22
  • By default it should be the case. The only case I could see it not happening is if you marked this dependency as optional. See http://stackoverflow.com/a/2916003/4121573 – Adonis Mar 23 '17 at 08:26
  • Did it help? Otherwise I would need a closer look at your pom to get a better understanding. – Adonis Mar 23 '17 at 08:39
  • Just one last question: for Lucene, once the files are indexed, do they need to be present during search? I actually just need the file names, as they are associated in the DB. Thanks. – We are Borg Mar 23 '17 at 09:01
  • @WeareBorg To be fair I don't have much experience directly with Lucene, but given that in other APIs (such as Elasticsearch) you can specify the fields not to be analyzed (so basically making them "unsearchable"), I'm pretty sure there should be a way to specify that to Lucene. Sorry I can't be of much help there – Adonis Mar 23 '17 at 09:26