
I am trying to design an information retrieval system for a film database. I want to search by title, so when I search for "Cobra Kai" my analyzer decomposes this string into "cobra kai", "cobra" and "kai" to get better matching. My problem is that I have to build a query like this: "cobra kai" OR "cobra" OR "kai", but it's not working for me. Here is the code:

ArrayList<String> busqueda_separada = muestraTexto(analyzer_titulo(), busquedaTitulo.getText());

query1 = new TermQuery(new Term("titulo", busqueda_separada.get(0)));
query2 = new TermQuery(new Term("titulo", busqueda_separada.get(1)));
query3 = new TermQuery(new Term("titulo", busqueda_separada.get(2)));

nested.add(query1, BooleanClause.Occur.SHOULD);
nested.add(query2, BooleanClause.Occur.SHOULD);
nested.add(query3, BooleanClause.Occur.SHOULD);

bqbuilder.add(nested, BooleanClause.Occur.MUST);

And this is my error: Error

I have tried different Boolean clauses, but the result stays the same.

leyva_7

1 Answer

From the error message we can see that you have defined nested as a variable of type BooleanQuery.

As the error messages say, the class BooleanQuery does not have a method add(Query, Occur). This means the following line will not compile:

nested.add(query1, BooleanClause.Occur.SHOULD);

The code should use a BooleanClause here rather than a BooleanQuery.

A BooleanQuery is made up of one or more clauses, each represented by a BooleanClause.

So, you can do the following:

BooleanQuery.Builder bqBuilder = new BooleanQuery.Builder();

Query query1 = new TermQuery(new Term("titulo", "cobra kai"));
Query query2 = new TermQuery(new Term("titulo", "cobra"));
Query query3 = new TermQuery(new Term("titulo", "kai"));

BooleanClause nested1 = new BooleanClause(query1, BooleanClause.Occur.SHOULD);
BooleanClause nested2 = new BooleanClause(query2, BooleanClause.Occur.SHOULD);
BooleanClause nested3 = new BooleanClause(query3, BooleanClause.Occur.SHOULD);

bqBuilder.add(nested1);
bqBuilder.add(nested2);
bqBuilder.add(nested3);

BooleanQuery bq = bqBuilder.build();

That builds a boolean query containing 3 clauses:

Find titles containing "cobra kai" OR "cobra" OR "kai".

I am not sure what this is for:

bqbuilder.add(nested, BooleanClause.Occur.MUST);

The BooleanClause.Occur.MUST does not appear to be needed, so I have dropped it from my code.


You can simplify the above code by using a loop.

Assuming you already have a list containing your search terms (your busqueda_separada list):

List<String> terms = Arrays.asList("cobra kai", "cobra", "kai");

You can use that list as follows:

for (String term : terms) {
    Query query = new TermQuery(new Term("titulo", term));
    BooleanClause nested = new BooleanClause(query, BooleanClause.Occur.SHOULD);
    bqBuilder.add(nested);
}
BooleanQuery bq2 = bqBuilder.build();

Update

One point I forgot to mention:

In your data, you have a search phrase: cobra kai. It's possible that you do not need to search for this, depending on how your data was indexed, and how you expect your search to work.

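But assuming you do need it, the phrase has to be handled by Lucene as a phrase rather than as one term. A TermQuery does not parse its text, so double quotes placed inside the term string would be matched literally; quotes are only interpreted by the classic query parser. Assuming the titulo field was tokenized at index time, build the phrase clause with a PhraseQuery, using the terms as they were indexed, and keep TermQuery for the single words:

Query phrase = new PhraseQuery("titulo", "cobra", "kai");

Added to the builder with Occur.SHOULD alongside the term queries for cobra and kai, this gives the search:

titulo:"cobra kai" titulo:cobra titulo:kai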

And, by default, there is an implied "OR" in between each clause in the search.
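
To see why "depending on how your data was indexed" matters, here is a toy positional index in plain Java. It is not Lucene: ToyTitleIndex, its lowercase-and-split "analyzer", and both lookup methods are assumptions made up for illustration. It only sketches the idea that a term lookup compares the text against single indexed tokens, while a phrase lookup checks that the words sit at consecutive positions (in Lucene, that second kind of lookup is what PhraseQuery is for).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy positional index, NOT Lucene: it only illustrates term vs. phrase lookup.
class ToyTitleIndex {
    // token -> positions at which it occurs in the single indexed title
    private final Map<String, List<Integer>> postings = new HashMap<>();

    ToyTitleIndex(String title) {
        // Hypothetical analyzer: lowercase, then split on whitespace.
        String[] tokens = title.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            postings.computeIfAbsent(tokens[i], k -> new ArrayList<>()).add(i);
        }
    }

    // Term lookup: the text must equal one indexed token exactly.
    boolean matchesTerm(String term) {
        return postings.containsKey(term);
    }

    // Phrase lookup: every word must occur at consecutive positions.
    boolean matchesPhrase(String... words) {
        if (words.length == 0) {
            return false;
        }
        for (int start : postings.getOrDefault(words[0], List.of())) {
            boolean allFollow = true;
            for (int i = 1; i < words.length && allFollow; i++) {
                allFollow = postings.getOrDefault(words[i], List.of()).contains(start + i);
            }
            if (allFollow) {
                return true;
            }
        }
        return false;
    }
}
```

With new ToyTitleIndex("Cobra Kai"), matchesTerm("cobra") and matchesPhrase("cobra", "kai") both succeed, while matchesTerm("cobra kai") fails because no single token equals that text. If the title had been indexed as one untokenized value instead, the situation would be reversed, which is why it is worth checking how the titulo field is indexed.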


Your "extra" question:

query should be like (titulo=“cobra kai” OR titulo=“cobra” OR titulo=“kai”) AND anio=“2018”

This is really a completely new question, and existing answers already show approaches to it.

But one more approach (if I have understood correctly) is to nest 2 queries inside another boolean query and use Occur.MUST in that outer query for each clause.

So, you already have your first boolean query.

Now create another one. Actually if you only have one term, you don't even need a boolean query - just a term query:

Query query2 = new TermQuery(new Term("year", "2018"));

Now place these two queries into a brand new boolean query (this new query contains the first two queries; bq1 below is the first boolean query you already built):

BooleanQuery.Builder bqBuilder = new BooleanQuery.Builder();
bqBuilder.add(bq1, BooleanClause.Occur.MUST);
bqBuilder.add(query2, BooleanClause.Occur.MUST);
BooleanQuery bq = bqBuilder.build();

The above is equivalent to the following Lucene classic query:

+(body:"cobra kai" body:cobra body:kai) +year:2018

And that, in turn, is equivalent to:

(body:"cobra kai" OR body:cobra OR body:kai) AND year:2018

Note that the first form uses the plus (+) operator, which marks a clause as required.

So the results MUST contain matches for both clauses - the clause for my body field and the clause for my year field.


This can all get quite confusing if you think about Lucene boolean operators in the same way that you think about Boolean algebra. But they are not the same and serve different purposes. Lucene is not (only) about including and excluding records, but about scoring those records for relevance.
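
A toy model can make that difference concrete. The sketch below is plain Java, not Lucene: ToyBooleanScorer, its Clause class, and the one-point-per-matching-clause score are all assumptions for illustration (Lucene's real scores come from its similarity implementation). It mirrors only the default matching rules: a missing MUST clause excludes a document, SHOULD clauses are optional when a MUST clause is present, and with no MUST clause at least one SHOULD clause has to match.

```java
import java.util.List;
import java.util.Set;

// Toy model, NOT Lucene's real scoring: one point per matching clause.
class ToyBooleanScorer {
    enum Occur { MUST, SHOULD }

    // A clause pairs a word with the way it participates in the query.
    static final class Clause {
        final String word;
        final Occur occur;

        Clause(String word, Occur occur) {
            this.word = word;
            this.occur = occur;
        }
    }

    // Returns -1 when the document is excluded, otherwise its toy score.
    static int score(Set<String> docWords, List<Clause> clauses) {
        int score = 0;
        boolean hasMust = false;
        int shouldHits = 0;
        for (Clause c : clauses) {
            boolean hit = docWords.contains(c.word);
            if (c.occur == Occur.MUST) {
                hasMust = true;
                if (!hit) {
                    return -1; // a missing MUST clause excludes the document
                }
            } else if (hit) {
                shouldHits++;
            }
            if (hit) {
                score++; // every matching clause adds to relevance
            }
        }
        // With no MUST clause, at least one SHOULD clause has to match.
        if (!hasMust && shouldHits == 0) {
            return -1;
        }
        return score;
    }
}
```

For a query with SHOULD clauses on "cobra" and "kai", a document containing both words scores higher than one containing only "cobra", and a document containing neither is excluded. In plain Boolean algebra the first two documents would simply both be "true"; the ranking between them is the part that the OR reading hides.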

andrewJames
  • Thank you very much for your answer. I still have a problem that I forgot to mention: how can I combine this query with other attributes that must appear? For example, “titulo: cobra kai” is solved, and let’s say I have another field that holds the film year (“anio”). So the query would be “titulo: cobra kai AND anio:2018”. How can I do this by combining the Boolean queries? Thank you so much. – leyva_7 Nov 26 '22 at 19:00
  • The query should be like (titulo=“cobra kai” OR titulo=“cobra” OR titulo=“kai”) AND anio=“2018” – leyva_7 Nov 26 '22 at 19:06
  • That is really a completely new question - but I added some notes to the answer. Feel free to ask a new question if these notes do not help - and if your research into existing questions also does not help. – andrewJames Nov 26 '22 at 19:46