Using the carrot2 workbench without specifying a query

Question

I would like to use the Workbench to run some tests, but I could not work out how to run it without specifying a query. I want to cluster a set of documents in the Workbench without having to specify a query. Is this possible?

Thanks

score 0 · Answer 1 · edited May 23 '17 at 10:33


The two simplest options are the following:

Convert your data to Carrot2 XML format and use the XML document source in Workbench, where the query is optional.
Create a Lucene index out of your data and use the Lucene document source. In this case the query is mandatory, but you can use the *:* catch-all query to cluster all documents from the index. This question has some hints about converting different types of document into the required Lucene index.
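
For the first option, here is a minimal sketch of converting in-memory documents to an XML file for the XML document source. The element names (`searchresult`, `query`, `document`, `title`, `url`, `snippet`) are written from my recollection of the Carrot2 3.x input format and should be checked against the Carrot2 manual; the function name `build_carrot2_xml` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_carrot2_xml(docs, query=None):
    """Build a <searchresult> tree from an iterable of dicts.

    Each dict may carry "title", "snippet" and "url" keys.
    The element names are assumed to match the Carrot2 3.x XML format.
    """
    root = ET.Element("searchresult")
    # The query element is only written when a query is given,
    # since the XML document source treats it as optional.
    if query:
        ET.SubElement(root, "query").text = query
    for i, doc in enumerate(docs):
        node = ET.SubElement(root, "document", id=str(i))
        ET.SubElement(node, "title").text = doc.get("title", "")
        if doc.get("url"):
            ET.SubElement(node, "url").text = doc["url"]
        ET.SubElement(node, "snippet").text = doc.get("snippet", "")
    return ET.ElementTree(root)


if __name__ == "__main__":
    # Hypothetical sample data; replace with your own documents.
    sample = [
        {"title": "First document", "snippet": "Text of the first document."},
        {"title": "Second document", "snippet": "Text of the second document."},
    ]
    tree = build_carrot2_xml(sample)
    print(ET.tostring(tree.getroot(), encoding="unicode"))
```

To produce a file for the Workbench, `tree.write("documents.xml", encoding="utf-8", xml_declaration=True)` can replace the `print` call; the file name is only an example.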

edited May 23 '17 at 10:33 by Community

answered Dec 12 '14 at 21:14 by Stanislaw Osinski

I tried using the query `*:*`, but the following exception occurred: `INFO An object of a non-@ThreadSafe class org.apache.lucene.store.NIOFSDirectory bound at initialization-time to attribute LuceneDocumentSource.directory. Make sure this is intended. Exception in thread "main" java.lang.NoSuchMethodError: org.apache.lucene.index.FieldInfo.<init>(Ljava/lang/String;ZIZZZLorg/apache/lucene/index/FieldInfo$IndexOptions;Lorg/apache/lucene/index/FieldInfo$DocValuesType;Lorg/apache/lucene/index/FieldInfo$DocValuesType;Ljava/util/Map;)V` I'm using Carrot2 version 3.9.3. – msoares Dec 16 '14 at 12:26
No. This happens when using the Java API examples. Using the Workbench, a different error occurs: `Processing error: ... Format version is not supported (resource: NIOFSIndexInput(path=".../lucene-index-path/segments.gen")): -3 (needs to be between -2 and -2)` I am stuck in either case. – msoares Dec 17 '14 at 10:51

This is probably because you replaced Lucene JAR with a newer version. If you'd like to upgrade Lucene, you'll need to recompile Carrot2 from sources: http://doc.carrot2.org/#section.advanced-topics.building-from-source-code. – Stanislaw Osinski Dec 19 '14 at 08:47
