I would like to use the Workbench to run some tests, but I could not work out how to run it without specifying a query. I want to cluster a set of documents in the Workbench without having to provide a query. Is that possible?

Thanks

msoares

1 Answer

The two simplest options are the following:

  1. Convert your data to Carrot2 XML format and use the XML document source in Workbench, where the query is optional.

  2. Create a Lucene index out of your data and use the Lucene document source. In this case the query is mandatory, but you can use the `*:*` catch-all query to cluster all documents in the index. This question has some hints about converting different types of documents into the required Lucene index.
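For the first option, the conversion can be scripted. Below is a minimal sketch in Python (standard library only) that writes a list of documents to an XML file. It assumes the Carrot2 XML input format uses a `searchresult` root with an optional `query` element and `document` elements containing `title`, `url` and `snippet`; please check these names against the Carrot2 manual for your version. The function name `build_carrot2_xml`, the input dictionaries and the output file name `documents.xml` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_carrot2_xml(docs, query=None):
    """Build a <searchresult> tree from a list of dicts.

    Each dict may carry 'title', 'url' and 'snippet' keys
    (a hypothetical input shape chosen for this sketch).
    """
    root = ET.Element("searchresult")
    if query:
        # Assumed to be optional: leave it out to cluster without a query.
        ET.SubElement(root, "query").text = query
    for doc_id, doc in enumerate(docs):
        node = ET.SubElement(root, "document", id=str(doc_id))
        for field in ("title", "url", "snippet"):
            value = doc.get(field)
            if value:
                # Only emit fields that are actually present.
                ET.SubElement(node, field).text = value
    return ET.ElementTree(root)


if __name__ == "__main__":
    docs = [
        {"title": "First document", "snippet": "Text of the first document."},
        {"title": "Second document", "snippet": "Text of the second document."},
    ]
    tree = build_carrot2_xml(docs)
    # ElementTree escapes special characters such as & and < on write.
    tree.write("documents.xml", encoding="utf-8", xml_declaration=True)
```

The resulting file can then be selected in the XML document source in the Workbench, leaving the query field empty.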

Community
Stanislaw Osinski
  • I tried using the query `*:*`, but the following exception occurred: `INFO An object of a non-@ThreadSafe class org.apache.lucene.store.NIOFSDirectory bound at initialization-time to attribute LuceneDocumentSource.directory. Make sure this is intended. Exception in thread "main" java.lang.NoSuchMethodError: org.apache.lucene.index.FieldInfo.<init>(Ljava/lang/String;ZIZZZLorg/apache/lucene/index/FieldInfo$IndexOptions;Lorg/apache/lucene/index/FieldInfo$DocValuesType;Lorg/apache/lucene/index/FieldInfo$DocValuesType;Ljava/util/Map;)V` I'm using Carrot2 version 3.9.3. – msoares Dec 16 '14 at 12:26
  • No. This happens when using the Java API examples. In the Workbench, a different error occurs: `Processing error: ... Format version is not supported (resource: NIOFSIndexInput(path=".../lucene-index-path/segments.gen")): -3 (needs to be between -2 and -2)` I am stuck in both cases. – msoares Dec 17 '14 at 10:51
  • This is probably because you replaced the Lucene JAR with a newer version. If you'd like to upgrade Lucene, you'll need to recompile Carrot2 from sources: http://doc.carrot2.org/#section.advanced-topics.building-from-source-code. – Stanislaw Osinski Dec 19 '14 at 08:47