I want to integrate my Solr data core with carrot2, to get a nice clustered visualization. However, I am having difficulties with getting carrot2 running in the first place as the documentation I have come across is rather vague. What is needed exactly? In other words, how do I get started?

I have downloaded the latest release of carrot2 from https://github.com/carrot2/carrot2/releases

I cannot work out how to get it running with the Solr core that I have already created. What is the next step? Are there any step-by-step instructions for this?

jps
blah
  • I have the same issue with `carrot2 4.1.0`: when I enter my local Solr instance URL in the `Solr service URL` field, I get the error `TypeError: NetworkError when attempting to fetch resource`. – Soufiane Roui Jan 13 '21 at 13:33
  • I decided to work with the carrot2-workbench-3.16.2 application instead. When you run the Workbench .exe file, the app should open. You then enter the URL of the Solr core you would like to cluster. I have not used newer releases. – blah Jan 13 '21 at 13:57
  • Yes, before using `4.1.0` I used `3.16.2`. I just want to use the newer version in case there are some enhancements. Anyway, I now save my Solr search results to a CSV file, convert it to an Excel file, and load that into version `4.1.0`, which works fine. – Soufiane Roui Jan 13 '21 at 14:34
  • Are there any alternatives to carrot2 in the Java ecosystem? – Soufiane Roui Jan 13 '21 at 14:36
  • I don't have any experience with Java, and my work with carrot2 is very limited. I mainly use Python and Solr and tried to integrate carrot2 with them. Ask the person who answered my question below, as I believe he is one of the developers of carrot2. – blah Jan 14 '21 at 09:41
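
The CSV workaround mentioned in the comments could be sketched in Python roughly as follows. This is only an illustration, not the commenter's actual script: the function name `docs_to_csv` and the field names in the usage note are hypothetical, and it assumes the documents have already been fetched from Solr as a list of dictionaries (for example, the `response.docs` array of a JSON query response).

```python
import csv
import io


def docs_to_csv(docs, fields):
    """Serialize Solr-style documents (a list of dicts) to CSV text.

    Only the requested fields are written. Multi-valued fields, which
    Solr returns as lists, are joined into a single space-separated cell.
    Missing fields become empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for doc in docs:
        row = {}
        for field in fields:
            value = doc.get(field, "")
            if isinstance(value, list):
                value = " ".join(str(item) for item in value)
            row[field] = value
        writer.writerow(row)
    return buffer.getvalue()
```

Calling `docs_to_csv(docs, ["id", "title"])` returns the CSV text, which can then be written to a file and converted or loaded as described in the comment above.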

1 Answer

Carrot2 Workbench was not available in the 4.0.x releases, but a browser-based Workbench will be part of the upcoming 4.1.0 release.

Version 4.1.0 has not been officially released yet, but you can use the snapshot binaries for the time being.

To cluster Solr data using the snapshot release Workbench:

  1. Download the Carrot2 4.1.0 snapshot binaries and unzip them into a local folder.

  2. Go to the dcs directory and run dcs.cmd or dcs.sh, depending on your operating system.

  3. Open http://localhost:8080/frontend/#/workbench in a modern browser.

  4. Choose Solr in the Data source combo box and fill in the Solr service URL.

  5. If everything worked correctly, Workbench should be able to load the list of cores in your Solr install. Choose the core, choose the fields to cluster, type your query and press Cluster.
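
Before pointing Workbench at Solr, it can help to confirm independently that the Solr instance is reachable and which cores it exposes. The sketch below does this through Solr's CoreAdmin `STATUS` action. It is a sanity check under stated assumptions, not a description of what Workbench does internally: the helper names are hypothetical, and `http://localhost:8983/solr` is only the usual default address of a local Solr install, so substitute your own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def core_status_url(solr_url):
    """Build the CoreAdmin STATUS URL for a Solr base URL."""
    query = urlencode({"action": "STATUS", "wt": "json"})
    return f"{solr_url.rstrip('/')}/admin/cores?{query}"


def core_names(status_response):
    """Extract the sorted core names from a parsed STATUS response.

    The response is expected to carry a "status" object keyed by core name.
    """
    return sorted(status_response.get("status", {}))


def list_cores(solr_url, timeout=5.0):
    """Fetch the STATUS response from a running Solr and return its core names."""
    with urlopen(core_status_url(solr_url), timeout=timeout) as response:
        return core_names(json.load(response))


# Example (requires a running Solr; the URL is an assumed local default):
# print(list_cores("http://localhost:8983/solr"))
```

If `list_cores` returns the core you expect, the same base URL is a reasonable candidate for the Solr service URL field in step 4.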

Stanislaw Osinski
  • When I open localhost:8080 in the browser, I correctly get DCS running there. However, when I select Solr as the Document Source, I do not get the option to fill in the Solr service URL. Where exactly am I supposed to fill this in? – blah Dec 15 '20 at 09:35
  • This answer relates to the not-yet-released 4.1.0 version. The snapshot download URL is provided in point 1 of the list. – Stanislaw Osinski Dec 15 '20 at 10:28
  • I decided to use the Workbench app from an earlier release, and that works better. It was a bit of a nightmare to find and follow appropriate documentation. When I click on the URL in point 1, the link is broken: GitHub shows "Page not found". Thanks for your help though. – blah Dec 15 '20 at 10:35
  • Oh, it appears that build artifact downloads are only available to signed-in GitHub users. I uploaded the snapshot to our server and edited the answer to contain the corrected URL. We'll be replacing the 3.x Workbench with a more flexible browser-based version (allowing, among other things, choosing arbitrary fields for clustering vs. title+snippet only). But if the 3.x Workbench does the job, then great! – Stanislaw Osinski Dec 15 '20 at 10:57