
I have a simple class that uses Apache Tika 1.14, shown here:

import java.io.File;
import java.io.IOException;

import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;
import org.apache.tika.mime.*;
import org.xml.sax.SAXException;
import org.apache.tika.config.*;


public class TikaExtraction {

   public static void main(final String[] args) throws IOException, TikaException {

      //Assume sample.txt is in your current directory              
      File file = new File("sample.txt");

      //Instantiating Tika facade class
      Tika tika = new Tika();
      String filecontent = tika.parseToString(file);
      System.out.println("Extracted Content: " + filecontent);
   }

}

However, when I try to run it, I am getting the following error message:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/tika/mime/MimeTypesReader
    at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:158)
    at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:577)
    at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:73)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:222)
    at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:345)
    at org.apache.tika.Tika.<init>(Tika.java:116)
    at TikaExtraction.main(TikaExtraction.java:17)
Caused by: java.lang.ClassNotFoundException: org.apache.tika.mime.MimeTypesReader
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more

When I search through the Tika library code, I am not seeing the .class file for the MimeTypesReader class. Is there a way to comment out some code or otherwise fix this? How would I resolve this error?

Or could this be an issue with the version of Tika that I'm using (i.e., the referenced code was from 1.6, and maybe I have 1.13 or 1.14)?

Dimitar
Caffeinated

1 Answer


With this kind of library error, the fix is not to remove something but to add what is missing: the NoClassDefFoundError (caused by a ClassNotFoundException) means that a class expected at runtime could not be found. Most of the time, you are either missing a supporting JAR or hitting a compatibility issue between library versions. In fact, you said it yourself:

When I search through the Tika library code, I am not seeing the .class file for the MimeTypesReader class.

This is the library method that throws the exception:

/**
 * Creates and returns a MimeTypes instance from the specified document.
 * @throws MimeTypeException if the type configuration is invalid
 */
public static MimeTypes create(Document document) throws MimeTypeException {
    MimeTypes mimeTypes = new MimeTypes();

    //For some reason the MimeTypesReader is missing
    new MimeTypesReader(mimeTypes).read(document);
    mimeTypes.init();
    return mimeTypes;
}
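To confirm that the class really is absent at runtime, rather than failing for some other reason, a small probe like the following can help. It is only a sketch: the class name ClassPresenceCheck and its helper methods are made up for illustration, and the default class name is taken from the stack trace in the question. Run it with the same classpath you use for TikaExtraction.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Tika: probes whether a class is visible
// to the application class loader and, if so, where it was loaded from.
final class ClassPresenceCheck {

    /** Returns true if the named class can be located without initializing it. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: locate the class without running static initializers
            Class.forName(className, false, ClassPresenceCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the JAR or directory a class came from, or null if it cannot be determined. */
    static String originOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClassPresenceCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes typically have no code source
            return src == null ? null : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Default taken from the stack trace in the question
        String name = args.length > 0 ? args[0] : "org.apache.tika.mime.MimeTypesReader";
        if (isLoadable(name)) {
            System.out.println(name + " found, loaded from: " + originOf(name));
        } else {
            System.out.println(name + " is NOT on the classpath");
        }
    }
}
```

If the probe reports the class as missing, the problem is the classpath (or an incomplete JAR), not your own code.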

Make sure that you have the complete library on your classpath, and preferably the latest release, tika-1.14. You can download it from one of the Apache mirrors.

The class is part of tika-core.jar, so make sure that JAR is on the classpath too.
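To check which classpath entry, if any, actually provides the class, you could scan the entries listed in the java.class.path system property. The following is a minimal sketch under that assumption; ClasspathJarScan and its methods are hypothetical names, and it only looks at plain directories and .jar files.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Tika: lists the classpath entries
// (JARs or directories) that contain a given class file.
final class ClasspathJarScan {

    /** Converts a fully qualified class name to its path inside a JAR or directory. */
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns every entry of the given classpath string that contains the class. */
    static List<String> findProviders(String classpath, String className) {
        String entry = toEntryName(className);
        List<String> hits = new ArrayList<>();
        for (String element : classpath.split(File.pathSeparator)) {
            if (element.isEmpty()) {
                continue;
            }
            File f = new File(element);
            if (f.isDirectory()) {
                // Directory entry: look for the .class file on disk
                if (new File(f, entry).isFile()) {
                    hits.add(element);
                }
            } else if (f.isFile() && element.endsWith(".jar")) {
                try (JarFile jar = new JarFile(f)) {
                    if (jar.getJarEntry(entry) != null) {
                        hits.add(element);
                    }
                } catch (IOException e) {
                    // Unreadable or corrupt archive: skip it
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Default taken from the stack trace in the question
        String name = args.length > 0 ? args[0] : "org.apache.tika.mime.MimeTypesReader";
        String cp = System.getProperty("java.class.path", "");
        List<String> hits = findProviders(cp, name);
        System.out.println(hits.isEmpty()
                ? "No classpath entry contains " + name
                : name + " is provided by: " + hits);
    }
}
```

An empty result suggests that tika-core.jar is either not on the classpath or is an incomplete download; more than one hit suggests that mixed Tika versions are present.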

Dimitar