I have file content that currently exists only as a byte[], and I want to determine its MIME type / Content-Type header.

The problem is that the many examples I found on the web only work when the file actually exists on disk.

How can I get this information from the byte[] alone?

Most of the code I tested looks like this:

File f = new File("gumby.gif");
System.out.println("Mime Type: " + new MimetypesFileTypeMap().getContentType(f));

EDIT 1: This code gave me the wrong result: application/octet-stream

Is there a way to get this info without creating a file?

EDIT 2: I tried this code:

public static void main(String[] args) throws Exception{

    Tika tika = new Tika();     
    byte[] content = readFile().getBytes();     
    System.out.println(tika.detect(content));
}    

private static String readFile() throws Exception{
    BufferedReader br = new BufferedReader(new FileReader("c:\\pic.jpg"));
    try {
        StringBuilder sb = new StringBuilder();
        String line = br.readLine();

        while (line != null) {
            sb.append(line);
            sb.append(System.lineSeparator());
            line = br.readLine();
        }
        String everything = sb.toString();
        return everything;
    } finally {
        br.close();
    }       
}

It always returns application/octet-stream. Is that because I'm passing a byte[] as the parameter?
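One possible explanation, not confirmed anywhere in this thread: readFile() reads a binary file through a Reader, so the bytes are decoded as text and then encoded again by getBytes(). That round trip is generally lossy for binary data, and the line-by-line loop also rewrites line terminators. The sketch below assumes UTF-8 as the charset (the charset actually used by the JVM in the question is unknown) and uses a hypothetical class name, TextRoundTripDemo, to show the JPEG magic number being destroyed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TextRoundTripDemo {
    // Leading bytes of a JFIF JPEG file: FF D8 FF E0.
    static final byte[] JPEG_MAGIC = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

    // Decodes the bytes as text and encodes them again, which is what
    // reading through a Reader and then calling String.getBytes() amounts to.
    // UTF-8 is an assumption made for this illustration.
    static byte[] roundTripThroughText(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] mangled = roundTripThroughText(JPEG_MAGIC);
        // The invalid UTF-8 sequences are replaced, so the arrays differ.
        System.out.println("unchanged: " + Arrays.equals(JPEG_MAGIC, mangled)); // prints: unchanged: false
    }
}
```

Reading the file with Files.readAllBytes(Paths.get("c:\\pic.jpg")) returns the bytes unchanged, and that array can be passed to tika.detect(byte[]). Whether this alone changes the output in the question has not been verified here.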

The Dr.
  • Have you seen any ways to identify the MIME type of an existing file in a Java program (without calling external native executables)? – V G Nov 18 '15 at 15:21
  • no, most of the examples are creating it – The Dr. Nov 18 '15 at 15:22
  • Since methods from the JDK are based on the filename, they will not help... Can your byte array contain any type of file or do you know beforehand that it will be chosen among a small set of file types? – StephaneM Nov 18 '15 at 15:25
  • I know a single way to find that out in a secure way (e.g. without trusting the browser or blindly extracting the extension): `file` from Linux. – V G Nov 18 '15 at 15:32
  • @StephaneM no, it's an unknown content of any file type – The Dr. Nov 18 '15 at 15:34
  • Then you'll probably have to use a library that parses the header of files to guess their content. Apache Tika seems to be what you need (https://tika.apache.org/) – StephaneM Nov 18 '15 at 15:50
  • This seems to be a duplicate. I have flagged it for closing. You can find all the ways in this question. [Getting A File's Mime Type In Java](http://stackoverflow.com/questions/51438/getting-a-files-mime-type-in-java) – vsnyc Nov 18 '15 at 16:04
  • @StephaneM could you please share an Apache Tika example that doesn't use `File` to get what I need? – The Dr. Nov 18 '15 at 16:11
  • Just read the doc: https://tika.apache.org/1.11/api/org/apache/tika/Tika.html#detect%28byte[]%29 – StephaneM Nov 19 '15 at 06:53
  • @StephaneM please have a look at my 2nd edit in the original post – The Dr. Nov 19 '15 at 08:34
  • @gamil I'm not sure what you are after here, but Tika and possibly Java NIO implementations will attempt to detect the type of file based on both the extension and the content. From your example it seems like you have the extension, so passing that information to the detectors will help them give you a better result. Detecting a file type based on content is not an "exact science", since there isn't one way of writing content, compressing it, encrypting it, etc. – ricardoespsanto Nov 19 '15 at 10:15
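
As a rough illustration of the content-based detection discussed in the comments, the JDK has one call that inspects leading bytes rather than the file name: URLConnection.guessContentTypeFromStream. It recognises only a small set of signatures (GIF and PNG among them), so it is no substitute for Tika. The class name ByteSniffer and the fallback to application/octet-stream are assumptions made for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class ByteSniffer {
    // Guesses a MIME type from the leading bytes of the array. Falls back to
    // "application/octet-stream" when the JDK does not recognise the signature.
    static String detect(byte[] content) {
        // ByteArrayInputStream supports mark/reset, which the JDK call requires.
        try (InputStream in = new ByteArrayInputStream(content)) {
            String guess = URLConnection.guessContentTypeFromStream(in);
            return guess != null ? guess : "application/octet-stream";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "GIF89a" is the signature at the start of a GIF file.
        byte[] gifHeader = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(gifHeader)); // prints: image/gif
    }
}
```

No File object is involved at any point, so the same pattern applies to bytes received from a network stream or a database column.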

0 Answers