
I'm attempting to read magic numbers/bytes to check the format of a file. Will reading a file byte by byte work in the same way on a Linux machine?

Edit: The following link shows how to get the magic bytes from a class file using an int. I'm trying to do the same for a variable number of bytes.

http://www.rgagnon.com/javadetails/java-0544.html

James P.
  • This question may help you: http://stackoverflow.com/questions/1915317/java-howto-extract-mimetype-from-a-byte – PeterMmm May 08 '11 at 15:16

1 Answer


I'm not sure I understand what you are trying to do, but it sounds like it isn't the same thing as what the code you are linking to does.

The Java class format is specified to start with a magic number, so that code can only be used to verify whether a file might be a Java class. You can't take the same logic and apply it to arbitrary file formats.

Edit: .. or do you only want to check for wav files?

Edit2: Java's DataInputStream always reads multi-byte values in big-endian order, whatever the platform. That means you can use DataInputStream.readInt to read the first four bytes from the file, and then compare the returned int with 0x52494646 ("RIFF" as a big-endian integer).
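
To make the second edit concrete, here is a minimal sketch of that check. The class name `RiffCheck` and the method `startsWithRiff` are made up for illustration, and treating a file shorter than four bytes as "not RIFF" is my assumption, not something any specification requires.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, named for illustration only.
class RiffCheck {
    // The ASCII bytes 'R', 'I', 'F', 'F' read as one big-endian int.
    static final int RIFF_MAGIC = 0x52494646;

    /** Returns true if the first four bytes of the file are "RIFF". */
    static boolean startsWithRiff(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            // readInt always assembles the four bytes in big-endian order,
            // so the result is the same on Windows, Linux, or any other platform.
            return in.readInt() == RIFF_MAGIC;
        } catch (EOFException e) {
            // Assumption: a file shorter than four bytes simply is not RIFF.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RiffCheck <file>");
            return;
        }
        System.out.println(startsWithRiff(Paths.get(args[0])));
    }
}
```

Note that this only tells you the file is a RIFF container; a WAV file additionally has the ASCII bytes "WAVE" at offset 8.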

Kaj
  • The `checkClassVersion` method in the link is limited to checking Java class files. It is possible to check other file formats in a similar way by reading in bytes and converting/checking them against hex/ASCII values (I managed to do this a while ago but lost the source when an HDD failed). The page below gives a list of file signatures, as they're otherwise called: http://www.garykessler.net/library/file_sigs.html – James P. May 08 '11 at 15:14
  • Yes, but that means that the file must start with magic bytes, and not all file formats do that. – Kaj May 08 '11 at 15:15
  • I realize this and intend to make a more flexible method for another project in the future. In the meantime, the file formats I'd like to check (wav, mp3, flv...) all have their magic bytes at the beginning of the file. Endianness becomes an issue if a processor or virtual machine uses a different byte order. – James P. May 08 '11 at 15:18
  • See my 2nd edit and DataInputStream. Endianness is specified in Java and shouldn't be platform specific. – Kaj May 08 '11 at 15:25
  • Another option is to use ByteBuffer and ByteOrder. You can then specify little endian if that's what you want. You can also do conversions using those classes. – Kaj May 08 '11 at 15:34
  • @Kaj: Ok, I'll have a look at those. Giving this a second thought, endianness is probably not an issue, as the contents of, say, a WAV file are most likely identical whatever the platform. Now, here's crossing my fingers that my code works as such on a Linux machine :) . – James P. May 08 '11 at 15:56
  • It should work, since the wav specification fixes the first 4 bytes as the ASCII characters "RIFF" in that order, which is exactly what a big-endian read gives you. – Kaj May 08 '11 at 17:37
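
For the variable number of bytes asked about in the question, and the ByteBuffer/ByteOrder suggestion in the comments, a rough sketch could look like the one below. The class `MagicBytes` and its method names are hypothetical, chosen only for this example. It compares raw bytes, so byte order never enters the signature check; ByteOrder only matters once you interpret multi-byte fields, such as the 32-bit little-endian chunk size that follows "RIFF".

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class, named for illustration only.
class MagicBytes {
    /** Reads up to n leading bytes of the file (fewer if the file is shorter). */
    static byte[] readHeader(Path file, int n) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[n];
            int total = 0;
            // A single read() may return fewer bytes than requested, so loop.
            while (total < n) {
                int read = in.read(buf, total, n - total);
                if (read < 0) {
                    break; // end of file reached early
                }
                total += read;
            }
            return Arrays.copyOf(buf, total);
        }
    }

    /** Returns true if the file begins with exactly the given signature bytes. */
    static boolean startsWith(Path file, byte[] signature) throws IOException {
        return Arrays.equals(readHeader(file, signature.length), signature);
    }

    /** Interprets four bytes at the given offset as an unsigned little-endian value. */
    static long littleEndianUInt32(byte[] bytes, int offset) {
        int raw = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt(offset);
        return raw & 0xFFFFFFFFL; // widen to long to drop the sign
    }
}
```

With this, a signature of any length is just a byte array, for example `new byte[] {'R', 'I', 'F', 'F'}` for RIFF or `new byte[] {'F', 'L', 'V'}` for FLV, and the same code behaves identically on Windows and Linux because it never depends on the platform's native byte order.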