
I am an Ubuntu user and I have a lot of .class files that store ground truth information about a dataset I'm using. I want to access this information (not modify it) and export it to .csv files so that it is easier to use.

I tried many methods to decompile and access the content of those .class files:

  • javap -c

Error: error while reading constant pool for synthPlate027.class: unexpected tag at #1: 51

  • Installing JD-GUI and opening the files with it. JD-GUI doesn't react to the files: it doesn't open them, but it doesn't show any error message either.

  • I tried ALL the different Java decompilers offered by the online service http://www.javadecompilers.com/; all of them failed, each with a different error message.

  • I finally used http://www.decompiler.com/, and it happened to work fine! This proves that the data isn't corrupted.

Output from http://www.decompiler.com/ (which is consistent with the expected data):

    352, 608, 1
    96, 224, 1
    160, 608, 1
    96, 544, 1
    96, 160, 1
    160, 96, 1
    160, 288, 1
    224, 160, 1
    96, 416, 1
    288, 608, 1
    32, 288, 1
    480, 352, 1
    32, 96, 1
    288, 224, 3
    96, 480, 1

How is that possible? And how can I access the data of those files (>1000 of them) in an efficient and automated way?

Michael M.
AndroDev
  • Are you able to disassemble them with Krakatau? https://github.com/Storyyeller/Krakatau – Antimony Jun 05 '20 at 01:20
  • @Antimony, I've just tried, but I get an `IndexError: list index out of range` – AndroDev Jun 05 '20 at 01:37
  • Do you have any idea why all the Java decompilers failed except the online one mentioned above? – AndroDev Jun 05 '20 at 02:57
  • Are these commercial files that are perhaps protected in some way? – NomadMaker Jun 05 '20 at 05:57
  • Check the link https://stackoverflow.com/questions/23217891/decompile-a-class-file-programmatically – Rajkumar Jun 05 '20 at 06:33
  • @NomadMaker Those files are not mine, but they shouldn't be protected in any way. Moreover, if they were, it would be weird that http://www.decompiler.com/ was able to disassemble them. – AndroDev Jun 05 '20 at 09:01
  • @Rajkumar Thanks for your link! I'll give it a try, but the proposed idea is to use a library built on the Procyon decompiler, which I already tried through http://www.javadecompilers.com/ and which failed, so I don't expect it to work through the library either. – AndroDev Jun 05 '20 at 09:04
  • @Antimony As you implemented Krakatau, let me know if this makes sense to you:
        File "disassemble.py", line 75, in <module>
          disassembleSub(readFile, out, targets, roundtrip=args.roundtrip,outputClassName=False)
        File "disassemble.py", line 35, in disassembleSub
          clsdata = ClassData(Reader(data))
        File "Krakatau/classfileformat/classdata.py", line 104, in __init__
          self.pool = ConstantPoolData(r)
        File "Krakatau/classfileformat/classdata.py", line 17, in __init__
          self._const(r)
        File "Krakatau/classfileformat/classdata.py", line 23, in _const
          t = TAGS[r.u8()]
        IndexError: list index out of range
    – AndroDev Jun 05 '20 at 09:10
  • Without an example file, it is impossible to say what is going on. The “kind of data” you have posted doesn’t help. It is no different from saying that the class contains a `"hello"` string. In fact, saying that there was a `"hello"` string would carry *more* information. You posted some numbers without any context. – Holger Jun 05 '20 at 11:58
  • @Holger Thank you for your note. Those files are not mine and I don't know how they were constructed. I gave some numbers without context not for you to make sense of them, but to show that the output from http://www.decompiler.com/ is coherent. The question is how http://www.decompiler.com/ could disassemble the files while all the other decompilers cited above failed with errors. I would be happy to share one of the .class files, but that's not possible in a post. – AndroDev Jun 05 '20 at 12:49
  • But what you’ve posted doesn’t even remotely look like a Java class. It’s a bunch of numbers. If that’s truly the output of the decompiler, then the file just wasn’t a class file at all. A hex dump, even limited to, say, the first hundred bytes, would be helpful. – Holger Jun 05 '20 at 12:57
  • Yes, it looks to me like this "class" file is just a CSV file somebody renamed to ".class". –  Jun 05 '20 at 13:00
  • @Holger and Taschi You guys are right... I don't know why the hell the creator of the dataset I'm using saved the ground truth in .class files... It's indeed just numbers and I can treat them as CSV... Sorry, I'm not used to Java and didn't have the intuition that the output of a class file couldn't be only numbers. Thanks for your time and answers – AndroDev Jun 05 '20 at 13:23
  • To be fair, I’m also surprised that *none* of the tools told you that this is not a class file. Each real class file starts with a four-byte “magic number”, followed by a version number. So recognizing that this is not a class file is easy. Even if a particular tool offers to try to parse the file anyway, it should at least give you a warning. I tried it myself and indeed, even `javap` blatantly ignores the mismatches and produces the error you have posted. Had I known that the tools behave that way, I would have come up with this (now trivial-looking) solution earlier… – Holger Jun 05 '20 at 13:38
  • @Holger Come to think of it, I should probably add a magic number check in Krakatau. It doesn't really make sense to try to blindly disassemble a file if it doesn't have the magic number and would help catch stupid mistakes like this. – Antimony Jun 06 '20 at 02:42
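
Following up on the comments above, here is a minimal Python sketch of the magic-number check Holger describes, combined with a batch export for files that turn out to be plain text. Every real class file starts with the four bytes `CA FE BA BE`; the sketch skips such files and rewrites the rest as .csv. The function names `looks_like_class_file` and `export_as_csv`, the directory names in the usage line, and the assumption that every non-class file is UTF-8 text with comma-separated values (as in the sample output in the question) are illustrative choices, not something the dataset's author confirmed.

```python
import csv
from pathlib import Path

# First four bytes of every real Java class file.
CLASS_MAGIC = bytes.fromhex("CAFEBABE")


def looks_like_class_file(path):
    """Return True if the file starts with the Java class-file magic number."""
    with open(path, "rb") as f:
        return f.read(4) == CLASS_MAGIC


def export_as_csv(src_dir, dst_dir):
    """Write each non-class '*.class' file in src_dir to dst_dir as '<name>.csv'.

    Assumption: such files are UTF-8 text with one comma-separated record
    per line. Returns the list of paths that were written.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(src_dir).glob("*.class")):
        if looks_like_class_file(src):
            continue  # genuine bytecode: leave it to a decompiler
        with open(src, newline="", encoding="utf-8") as f:
            # Strip the spaces after the commas and drop empty lines.
            rows = [[cell.strip() for cell in row] for row in csv.reader(f) if row]
        out = dst / (src.stem + ".csv")
        with open(out, "w", newline="", encoding="utf-8") as f:
            csv.writer(f, lineterminator="\n").writerows(rows)
        written.append(out)
    return written
```

With hypothetical directory names, `export_as_csv("ground_truth", "ground_truth_csv")` would process the whole folder in one call. Incidentally, 51 is the ASCII code of the character `3`, which would be consistent with `javap` reading a digit of a text file as a constant pool tag.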
