
I'm writing a code generator that is replaying events recorded during a packet capture.

The JVM is pretty limited, it turns out: the bytecode of a single method can't exceed 64KB. So I added all kinds of trickery to make my code generator split up Java methods.

But now I have a new problem. I was taking a number of byte[] arrays and making them static variables in my class, e.g.:

public class myclass {
    private static byte[] byteArray = { 0x3c, 0x3f, ...
        ...
    };
    private static byte[] byteArray2 = { 0x1a, 0x20, ...
        ...
    };

    ...

    private static byte[] byteArray_n = { 0x0a, 0x0d, ...
        ...
    };        
}

Now I get the error: "The code for the static initializer is exceeding the 65535 bytes limit".

I DO NOT WANT TO HAVE AN EXTERNAL FILE AND READ IN THE DATA FROM THERE. I WANT TO USE CODE GENERATED IN A SINGLE FILE.

What can I do? Can I declare the arrays outside the class? Or should I be using a string with Unicode escapes for the values 128-255 (e.g. \u009c instead of (byte)0x9c)? Or am I the only person in the world right now who wants to use statically initialised data?

UPDATE

The technique I'm now using is auto-creation of functions like the following:

private byte[] byteArray_6() {
  String localString = "\u00ff\u00d8\u00ff\u00e0\u0000\u0010JFIF\u0000" +
    "(0%()(\u00ff\u00db\u0000C\u0001\u0007\u0007\u0007\n\u0008\n\u0013\n" +
    "\u0000\u00b5\u0010\u0000\u0002\u0001\u0003\u0003\u0002\u0004\u0003";
  byte[] localBuff = new byte[ localString.length() ];
  for ( int localInt = 0; localInt < localString.length(); localInt++ ) {
    localBuff[localInt] = (byte)localString.charAt(localInt);
  }
  return localBuff;
}

Note: Java keeps on surprising. You'd think you could just encode every value in the range 0-255 as \u00XX (where XX is the two-digit hex representation). But you'd be wrong. The Java compiler translates Unicode escapes before it tokenises the source, so \u000a inside a string literal becomes a real line break in the middle of the literal, which breaks compilation. So your strings can be littered with Unicode escapes, but you'll have to use "\n" and "\r" instead of \u000a and \u000d respectively. And it doesn't hurt to put printable characters in the strings as they are, instead of using the 6-character Unicode escape representation.
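On the generator side, the escaping rule above could be sketched roughly as follows. This is only an illustration, not the code actually used: the class name ByteLiteral and the method escape are hypothetical. Besides LF and CR it also special-cases the double quote and the backslash, on the assumption that their Unicode-escape forms would be just as harmful inside a string literal, since they too are translated before the literal is parsed.

```java
public class ByteLiteral {
    // Hypothetical helper: renders data as the body of a Java string literal
    // (without the surrounding quotes), one char per byte.
    static String escape(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            int v = b & 0xff; // treat the byte as an unsigned value 0-255
            switch (v) {
                case '\n': sb.append("\\n"); break;   // LF must not be written as a Unicode escape
                case '\r': sb.append("\\r"); break;   // same for CR
                case '"':  sb.append("\\\""); break;  // a raw quote would end the literal
                case '\\': sb.append("\\\\"); break;  // a raw backslash would start an escape
                default:
                    if (v >= 0x20 && v < 0x7f) {
                        sb.append((char) v);          // printable ASCII goes in as-is
                    } else {
                        // everything else becomes a 6-character Unicode escape
                        sb.append(String.format("\\u%04x", v));
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = { (byte) 0xff, (byte) 0xd8, 0x0a, 'J', 'F' };
        // Prints one line of generated Java source.
        System.out.println("String s = \"" + escape(sample) + "\";");
    }
}
```

One thing to keep in mind with this approach: a string constant in a class file has its own length limit (65535 bytes in its encoded form), so a generator would probably still need to split very large arrays across several literals or methods.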

Community
PP.
  • Can't you declare an empty array and populate it from a static initialiser (which in turn can call methods if it is too long)? – assylias Nov 22 '12 at 10:23
  • See also: http://stackoverflow.com/questions/6570343/maximum-size-of-a-method-in-java – assylias Nov 22 '12 at 10:26
  • @assylias the referenced question relates to method size - something I've overcome (as described in my question) - and this thread is about static variables in a class. – PP. Nov 22 '12 at 10:27
  • @PP. use an external data file. Why not? – John Dvorak Nov 22 '12 at 10:28
  • @Dvorak - because that makes my problem far more complicated than it needs to be. It's possible - but ridiculously unwieldy - and there must be a better way - Hawtin's answer below appears to be an established pattern which I will try. – PP. Nov 22 '12 at 10:32
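For reference, assylias's suggestion in the comments above (fill the arrays from methods called by the static initialiser) might look something like the sketch below. The class name, field names and byte values are placeholders, not taken from the real generated code. The idea is that each field initialiser shrinks to a single method call, so the per-element initialisation code moves out of the static initialiser and into methods that each have their own 64KB limit.

```java
public class GeneratedData {
    // Each field initialiser is now just a method call, so the static
    // initialiser needs only a few bytes of code per array, not per element.
    static final byte[] BYTE_ARRAY_1 = byteArray1();
    static final byte[] BYTE_ARRAY_2 = byteArray2();

    // The element-by-element initialisation code lives in these methods.
    // The values are placeholders for illustration only.
    private static byte[] byteArray1() {
        return new byte[] { 0x01, 0x02, 0x03, 0x04, 0x05 };
    }

    private static byte[] byteArray2() {
        return new byte[] { 0x10, 0x20, (byte) 0x9c };
    }
}
```

A single very large array would still have to fit into one method this way, so it only moves the limit from the class as a whole to each individual array.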

3 Answers


Generally, you would put the data in a literal String and then have a method which decodes that to a byte[]. String.getBytes() is of limited use, as UTF-8 won't produce all possible byte sequences, and some byte values never appear at all.

This technique is quite popular when trying to produce small object code. Removing huge sequences of array initialisation code will also help start-up time.

Off the top of my head:

public static byte[] toBytes(String str) {
    char[] src = str.toCharArray();
    int len = src.length;
    byte[] buff = new byte[len];
    for (int i=0; i<len; ++i) {
        buff[i] = (byte)src[i];
    }
    return buff;
}

More compact schemes are available. For instance you could limit string character contents to [1, 127] (0 is encoded in a non-normalised form for really bad reasons). Or something more complicated. I believe JDK8 will have a public API for Base64 decoding which isn't too bad and nicely standardised.
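As a rough sketch of the Base64 variant, assuming a Java 8 or later runtime where java.util.Base64 is available: the class and field names below are made up for illustration, and the literal encodes only the four bytes FF D8 FF E0 (the start of the JPEG data shown in the question).

```java
import java.util.Base64;

public class Base64Data {
    // "/9j/4A==" is the Base64 encoding of the bytes FF D8 FF E0.
    // A generator would emit the full encoded array here instead.
    static final byte[] JPEG_HEADER = Base64.getDecoder().decode("/9j/4A==");

    public static void main(String[] args) {
        // Print each decoded byte as two hex digits.
        for (byte b : JPEG_HEADER) {
            System.out.printf("%02x ", b & 0xff);
        }
        System.out.println();
    }
}
```

Base64 text is plain ASCII, so none of the escaping problems with Unicode escapes arise, at the cost of four characters for every three bytes.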

PP.
Tom Hawtin - tackline
  • Interesting idea. So I'd create a separate function which then has the String equivalent of my byte array which returns mystring.toByteArray(); - I'll give this a go. – PP. Nov 22 '12 at 10:25
  • "I believe JDK8 will have a public API for.." -- it's always in the next version ;-) – John Dvorak Nov 22 '12 at 10:34
  • Almost - instead of .toCharArray run a loop that takes buff[i]=(byte)str.charAt(i) - this will correctly interpret Unicode from 128-255 e.g. "\u00ff" – PP. Nov 22 '12 at 10:53
  • @PP I don't see why that would be better. 128-255 will be correctly interpreted. / I believe using `toCharArray` should be slightly faster, particularly when cold (it is being used in class initialisation). With `charAt`, other than the method invocation, there are offset and length calculations going on (on many current implementations). `toCharArray` is likely to be compiled anyway, or will need to be compiled anyway. – Tom Hawtin - tackline Nov 22 '12 at 12:03
  • I've tried your provided function twice, comparing the result to a byte[] using the Arrays.equals() function. I was checking whether your function processed "\n\u00ff\u00ea\u00ea" correctly to return byte[] { 0x0a, (byte)0xff, (byte)0xea, (byte)0xea }. It didn't. It returned [ 10, 0, 0, 0 ] - which leads me to believe it wasn't handling numbers >127. A similar function I wrote and tested used charAt and returned the expected array. Could you please verify that your tests return different results? I cannot tick this answer until I am satisfied that it is correct. – PP. Nov 22 '12 at 15:25
  • @PP That's odd. As if the `return` was inside the `for` loop or something. Can it produce any results with a number in [1, 127] beyond the first byte? ` "\n\u00ff\u00ea\u00ea\n"` or something? (I don't have Java installed on this machine.) – Tom Hawtin - tackline Nov 22 '12 at 15:57
  • @Hawtin "\n\u00ff\u0019\u00ea" also returns [10,0,0,0] – PP. Nov 22 '12 at 16:50
  • @Hawtin your bug is in the for loop - you increment len instead of i. Fix this and I will accept your answer, as I have tested this function and it works when the for loop increments the correct variable. Hmm, what am I doing? I can edit this myself. I have the power. Time to (ab)use it. – PP. Nov 22 '12 at 16:53

Declare an ArrayList and fill it from a static initialiser.

Daij-Djan

Maybe you can use nested classes for storing the static arrays. This is not the best option in terms of performans, but I think you could get it working with minimal changes to your code.
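A minimal sketch of what this could look like, with made-up class names and placeholder byte values: each nested class is compiled to its own class file with its own static initialiser, so the 64KB limit applies per nested class rather than to the outer class as a whole, while everything still lives in a single source file.

```java
public class NestedData {
    // Each nested class has its own static initialiser, and therefore
    // its own 64KB budget. The values are placeholders for illustration.
    static final class Part1 {
        static final byte[] DATA = { 0x01, 0x02, 0x03 };
    }

    static final class Part2 {
        static final byte[] DATA = { 0x0a, 0x0d };
    }

    public static void main(String[] args) {
        // A nested class is initialised on first access to one of its fields.
        System.out.println(Part1.DATA.length + Part2.DATA.length);
    }
}
```

Each individual array would still have to fit within one nested class's initialiser, so this helps with many medium-sized arrays but not with a single huge one.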

cgi
  • Your answer makes absolutely no sense. Have you thought about this carefully? Maybe consider giving some example code to demonstrate what you mean. Also - have you tested this against the compiler limitations? – PP. Dec 21 '13 at 09:11
  • @PP. You can also get your code refactored by many little perfor-mans. – Kenyakorn Ketsombut Jan 10 '14 at 08:38