9

I was at a programming competition last weekend (a qualifier round for the ICPC) and I was trying to do a problem that involved (among other things) factoring integers in the range 0 <= i <= 1,000,000 into primes.

My first thought was "Oh no, factoring integers is in NP, we need to find a way to avoid spending time on it." We were working in Java. Being primarily a C programmer, my first instinct was to precompute all of the primes up to 1,000,000 with a script, format it as a Java array, and attempt to insert it into my code. My reasoning was that it save us the time of running something like a Sieve of Eratosthenes during the timed section, cut down the time complexity by something like a factor of n, and breeze through it.

I was then hit with the "code too large" error and my few-hundred-thousand int array was rejected by the compiler. Because of the rules of the competition, it was not possible to read it in to a file, or store it anywhere except the .java file containing the class that had a main method.

I tried breaking it up into something like 100,000 int chunks but that didn't work because it was still too large (we had nothing to consult except for the Java 7 documentation and I wasn't able to find anything about the "code too large" error message in it). We eventually gave up, tried another method, and got it working, but it ended up costing us something like 1/4 of our total time on the competition and harmed our score significantly.

So, my question is: was there any sane way to solve this problem with a lookup table? Is it just impossible to get a huge pre-compiled lookup table into a Java program?

I'm also curious about the reasoning behind this restriction of the language... is it a security thing, somehow? Why would a compiler limit the bytecode size of methods?

Patrick Collins
  • 10,306
  • 5
  • 30
  • 69
  • @AndyG -- I'm not very clear on why you made that edit -- this is a Java specific question. – Patrick Collins May 20 '14 at 19:26
  • 2
    It is because there's a java tag in the question. No need to be in title as well. – Luiggi Mendoza May 20 '14 at 19:27
  • Just to know, how many lines of code has your Java program? – Luiggi Mendoza May 20 '14 at 19:29
  • @LuiggiMendoza it was something like 100, maybe. The problems are mostly intended to be done quickly. I think that there was a specific algorithm they were expecting you to use that yielded a quick solution, but we didn't know it so we ended up blundering through a fairly convoluted alternative. – Patrick Collins May 20 '14 at 19:31
  • Now we know why your code was *too large*. You should rewrite your method, basically by splitting it into shorter methods. – Luiggi Mendoza May 20 '14 at 19:34
  • @StilesCrisis I read that question, but I don't think it applies. All of the solutions are something like "read it in from a file" or "compute it in a loop," neither of which were possible in this circumstance. I'm interested in knowing if there is any way to get this kind of thing done in Java, given the very specific constraints we had to deal with. – Patrick Collins May 20 '14 at 19:34
  • Just shorten your method. That's all. – Luiggi Mendoza May 20 '14 at 19:34
  • @LuiggiMendoza That was not possible here -- I'm asking if there's a way to compile a 400,000-member integer array. My actual code was sane and modular, but I needed to squeeze in an array of hundreds of thousands of `int`s to solve the problem. – Patrick Collins May 20 '14 at 19:35
  • And the accepted answer in that question explains it: **you can't**. – Luiggi Mendoza May 20 '14 at 19:35
  • You could make a bunch of smaller methods, each with less than 64K of bytecode, which populate a hash table, then look up from your hashtable. Or create a bunch of methods containing smaller switch blocks (then it's your job to optimally figure out which method to branch to). – StilesCrisis May 20 '14 at 19:36
  • Would this restriction also apply to class fields, i.e. a `static int[]`? – awksp May 20 '14 at 19:36
  • Note also that your switch will likely be MUCH smaller if you have the cases be "ints which are prime" and let "ints that are NOT prime" just fall out the bottom. – StilesCrisis May 20 '14 at 19:37
  • @StilesCrisis I'm very interested in hearing about that solution... I think that this question is sufficiently different from the previous one, since it invites that solution rather than the fairly trivial ones in the question you have marked this as a dupe of. – Patrick Collins May 20 '14 at 19:38
  • Also also, very important: You don't need all of the primes up to 1,000,000, only the ones up to `sqrt(1,000,000)` = 1,000. –  May 20 '14 at 19:38
  • @duskwuff Not true -- that would miss large primes. – Patrick Collins May 20 '14 at 19:38
  • @PatrickCollins Did you try putting the `int[]` as a `static` variable or something? That wouldn't be counted as "in a method"... – awksp May 20 '14 at 19:39
  • @user3580294 Yes, my issue was caused by a `static int[]` array. – Patrick Collins May 20 '14 at 19:39
  • I think getting any more specific would just be doing the competition challenge for you. Overcoming these obstacles is why a competition exists. – StilesCrisis May 20 '14 at 19:39
  • @PatrickCollins: You don't need to check for those. They'll be recognized naturally, because they won't be divisible by any of the primes under 1000 that you test with. –  May 20 '14 at 19:39
  • @StilesCrisis The competition has been over for a week -- I'm just asking to satiate my curiosity about Java. The ICPC is done over a five hour period, in person, sitting in front of a machine. – Patrick Collins May 20 '14 at 19:40
  • @PatrickCollins Huh, that's interesting... So I guess that's something the "duplicate" answer didn't hit. – awksp May 20 '14 at 19:41
  • @user3580294 from the accepted answer: *A single method in a Java class may be at most 64KB of bytecode.* With this in mind, you already know what the problem is, and the first obvious solution: **make the code shorter**. This can be done by splitting the current method into smaller methods called one by one. There's no need to have great knowledge on Java or programming to solve it. – Luiggi Mendoza May 20 '14 at 19:43
  • @user3580294 It's still in a method, specifically one called [``](http://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.9). Even constant data for a `static final` array is nothing more than a dumb series of assignment statements, one for each element of the array, in ``. – Boann May 20 '14 at 19:43
  • @duskwuff That's true... I think the limit is 500,000 though, right? Because the largest prime you need to deal with is 2 * x = something slightly less than a million. – Patrick Collins May 20 '14 at 19:45
  • @LuiggiMendoza I was interested in a way that's more elegant than `public static getPrimesBetween1000and5000(int i)`. I also asked for the rationale behind this question -- i.e. what benefits does it give to the compiler? But oh well, I'll move on. – Patrick Collins May 20 '14 at 19:50
  • @PatrickCollins you may post a new question asking *how to improve this code* and show your real code to be analyzed and evaluated there. By the current state of this question, it is a fair duplicate of the possible duplicate Q/A. – Luiggi Mendoza May 20 '14 at 19:52
  • Isn't that better to post on codereview than on stackoverflow? – StilesCrisis May 20 '14 at 20:13
  • @PatrickCollins: The maximum prime you need to worry about is the square root of the maximum value (so that `x * x <= maximum`). If a prime larger than that is a factor, it has to be paired with a smaller prime. (For instance, 851981 has a factor of 65537, but the other factor is 13, so you don't need to divide by 65537 specifically.) –  May 20 '14 at 21:14
  • @Boann Ah, so class initiation is treated as a method? Thought that it would have been compiled into the class itself as a constant... Guess not. Thanks for the info! – awksp May 20 '14 at 22:04
  • 2
    1. Computing the Sieve of Eratosthenes till 1 million is so damn fast that I wouldn't care. 2. **store it anywhere except the .java file containing the class** So store it there as a comment and read it, this would compile. – maaartinus May 21 '14 at 01:30
  • duskwuff and maaartinus both of those are brilliant, thank you for the help. I would love to accept one as an answer and give you some rep, but.... – Patrick Collins May 21 '14 at 01:36
  • factoring all integers in 0-1_000_000 is an almost-free side effect of the sieve (worst case, you get all the prime factors but one), which, as pointed out by @maaartinus, is very fast. – njzk2 Jan 21 '15 at 19:09

0 Answers0