0

Possible Duplicate:
Programming Logic: Finding the smallest equation to a large number.

I'm looking for an algorithm that will take an arbitrary number from the Aleph-Null set (all positive integers)(likely to be absolutely enormous) and attempt to simplify it into a computable number (if the computable number takes up less space than the integer value it is trying to represent)(specifically not floating point). Involving tetration/hyperoperators would be optimal.

Does anyone know if anything like this exists? I've looked around quite a bit this morning, but have been unable to find anything. C# code would be optimal, but really, it could be in any language

Edit: Programming Logic: Finding the smallest equation to a large number : http://mrob.com/pub/ries/index.html looks promising, but I wonder how well it will deal with large numbers, and if it's capable of implementing hyperoperators. I'll try it out.

Community
  • 1
  • 1
  • 6
    So you want to take something that can be arbitrarily large, and compress it reversibly down to a finite size? That, by the pidgeonhole principle, is impossible. (By the way, writing "natural numbers" or even "positive integers" instead of "the Aleph-Null set" is less likely to scare people.) – Thomas Aug 09 '10 at 17:57
  • What do you mean by "will take an arbitrary number"? Do you think about some input (from user or file) or some other way? – Maciej Hehl Aug 09 '10 at 17:58
  • 4
    Duplicate of http://stackoverflow.com/questions/3409363. In short, this is called Kolmogorov complexity of a number and is undecidable. Also see http://stackoverflow.com/questions/1539286/ – sdcvvc Aug 09 '10 at 18:05
  • 2
    @Thomas: no, he wants to compress it *if compression is possible*. And presumably indicate failure if not. The pigeonhole principle doesn't rule this out. A clever diagonal argument (see http://en.wikipedia.org/wiki/Kolmogorov_complexity) shows that you can't write a program to solve this problem *optimally*, assuming that the descriptive language that you're compressing into is rich enough to describe general computation. But (for example) `gzip` solves it partially, and far enough to be useful in practice :-) – Steve Jessop Aug 09 '10 at 18:14
  • Right, the worst case for something like RAR or ZIP is to pack the file in as-is. The best case is considerably better, so the average case is typically a win. – Steven Sudit Aug 09 '10 at 18:20
  • 1
    ... so any real answer is going to be of the form, "here are some heuristics to throw at the problem". But just for a simple sub-problem, we know that factorisation is computationally infeasible for "absolutely enormous" numbers, so if multiplication is part of the allowed language, that's a whole lot of possible compressions we're going to miss in practice. So my intuition is that there aren't any solutions that look really good in the grand scheme of things. Which doesn't mean it's not a studied problem, probably by number theorists rather than programmers. – Steve Jessop Aug 09 '10 at 18:33
  • @sdcwc: thanks, that is exactly what I was looking for @Steve & Thomas: yeah, there won't be any pigeon-holing. This would effectively be a lossless compression. Thanks, I'll check out the gzip algorithm. – Nicholas Baldwin Aug 09 '10 at 19:03
  • @Steven To nitpick, the average case - over all possible inputs - is at best breakeven. The average case over data you're actually likely to want to compress is much better, of course. :) – Nick Johnson Aug 09 '10 at 20:11
  • @Nick: You're not wrong. The only reason we do better than breaking even is that most of the data we happen to care about has obvious patterns and redundancies. That's why even the wimpiest of encryption schemes (XOR!) is enough to force ZIP to store it uncompressed. – Steven Sudit Aug 09 '10 at 20:50

1 Answers1

1

(all positive integers) and attempt to simplify it into a computable number (if the computable number takes up less space than the integer value it is trying to represent)(specifically not floating point). Involving tetration/hyperoperators would be optimal.

Yes, and then again, no.

First, you can't actually take inputs from "all positive integers" in a physical computer. At best, you can have an integer whose representational length is the size of your hard drive.

So your input is now physically constrained to the set I = [0, MAX], where MAX is a physical constant. Congratulations, that makes this problem solvable.

You can consider this from an information-theoretic point of view- each member of I is possible and representable. The compressability comes in when you consider representations. If each representation is unique, your goal is to reduce each i in I to the representation that is nearest the entropy of the number of i itself.

Or, restated, compression comes in by removing redundancy. If your representation has redundancy, it can be compressed.

Possibly - this would be domain knowledge - you can write the formula for generating the number in a fashion that is highly compressed. But that relies on a certain regularity in how you get the number, it becomes no longer arbitrary.

Paul Nathan
  • 39,638
  • 28
  • 112
  • 212
  • well, this is as much a theoretical investigation of mine, as it is an applied one. I would like to find an algorithm that could theoretically produce an equation for any number; but true, it is limited by the hardware. I'm not looking for compression algorithms that relies on regularity for compression, I'm looking for an algorithm that will produce an equation that can then be interpreted to reproduce the number. The assumption is that the number does not have any regularity. – Nicholas Baldwin Aug 09 '10 at 19:05
  • @Nich: No regularity -> no compression (on average). – Steven Sudit Aug 09 '10 at 20:52
  • @Nich: Information theory *applies* to compression, but is broader. You are definitely asking for something which lies in the region of computing Kolmogorov Complexity for arbitrary strings, which is not computable for the general case. You should spend some time investigating the entropy of arbitrary integers in different representations, which will give you some reasonable notions of minimal representability for your numbers. – Paul Nathan Aug 09 '10 at 21:17