-5

I have what might potentially be a very simple problem, but it seems mathematically tricky to me.

I have two lists of values, 400 "terms" and the numbers 0000-9999. That totals 400 * 10000 = 4 million values. I want to construct URL slugs like bird-1415 and neat-1029, etc..

So say I have an unsigned integer between 1 and 4 million. How do I "divide" that number into two parts, such that I get a value between 1-400, and another value between 1-10000?

My attempts: I would think perhaps you first divide the number by 400, no wait, that won't work. I'm not really sure how to do this. I could say the first 400 values lead to 1 one of the 400 words, and the next 10000 lead to another number... no that doesn't do it...

I am flexible on other currently unknown possible constraints to the system, feel free to add any.

But basically, I need to somehow split any number between 1 and 4m into two parts, so that I can select a term from the 400 word list, and a number from the 10k number list, to form words like "rise-1234" and "nice-1039", etc..

I am using a "quadratic residue PRNG" sort of lib function in JS which I borrowed from here, where given an integer 0-n, mapping it through this PRNG function gives you what seems to be a random number, do to some mathematical magic (irrelevant to this post). So basically, I need a 1-400 integer, and 1-10k integer, and then I will plug each into the PRNG function to get a seemingly random integer based on those input integers.

Is there any way to accomplish this? Given a number between 1 and 4m, split it into two integers (not random integers), and then I will plug those 2 integers into the random generator. As we go from 1, 2, 3, ... 4m, the parallel sequence of 2 integers should also increment in a predictable sequence. Just not sure really how that would look (otherwise I would give you some expected outputs), but it's sort of like splitting an integer into an array of values.

Details

One example that is similar is splitting a large integer into an array of 8-bit integers. I'm still not really sure how it works, but it I think grabs the sections of bits of each order of magnitude and returns them in the 8-bit array.

The comment of doing n % 10000 and Math.floor(n / 10000) won't work, because it gives two numbers which are unrelated to each other.

Another way of orienting around this question is thinking in terms of bits. We could say, for the 400 words, maybe let's only use 256, so that is an 8-bit number. Then for the 10k, that is basically 2^13 (8192). I don't care if we use all the values, just most of them would be good enough for generating these random names like hope-1235. So that would be splitting off the first 8 bits of the number, and then splitting off the next 13 bits of the number, assuming our number was just 8192*256 = 2,097,152 possible values. To do that I think you might do something like:

log(255)
log(256)
log(257)
log(20000)
log(20001)
log(2000000)
log(2000001)
log(2097151)
log(2097152)

function splitInt(int) {
  let roughly10k = (int >> 8) & 0xFFFF
  let terms256 = int & 0xFF
  return [terms256, roughly10k]
}

function log(int) {
  const [a, b] = splitInt(int)
  console.log(int, '=> [', a, b, ']')
}

Is there a better way to do this sort of thing than the way I have proposed, that will work similarly to what I just showed yet use all 4m values instead of just the ~2m subset? Did I even do my example correctly?

Lance
  • 75,200
  • 93
  • 289
  • 503
  • 5
    Err... `n % 10000` and `math.Floor(n / 10000)` ? – Ouroborus Jul 04 '23 at 08:34
  • hash the first word and use the number directly. – Nina Scholz Jul 04 '23 at 08:35
  • 1
    I am very confused. Can you give some examples of input and output? – holydragon Jul 04 '23 at 08:37
  • `function getNumberBetween0AndX(x) { return Math.floor(Math.random() * x); }` If you call `getNumberBetween0AndX(10000)` and `getNumberBetween0AndX(400)`, you'll have your two indexes. Right? – Bamak Jul 04 '23 at 08:38
  • *"The comment of doing `n % 10000` and `Math.floor(n / 10000)` won't work, because it gives two numbers which are unrelated to each other."*: they *are* related. `(n % 10000) + 10000 * Math.floor(n / 10000)` is your original `n`. This is how quotient and remainder relate to each other. I don't understand why this is not what you need? – trincot Jul 15 '23 at 23:35

1 Answers1

1

The code

It's quite simple:

function splitInt(int) {
    let idx10000 = int % 10000;
    let idx400 = (int - idx10000) / 10000;
    return [idx400, idx10000];
}

function mergeInts(idx400, idx10000) {
    return (idx400 * 10000) + idx10000;
}

Why this works

The best way to demonstrate why this works is to view the numbers in decimal.

Take a random number between 0 and 399 (for your "terms" list), let's write that number XXX.

Do the same for your second list, so a number between 0 and 9999, let's write it YYYY

call mergeInt(XXX, YYYY) and you will get the number X,XXY,YYY. It's as simple as that.

So if my first index is 9 and my second index is 189, the final number will be 0,090,189 (which can be simplified to 90,189). It's as simple as that.


With this, you are guaranteed that every single possibility of index values for both of your table will give you a single unique number between 0 and 3,999,999.

The other way around, with any number between 0 and 3,999,999, you are guaranteed that it will be split into a unique combination of numbers, with one being between 0 and 399, and the other between 0 and 9999

You mentionned requiring a number between 1 and 4,000,000, for which you can just add 1 to the result you get from the code I have written.


Note

The comment of doing n % 10000 and Math.floor(n / 10000) won't work, because it gives two numbers which are unrelated to each other.

Why would the two numbers need to have any kind of relation? You are storing two separate indices which are completely unrelated, so why would this be wrong?

Besides, the example that uses bitshits and masks that you have provided is does the exact same thing, but in binary (therefore using a power of two as the second index rather than a power of 10 like in your initial question).

My code does not use floor but does pretty much the same thing using a subtraction. Both commenter's code and my code as valid according to the constraints you have given.

If you still believe that this "numbers are unrelated" issue is relevant, then please elaborate, as it makes no sense according to the information you yourself provided.

RedStoneMatt
  • 465
  • 3
  • 11