Determine the lexicographic distance between two integers

Question

Say we have the integers 3, 5, 6, 9, 10, 12 in lexicographic order, or in binary 0011, 0101, 0110, 1001, 1010, 1100, each with two bits set.

What I want is to find the distance between, say, 3 and 5 (how many lexicographic permutation steps apart they are, without actually generating the permutations), using as few operations as possible.

The distance table is as follows:

3->5  = 1 or 0011->0101 = 0001
3->6  = 2 or 0011->0110 = 0010
3->9  = 3 or 0011->1001 = 0011
3->10 = 4 or 0011->1010 = 0100
3->12 = 5 or 0011->1100 = 0101

So a function f(3,5) would return 1;

The function will always take arguments of the same Hamming weight (the same number of set bits).

No arrays should be used.
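
To pin down what f should return, here is a brute-force reference in Python that does perform the permutations, so it is only useful for checking a faster solution. The names `next_same_weight` and `distance_by_stepping` are made up for illustration, and Python is just a convenient notation here. It assumes both arguments are positive and have the same Hamming weight; otherwise the loop never terminates.

```python
def next_same_weight(x: int) -> int:
    """Next larger integer with the same number of set bits (Gosper's hack)."""
    lowest = x & -x        # lowest set bit of x
    ripple = x + lowest    # carry through the lowest run of ones
    # Move the rest of that run back down to the least significant end.
    return ripple | (((x ^ ripple) >> 2) // lowest)


def distance_by_stepping(a: int, b: int) -> int:
    """Count the permutation steps between a and b (same Hamming weight assumed)."""
    lo, hi = min(a, b), max(a, b)
    steps = 0
    while lo != hi:
        lo = next_same_weight(lo)
        steps += 1
    return steps
```

The cost grows with the distance itself, which is exactly what a closed-form f is meant to avoid.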

Any idea would be great.

Edit

I forgot to mention: for any number of set bits (the Hamming weight), I will always use the first lexicographic permutation (the base) as the first argument.

E.g.

hamming weight 1 base = 1
hamming weight 2 base = 3
hamming weight 3 base = 7
...

Edit 2

The solution should work for any Hamming weight. Sorry I was not specific enough.

@ks6g10 Hamming is spelled with two m's. I can't edit your post, so please fix it, thanks! — Alberto Bonsanto, Nov 25 '12 at 19:50
May the bit length (4 in your examples) be assumed fixed by the algorithm (this would make set-bit-counting algorithms faster)? What is the desired result for f(5,3)? — Levente Pánczél, Nov 25 '12 at 20:01
@Lewyx The bit size would range from 20 to 30, but if something can be optimized at compile time, it could be made static. Also, f(5,3) would return 1, but as shown in my edit, I would always use the first lexicographic permutation as the first argument. — 1-----1, Nov 25 '12 at 20:23
This is NOT lexicographic distance. This is just distance of a subset of the integers. — ypercubeᵀᴹ, Nov 25 '12 at 21:12

Egor Skriptunoff · Accepted Answer · 2012-11-25T21:15:07.283

5

Given a number
x = 2^k₁ + 2^k₂ + ... + 2^k_m
where k₁ < k₂ < ... < k_m,
the position of x in the lexicographically ordered sequence of all numbers with the same Hamming weight is
lex_order(x) = C(k₁,1) + C(k₂,2) + ... + C(k_m,m)
where C(n,m) = n!/(m!(n-m)!), or 0 if m > n.
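
One way to see why this holds, as a sketch: the position of x is the number of integers y < x with the same Hamming weight m (written wt(y) below, a symbol introduced only for this derivation). Either the highest set bit of y lies below k_m, which leaves C(k_m, m) choices for its m bits, or it equals k_m and the remaining m-1 bits form a number smaller than x - 2^k_m. Applying the same split repeatedly gives the sum:

```latex
\begin{aligned}
\operatorname{lex\_order}(x)
  &= \#\{\, y < x : \operatorname{wt}(y) = m \,\} \\
  &= \underbrace{\binom{k_m}{m}}_{\text{highest bit of } y \text{ below } k_m}
   + \underbrace{\operatorname{lex\_order}\!\left(x - 2^{k_m}\right)}_{\text{highest bit of } y \text{ equal to } k_m} \\
  &= \binom{k_m}{m} + \binom{k_{m-1}}{m-1} + \dots + \binom{k_1}{1}.
\end{aligned}
```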

Example:

3 = 2⁰ + 2¹
lex_order(3) = C(0,1)+C(1,2) = 0+0 = 0

5 = 2⁰ + 2²
lex_order(5) = C(0,1)+C(2,2) = 0+1 = 1

6 = 2¹ + 2²
lex_order(6) = C(1,1)+C(2,2) = 1+1 = 2

9 = 2⁰ + 2³
lex_order(9) = C(0,1)+C(3,2) = 0+3 = 3
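
A minimal sketch of this formula in Python, written without any table or array: each C(k, i) is built up multiplicatively as the bits are scanned. The function names and the scanning loop are illustrative choices, not something the formula prescribes.

```python
def lex_order(x: int) -> int:
    """Position of x among integers of the same Hamming weight (base value -> 0)."""
    rank = 0
    k = 0   # position of the bit currently examined
    i = 0   # how many set bits have been seen so far
    while x:
        if x & 1:
            i += 1
            if k >= i:                      # C(k, i) is 0 when i > k
                c = 1
                for j in range(1, i + 1):   # after step j, c == C(k - i + j, j)
                    c = c * (k - i + j) // j
                rank += c
        x >>= 1
        k += 1
    return rank


def distance(a: int, b: int) -> int:
    """Distance between two integers of equal Hamming weight."""
    return abs(lex_order(b) - lex_order(a))
```

Since lex_order of the base value (3, 7, ...) is 0, f(base, x) reduces to lex_order(x). The inner loop makes this roughly quadratic in the Hamming weight; a lookup table of binomials would trade that time for memory reads.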

edited Nov 25 '12 at 21:15

answered Nov 25 '12 at 21:07


I would love to use this, but when m is 20 to 30 it is infeasible to calculate. – 1-----1 Nov 25 '12 at 22:32
@ks6g10 - All C(n,m) values for 1 – Egor Skriptunoff Nov 25 '12 at 23:17
@EgorSkriptunoff Then it comes down to the fact that I cannot really use arrays. – 1-----1 Nov 25 '12 at 23:44
@ks6g10 - What constraints are there on your program? Are input number values limited by some constant? What amount of ROM size, RAM size and CPU time do you have at your disposal? My solution can be enhanced to require only O(log(x)) memory and O(log^2(x)) time for calculating lex_order(x). – Egor Skriptunoff Nov 26 '12 at 03:33
@EgorSkriptunoff I am doing this on the GPU and want to reduce the number of memory reads, but I will investigate whether this gives good results. – 1-----1 Nov 26 '12 at 19:36
@EgorSkriptunoff Just wondering, is there a way to do the reverse: knowing the position and bit size, get the integer it represents? – 1-----1 Jan 05 '13 at 03:22
@ks6g10 - Yes, it is easily reversible if you have all C(n,m) precalculated and stored in a table. First select the maximum possible k_m, then k_(m-1), and so on. – Egor Skriptunoff Jan 29 '13 at 00:45
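
The reverse direction described in the last comment could look roughly like this. It is a sketch that calls Python's `math.comb` (Python 3.8+) where the comment suggests a precalculated table; the name `unrank` and its parameters are hypothetical.

```python
from math import comb


def unrank(rank: int, weight: int) -> int:
    """Integer with `weight` set bits that sits at position `rank` (0-based)."""
    x = 0
    for i in range(weight, 0, -1):
        k = i - 1                       # lowest position the i-th set bit can take
        while comb(k + 1, i) <= rank:   # greedily pick the largest k with C(k, i) <= rank
            k += 1
        x |= 1 << k
        rank -= comb(k, i)              # the remainder is the rank of the lower bits
    return x
```

For example, `unrank(5, 2)` picks k = 3 for the upper bit (C(3,2) = 3, leaving 2) and then k = 2 for the lower bit, giving 1100.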

Vaughn Cato · Answer 2 · 2012-11-26T14:59:01.307

If a and b are the positions of the two set bits, with zero being the least significant position, and a always being greater than b, then you can calculate:

n = a*(a-1)/2 + b

and the distance between two values is the difference between the two n values.

Example:

3->12:
  3:  a1=1, b1=0, n1=0
  12: a2=3, b2=2, n2=5
  answer: n2-n1 = 5
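
As a sketch, the two-bit case can be written in Python as follows. The helper names are invented for this example, and the bit positions are read with `int.bit_length`; on other platforms a count-leading-zeros or find-first-set instruction would presumably play the same role.

```python
def pair_index(x: int) -> int:
    """n = a*(a-1)/2 + b for a value with exactly two set bits, a > b."""
    a = x.bit_length() - 1           # position of the higher set bit
    b = (x & -x).bit_length() - 1    # position of the lower set bit
    return a * (a - 1) // 2 + b


def distance_weight2(x: int, y: int) -> int:
    """Distance between two values that each have exactly two set bits."""
    return abs(pair_index(y) - pair_index(x))
```

With the values from the example, `pair_index(3)` is 0 and `pair_index(12)` is 5, so `distance_weight2(3, 12)` is 5.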

To extend this to other hamming weights, you can use this formula:

n = sum{i=1..m}(factorial(position[i])/(factorial(i)*factorial(position[i]-i)))

where m is the hamming weight, and position[i] is the position of the i'th set bit, counting from the least significant bit, with the least significant set bit's position being position[1].

Should the i* actually be factorial(i)*? — Peter de Rivaz, Nov 25 '12 at 22:11
