1

I have data as lists with length 3 to 25, using ascii (as integers). Example, x=[3,56,43,96,23]. Suppose I apply a sort so that y=sorted(x). What is the most efficient way to remember (like store a single byte) the sorting?

To try and be more clear. There are only 5! = 120 bijections from x to itself. Hence, I only need 2**7 bits (which is less than a byte) of information to remember any one sorting method. However it is unclear exactly how to efficiently map sortings to their binary representation.

Lets try to make this more precise, assuming x is length 5. What is the best way to write functions f(x) and g(y,b), where the output of f(x) is byte, b, such that if g is given the sorted version of x (which is y) and the byte, b, then g will output x again. Like,

x = [3,56,43,96,23]
y = sorted(x)
b = f(x)
print(x==g(y,b)) #Should print True. 
Bobby Ocean
  • 3,120
  • 1
  • 8
  • 15
  • 2
    Is this a real problem you are solving here, or largely theoretical as an exercise in optimization? If the former, do you genuinely have a performance issue with sorting 3 to 25 items using the native sorting options? – jarmod Dec 01 '20 at 23:43
  • Please edit the code of your attempt to solve this into your question. – DisappointedByUnaccountableMod Dec 01 '20 at 23:57
  • 1
    @barny I am unsure what you are asking? – Bobby Ocean Dec 02 '20 at 00:04
  • 1
    @jarmod I am not sure what you mean, this more of a storage question, not a speed question. I would like to save (with as little data as possible) the particular sorting that occurred. Saving the entire original data set is the extreme least optimal solution. – Bobby Ocean Dec 02 '20 at 00:04

4 Answers4

2

Update

Using itertools.permutation is obviously inefficient in terms of memory and time to search. To resolve that, you could compute the factorial/inverse factorial representation directly using functions such as those described in this answer:

def rank(n, p, q):
    if n == 1:
        return 0
    s = p[n-1]
    p[n-1], p[q[n-1]] = p[q[n-1]], p[n-1]
    q[s], q[n-1] = q[n-1], q[s]
    return s + n * rank(n-1, p, q)

def unrank(n, r, p):
    if n > 0:
        p[n-1], p[r % n] = p[r % n], p[n-1]
        p = unrank(n - 1, r // n, p)
    return p

def f(x):
    # sort x including indexes
    s = sorted(enumerate(x), key=lambda y:y[1])
    # extract the indexes to a list
    p = [t[0] for t in s]
    # find the factorial number of the tuple
    q = [0] * len(p)
    for i, v in enumerate(p):
        q[v] = i
    return rank(len(p), p, q)

def g(y, b):
    # find the matching permutation
    p = unrank(len(y), b, list(range(len(y))))
    # zip the permutation with the values
    z = list(zip(y, p))
    # sort by the order
    s = sorted(z, key=lambda y:y[1])
    # and return the values
    return [t[0] for t in s]
    
x = [3,56,43,96,23]
print(x)
b = f(x)
print(b)
y = sorted(x)
print(y)
o = g(y, b)
print(o)

Output:

[3, 56, 43, 96, 23]
108
[3, 23, 43, 56, 96]
[3, 56, 43, 96, 23]

Original answer

You could solve this problem using itertools.permutations to figure out all of the permutations of len(x) values and returning the index into that list that matches the sorted order of elements. You can then reverse that process in the g function, finding the permutation that matches the index and re-sorting y based on that value:

import itertools

def f(x):
    # sort x including indexes
    s = sorted(enumerate(x), key=lambda y:y[1])
    # extract the indexes to a tuple
    i = tuple(t[0] for t in s)
    # find all the permutations of length len(x)
    p = list(itertools.permutations(range(len(x))))
    # find the sorted tuple in the list
    return p.index(i)

def g(y, b):
    # find all the permutations of length len(y)
    p = list(itertools.permutations(range(len(y))))
    # zip the permutation with the values
    z = list(zip(y, p[b]))
    # sort by the order
    s = sorted(z, key=lambda y:y[1])
    # and return the values
    return [t[0] for t in s]
    
x = [3,56,43,96,23]
print(x)
b = f(x)
print(b)
y = sorted(x)
print(y)
o = g(y, b)
print(o)

Output:

[3, 56, 43, 96, 23]
20
[3, 23, 43, 56, 96]
[3, 56, 43, 96, 23]
Nick
  • 138,499
  • 22
  • 57
  • 95
  • I will need to digest this rank and unrank function a bit more. But it seems to work great. Thanks. – Bobby Ocean Dec 02 '20 at 01:40
  • @BobbyOcean if you follow the link to the answer with the algorithm I used, it has a link to a paper describing how the `rank` and `unrank` algorithms work in detail. Btw you can test the algorithm with a simple loop on all possible permutations of an `n` element array e.g. `for x in itertools.permutations(range(5)): x = list(x)` as that will test all the possible sort orders. – Nick Dec 02 '20 at 01:47
1

Here is the same solution as Nick, but I tried to save memory by avoiding to generate lists and keep using generators:

from itertools import permutations


def f(scrambled_list):
    ind, sorted_list = zip(*sorted(enumerate(scrambled_list), key=lambda x: x[1]))
    for index, perm in enumerate(permutations(range(len(scrambled_list)))):
        if perm == ind:
            return index


def g(sorted_list, pos):
    for index, perm in enumerate(permutations(range(len(sorted_list)))):
        if index == pos:
            break
    else:
        raise Exception('Could not find permutation')
    ret, _ = zip(*sorted(zip(sorted_list, perm), key=lambda x: x[1]))
    return list(ret)


x = [3, 56, 43, 96, 23]
print(x)
y = sorted(x)
print(y)
b = f(x)
print(b)
print(g(y, b))
print(x == g(y, b))  # Should print True.

Output:

[3, 56, 43, 96, 23]
[3, 23, 43, 56, 96]
20
[3, 56, 43, 96, 23]
True
Frodon
  • 3,684
  • 1
  • 16
  • 33
0

I'm not sure why others are using permutations to solve this. It sounds like a bytes question. I also think you meant output x[pos] instead of output x.

#! /usr/bin/env python3
x = [ 3, 56, 43, 96, 23 ]

bits  = 0  ##  minimum number of bits to index list
while 2 **bits < len( x ):
    bits += 1

y  = []  ##  reverse nums
for ind, val in enumerate( x ):
    y .insert( 0, val )
y  = bytes( y )  ##  store as hex

def g( y, b ):  ##  count backwards, to find position in byte sequence
    return y[ len(y) -1 -b ]

binary_position = b'0'
print(  g( y, int( binary_position ) )  )

3


Edit:

subtract 1 from return value, it's a 0-based list. used to Lua

Doyousketch2
  • 2,060
  • 1
  • 11
  • 11
0

It's possible to describe the order by giving the insertion points of the items if they are inserted in sorted order. The first item is just inserted because there are no options. Next item has two options, before or after the first item. Next has three possible insertion point and so on.

This list of indexes can then be encoded into one integer which maximum value is n!. Start with the index that has two possible indexes, 0 or 1. Multiply with the next index number of possibilities that is 3 and then add the index. Continue with all indexes. This is very easy to decode using modulus and integer division.

def f(x):
    # sort in reversed order
    y = sorted(x, reverse=True)
    # don't destroy original list, make a copy
    x = x[:]
    insert_points = []
    # the last item is obvious so dont iterate over it
    for v in y[:-1]:
        # find the indexes of biggest integer
        i = x.index(v)
        insert_points.insert(0, i)
        # delete to get correct index of next item
        del x[i]
    # encode insertion points to single integer
    b = 0
    for i, p in enumerate(insert_points):
        # multiply by the last index number of possibilities and add next
        b = b*(i+2)+p
    return b

def g(y, b):
    # decode insertion points into a list
    insert_points = []
    for i in range(len(y),1,-1):
        insert_points.insert(0, b % i)
        # divide by this index number of possibilities
        b = b // i
    # create sorted list
    # first item is obvious
    sorted_list = [y[0]]
    for i, p in enumerate(insert_points):
        # inserting items according to given indexes
        sorted_list.insert(p, y[i+1])
    return sorted_list
Jolbas
  • 757
  • 5
  • 15