0

I've reviewed the sorting algorithms I can found, but I didn't see any kind of algorithm uses bits of numbers. I think, I create a new type of sorting algorithm. I named Bitsort. Descriptions are in github.


Do you know any sorting algorithm like this?


Complexity is O (nk). k is bit size of an element of array. Array order is not important. Every time complexity is O (nk). But it using a little bit much memory. It depends on N. But when N is increasing, memory is decreasing relatively. if N is 1, it is maximum memory ratio(R = Node/N)(=bitsize). if N is max, memory ratio decreasing to R = 2. So R*N is how many node is neccessery to store whole bit tree. If N is equal maximum ->(2^32 for integer) we need 2N node to store all array. Every node has 2 adress pointer.

Meanwhile N is not count of numbers in array. N is unique count of numbers.

if all elements in array is same, N is equal 1.

Summarize

    Memory = P*N*R (P: pointer size, N: unique count, R: NodeCount(C)/N)


    I create a formula for R for 32 bit integer.
    R = 31 - 3.3*LOG10(N)

I put the numbers to binary tree from MSB(most significant bit) to LSB(Least significant bit). If they was added before, I am increasing the count of the leaf of value.

I only 1 time move on source array from begin to end and "sorting tree" has being filled.

void bitSort(int * array, int arraySize) {
    int i, j;
    Block* block;
    clock_t start, end;

    //create a buffer. root node is first node in buffer. 
    root = initBlockBuffer();
    const unsigned long long digit = ((unsigned long long) 1) << (ARRAY_ELEMENT_TYPE_SIZE_1);

    //for every array element (n !IMPORTANT)
    for (i = 0; i < arraySize; i++) {
        // start at root
        Block* activeBlock = root;

        register int value = array[i];
        register int bit;

        //for every bit of value (k !IMPORTANT)
        for (j = 0; j < ARRAY_ELEMENT_TYPE_SIZE_1; j++){

            //from msb to lsb get the bit   
            bit = (digit & (value << j)) >> (ARRAY_ELEMENT_TYPE_SIZE_1);

            // get the related node from bit 
            block = activeBlock->node[bit];


            // if the node is not exists
            if (block == 0) {

                // get next blank node from the buffer.
                block = nextFreeBlock();

                //connect new node to previous node  
                activeBlock->node[bit] = block;
            }

            // jump to new node.
            activeBlock = block;
        }

        //after all last node is leaf. 
        //Getting from last bit of value
        if(activeBlock->cnt[value & 1] == 0) leafCount++;  
        //and count of this leaf, increasing 1
        activeBlock->cnt[value & 1]++;  

    }
}


Some results


if we create an array has 1000000(one million) 4bytes integer number as: - Same number - Increasing from 1 to 1000000 - Random uniform distribution


Same numbers

Leaf Count    : 1
Node Count    : 31
Node Size     : 16 byte
Total Memory  : 512 byte
Duration Sort : 178721 us (0.02s)
Duration Read : 4994 us (0.005s)


Increasing from 1 to 1000000

Leaf Count    : 1000000
Node Count    : 1000018
Node Size     : 16 byte
Total Memory  : 16000304 byte (16MB)
Duration Sort : 218556 us (0.2s)
Duration Read : 14321 us (0.01s)


Random uniform distribution

Leaf Count    : 999768 (uniq numbers, >%0,02 repetition)
Node Count    : 11181318
Node Size     : 16 byte
Total Memory  : 178913456 byte (179MB)
Duration Sort : 1460578 us (1.4s)
Duration Read : 666933 us (0.7s)


Is there an algorithm like that you know?


Example

Example: We suppose that we have 3 bit length numbers.

array = {7, 3, 2, 5, 0, 7, 3, 2, 7};

L   : level
msb : most significant bit
lsb : least significant bit

              msb       lsb
               L1   L2   L3
7 = 111  -->    1    1    1
3 = 011  -->    0    1    1
2 = 010  -->    0    1    0
5 = 101  -->    1    0    1
0 = 000  -->    0    0    0
7 = 111  -->    1    1    1
3 = 011  -->    0    1    1
2 = 010  -->    0    1    0
7 = 111  -->    1    1    1

firstly binary tree has only root node.

                        0_____________________|_____________________1                          
                       /                                             \                         

first number is added to binary tree using own bits from msb to lsb. (adding number: 7 =>(111)(3bit space))

             0_____________________|_____________________1                          
L1  ----->                                                \                         
                                                 0_________\_________1              
L2  ----------------------------------------->                        \             
                                                                   0___\___1        
L3  ---------------------------------------------------------->             \       
                                                                            [1]     

Then others sequently. (adding numbers: 3, 2, 5, 0)

                        0_____________________|_____________________1                          
                       /                                             \                         
            0_________/_________1                           0_________\_________1              
           /                     \                         /                     \             
      0___/___1               0___\___1               0___/___1               0___\___1        
     /                       /         \                       \                       \       
   [1]                     [1]         [1]                     [1]                      1     

if a number is already in tree, its count is increased 1. (numbers: 7, 3, 2, 7)

                        0_____________________|_____________________1                          
                       /                                             \                        
            0_________/_________1                           0_________\_________1              
           /                     \                         /                     \            
      0___/___1               0___\___1               0___/___1               0___\___1        
     /                       /         \             /         \             /         \       
   [1]                     [2]         [2]                     [1]                     [3]     

When it is read recursively, can be get sorted array.

sorted_array =  [1x(000), 2x(010), 2x(011), 1x(101), 3x(111)]
sorted_array =  [1x0, 2x2, 2x3, 1x5, 3x7]
sorted_array =  [0, 2, 2, 3, 3, 5, 7, 7, 7]
yuempek
  • 1
  • 2
  • 8
    I think you are looking for radix sort – AndyG Jul 26 '19 at 13:40
  • 2
    I agree with @AndyG, you seem to have (re-)invented a kind of radixsort. Yes, most good ideas have already been thought by someone else before... – Ctx Jul 26 '19 at 13:45
  • 1
    Granted, binary radix sort is kind of hard to find if you are not looking for it. – AndyG Jul 26 '19 at 13:48
  • Possible duplicate of [What is the difference between bucket sort and radix sort?](https://stackoverflow.com/questions/4461737/what-is-the-difference-between-bucket-sort-and-radix-sort) – Prune Jul 26 '19 at 17:02
  • I don't agree with you. I think, only their intersect aspects are that they are using digits if we say numbers are in base 2. Nothing else. If we say numbers are in base 2, yes it seems a bit radix or bucket. But is not. I am using a tree and bitwise operations. but they are not! Also to sort, I am just putting the numbers to tree. Any number is not compared another one directly. Just, a number is flowing on tree from root to related leaf every loop. When every number found their location, array is sorted. – yuempek Jul 29 '19 at 12:34
  • The tree itself is a [bitwise trie](https://en.wikipedia.org/wiki/Trie#Bitwise_tries). Maybe this will help you come up with a satisfying name for your algorithm. –  Jul 29 '19 at 13:06
  • @WumpusQ.Wumbley, you are absolutly right! Thank you, I didn't know that. This information is so important for me. Name of algorithm could be Binary Trie Sort, but I belive that the name is important to remember and acceptance. So I want to use Bit Sort instead of Trie Sort. But yes this a trie sorting algorithm you are right. – yuempek Jul 31 '19 at 07:40

1 Answers1

0

Well you are wrong. You have designed a sorted binary tree (which is not your invention). Indeed you don't need to have all the keys (as you do in your examples) to complete the tree. As such it has average running tree of O(n*log(n)), as each insertion needs to cover all the tree nodes upto the place where you are going to put the key. Also, if you get an already ordered set, the tree degenerates in a list (you always go to the right branch until you reach the place of insertion) degenerating into O(n²).

Also, in case you are going to consider every key in the set (as you do in all your samples), you can save pointer space by just implementing an array of size 2^n, and considering that left son of A[n] is cell with index 2*n, and the right son the cell with index 2*n+1. In that case you have only to store the data associated to the key and not the pointers. You'll lead to this approach also if you consider the possibility of building a multidimensional array (as many dimensions as bits you have) and each index going from 0 to 1. In that case, considering the array full sized, you can store directly the data into the array cell linearly, considering the array as a linear array.

Luis Colorado
  • 10,974
  • 1
  • 16
  • 31
  • This is not a binary sort tree. Nodes are not a number or not includes a number. This is a 1-0 tree. every node in the tree can be just **bit**. max depth is not related n(number of array). max depth is constant as k(bitsize of the number). So O(n*log(n)) is not true. O(nk) is correct complexity of my algo. so I am not agree with that : `... (you always go to the right branch until you reach the place of insertion) degenerating into O(n²)` I am using extendible buffer. Buffer includes blocks point to another block until leaf. Leaf is a "count" of value defined by the whole branch. – yuempek Jul 29 '19 at 11:45
  • then you cannot talk about O(n*k) because preciselly that is `k = log(number of keys)` and that is the maximum number of keys you can have stored... your k is **always** greater that my `log(n)` as n (the number or keys stored in the array) is always less than the key space cardinality. The cardinality of the key space is the maximum number of elements you can have in your tree... while a binary equilibrated tree (an avl tree) has maximum log(n) depth, where depth is proportional to the log(number of elements stored in the tree) and never the maximum log(number of keys available) – Luis Colorado Jul 29 '19 at 16:01
  • in the best case, you'll get a perfectly equilibrated tree, as you put keys in, and that means you get O(n*log(n)) (well, if you have a keyspace of 32bits, you get a maximum O(n*log(2^32)) ~= O(n*32) :) But for that, the tree must be equilibrated, thing that is done in the avl trees (or red/black trees). Think on it and you'll see they are the same (except for your lack of equilibrium) You are buliding a perfectly equilibrated tree... in that case, but that requires you to know the cardinality of the key space, and the problem is already resolved for the avl tree. – Luis Colorado Jul 29 '19 at 16:08
  • Anyway, in the average case, you get O(n*log(n)) which is the same as quicksort. (the constant `k` you use in your O() function is --- you say that, and it is true--- proportional to the log(number of keys in keyspace) which is greater than log(number of elements in the container. – Luis Colorado Jul 29 '19 at 16:10
  • (after a while =)) yes you're right, as you say, k >= log(n) if we store only 1 element in leafs. but I able to store **more** than 2^k number in tree. For example, lets think about 3 bit space of n numbers. we assumed that our array length is 2^64. I can store whole numbers is just 8 leafs and 7 parent node. because I store just count of numbers. than k (= 3) less less than log(n) (= 64) k >= log(n) if array is unique k < log(n) if array is recurrent – yuempek Feb 26 '21 at 14:57