6

To generate a UFI number, I use a bitset of size 74. To perform step 2 of UFI generation, I need to convert this number:

9 444 732 987 799 592 368 290
(10000000000000000000000000000101000001000001010000011101011111100010100010)

into:

DFSTTM62QN6DTV1

by converting the first representation to base 31 and getting the equivalent chars from a table.

#define PAYLOAD_SIZE 74
// payload = binary of 9444732987799592368290
std::bitset<PAYLOAD_SIZE> bs_payload(payload);
/*
perform modulo 31 to obtain:
12(D), 14(F), 24(S), 25(T), 25, 19, 6, 2, 22, 20, 6, 12, 25, 27, 1
*/    

Is there a way to perform the conversion on my bitset without using an external BigInteger library?

Edit: I finally done a BigInteger class even if the Cheers and hth. - Alf's solution works like a charm

thibsc
  • 3,747
  • 2
  • 18
  • 38
  • 1
    Some compilers provide (non-standard) 128-bit integer type. It is `__int128` in GCC. Does your compiler have it? If so, you can convert your bitset to an integer of this type (bit by bit or with reinterpret_casts, though be careful with big/little endian in the latter case) and perform division with integers directly. – Ivan Smirnov Jul 26 '18 at 14:53
  • I don't understand, do you want to apply only modulo 31, once ? – Benjamin Barrois Jul 26 '18 at 14:55
  • @IvanSmirnov I use the visual c++ and the int128 is not supported. – thibsc Jul 26 '18 at 14:57
  • Do mean you want to perform modulo 31 operation the number? – yadhu Jul 26 '18 at 14:58
  • @BenjaminBarrois, I want to apply modulo 31 until the value is 0 – thibsc Jul 26 '18 at 14:58
  • 1
    That means "convert to base 31", actually. – Ivan Smirnov Jul 26 '18 at 14:58
  • @IvanSmirnov, Yes sorry, I edit it, it's convert to base 31 – thibsc Jul 26 '18 at 15:04
  • @ThibautB.: Visual C++ 2017 does support `__int128_t`. – Cheers and hth. - Alf Jul 26 '18 at 15:09
  • You could just try it. It doesn't seem to be documented. – Cheers and hth. - Alf Jul 26 '18 at 15:16
  • @Cheersandhth.-Alf, On a VM with 2017 or 2015 effectively, `__int128_t` is usable, but my last answer is wrong, we are not on 2015 but 2010 (it is old I know). – thibsc Jul 26 '18 at 15:26
  • 1
    Okay then, all that my association circuits cough up is that something of the kind, like modulo with 2^n-1, was discussed in the first volume of Knuth's "The Art of Computer Programming". Also maybe Chinese remainder theorem, but I don't remember anything about it. Those are nice things to check out anyway. If that doesn't help then it seems you'll have to either implement general division of unsigned integers. Or use a bignum library ([Boost multiprecision](https://www.boost.org/doc/libs/1_58_0/libs/multiprecision/doc/html/boost_multiprecision/tut/ints.html) comes to mind). – Cheers and hth. - Alf Jul 26 '18 at 15:33
  • Oh, another one. You can compute the value of the binary number in base 31. That involves only multiplication and addition. The last base 31 digit is your mod. – Cheers and hth. - Alf Jul 26 '18 at 15:36
  • @Cheersandhth.-Alf, thank you for your help, I have *The art of computer programming* Vol.1 Second edition, I explore this book, and see for your second suggestion – thibsc Jul 26 '18 at 15:40
  • @ThibautB. Weird, `__int128_t` no longer works with Visual C++ 2017, which means it didn't work earlier either. Hm! Anyway I cooked up some code, posting as answer now. – Cheers and hth. - Alf Jul 26 '18 at 16:35
  • @Cheersandhth.-Alf I don't know about VS 2017 but it has [never actually worked on MSVC](https://stackoverflow.com/questions/6759592/how-to-enable-int128-on-visual-studio#comment63181365_6761962) – phuclv Jul 27 '18 at 17:18
  • @Cheersandhth.-Alf [MS confirmed they won't support 128-bit in on VS 2017](https://visualstudio.uservoice.com/forums/121579-visual-studio-ide/suggestions/2643778-support-int128-on-64-bit-platform) – phuclv Jul 29 '18 at 03:49
  • your bigint implementation stores the value as string, which is very inefficient both on memory usage and performance since one digit takes one byte. Big int libraries typically use [base 2⁶⁴](https://stackoverflow.com/q/23840565/995714) for maximum efficiency on 64-bit computers, or base 10¹⁹ if the main operations are input/output – phuclv Sep 06 '18 at 10:20
  • @phuclv, Thank you for this tip, I will change it when I have time. For our currently need, it's enough, but if we need better performance I keep your message in a corner ;) – thibsc Sep 06 '18 at 13:06

3 Answers3

3

To get modulo 31 of a number you just need to sum up the digits in base 32, just like how you calculate modulo 3 and 9 of a decimal number

unsigned mod31(std::bitset<74> b) {
    unsigned mod = 0;
    while (!b.none()) {
        mod += (b & std::bitset<74>(0x1F)).to_ulong();
        b >>= 5;
    }
    while (mod > 31)
        mod = (mod >> 5) + (mod & 0x1F);
    return mod;   
}

You can speedup the modulo calculation by running the additions in parallel like how its done here. The similar technique can be used to calculate modulo 3, 5, 7, 15... and 231 - 1

However since the question is actually about base conversion and not about modulo as the title said, you need to do a real division for this purpose. Notice 1/b is 0.(1) in base b + 1, we have

1/31 = 0.000010000100001000010000100001...32 = 0.(00001)32

and then N/31 can be calculated like this

N/31 = N×2-5 + N×2-10 + N×2-15 + ...

uint128_t result = 0;
while (x)
{
    x >>= 5;
    result += x;
}

Since both modulo and division use shift-by-5, you can also do both them together in a single loop.

However the tricky part here is how to round the quotient properly. The above method will work for most values except some between a multiple of 31 and the next power of 2. I've found the way to correct the result for values up to a few thousands but yet to find a generic way for all values

You can see the same shift-and-add method being used to divide by 10 and by 3. There are more examples in the famous Hacker's Delight with proper rounding. I didn't have enough time to read through the book to understand how they implement the result correction part so maybe I'll get back to this later. If anyone has any idea to do that it'll be grateful.

One suggestion is to do the division in fixed-point. Just shift the value left so that we have enough fractional part to round later

uint128_t result = 0;
const unsigned num_fraction = 125 - 75 // 125 and 75 are the nearest multiple of 5
// or maybe 128 - 74 will also work
uint128_t x = UFI_Number << num_fraction; 

while (x)
{
    x >>= 5;
    result += x;
}
// shift the result back and add the fractional bit to round
result = (result >> num_fraction) + ((result >> (num_fraction - 1)) & 1)

Note that your result above is incorrect. I've confirmed the result is CEOPPJ62MK6CPR1 from both Yaniv Shaked's answer and Wolfram alpha unless you use different symbols for the digits

phuclv
  • 37,963
  • 15
  • 156
  • 475
  • 1
    I have test this code, maybe it's because it is the end of the daywork but, I can't get the expected values `[12, 14, 24, 25, 25, 19, 6, 2, 22, 20, 6, 12, 25, 27, 1]` – thibsc Jul 26 '18 at 18:06
  • 1
    To solve the problem, `div31` needed as well, doesn't it? – geza Jul 26 '18 at 18:29
  • 1
    ok I misread the question. It's about *base conversion* and **not** modulo, so you definitely need a division – phuclv Jul 27 '18 at 00:49
  • @phuclv, Thank you for your time, the 31 base conversion give **CEOPPJ62MK6CPR1** but not in the case of the UFI number generation because they remove the ambigous characters like *[I, B...]* so I just need the int conversion value to get the correct base31 char according to the UFI rules. – thibsc Jul 27 '18 at 20:10
1

This code seems to work. To guarantee the result I think you need to do additional testing. E.g. first with small numbers where you can compute the result directly.

Edit: Oh, now I noticed you posted the required result digits, and they match. Means it's generally good, but still not tested for corner cases.

#include <assert.h>
#include <algorithm>            // std::reverse
#include <bitset>
#include <vector>
#include <iostream>
using namespace std;

template< class Type > using ref_ = Type&;

namespace base31
{
    void mul2( ref_<vector<int>> digits )
    {
        int carry = 0;
        for( ref_<int> d : digits )
        {
            const int local_sum = 2*d + carry;
            d = local_sum % 31;
            carry = local_sum / 31;
        }
        if( carry != 0 )
        {
            digits.push_back( carry );
        }
    }

    void add1( ref_<vector<int>> digits )
    {
        int carry = 1;
        for( ref_<int> d : digits )
        {
            const int local_sum = d + carry;
            d = local_sum % 31;
            carry = local_sum / 31;
        }
        if( carry != 0 )
        {
            digits.push_back( carry );
        }
    }

    void divmod2( ref_<vector<int>> digits, ref_<int> mod )
    {
        int carry = 0;
        for( int i = int( digits.size() ) - 1; i >= 0; --i )
        {
            ref_<int> d = digits[i];
            const int divisor = d + 31*carry;
            carry = divisor % 2;
            d = divisor/2;
        }
        mod = carry;
        if( digits.size() > 0 and digits.back() == 0 )
        {
            digits.resize( digits.size() - 1 );
        }
    }
}


int main() {
    bitset<74> bits(
        "10000000000000000000000000000101000001000001010000011101011111100010100010"
        );
    vector<int> reversed_binary;
    for( const char ch : bits.to_string() ) { reversed_binary.push_back( ch - '0' ); }

    vector<int> base31;
    for( const int bit : reversed_binary )
    {
        base31::mul2( base31 );
        if( bit != 0 )
        {
            base31::add1( base31 );
        }
    }

    { // Check the conversion to base31 by converting back to base 2, roundtrip:
        vector<int> temp31 = base31;
        int mod;
        vector<int> base2;
        while( temp31.size() > 0 )
        {
            base31::divmod2( temp31, mod );
            base2.push_back( mod );
        }
        reverse( base2.begin(), base2.end() );
        cout << "Original     : " << bits.to_string() << endl;
        cout << "Reconstituted: ";
        string s;
        for( const int bit : base2 ) { s += bit + '0'; cout << bit; };  cout << endl;
        assert( s == bits.to_string() );
    }

    cout << "Base 31 digits (msd to lsd order): ";
    for( int i = int( base31.size() ) - 1; i >= 0; --i )
    {
        cout << base31[i] << ' ';
    }
    cout << endl;

    cout << "Mod 31 = " << base31[0] << endl;
}

Results with MinGW g++:

Original     : 10000000000000000000000000000101000001000001010000011101011111100010100010
Reconstituted: 10000000000000000000000000000101000001000001010000011101011111100010100010
Base 31 digits (msd to lsd order): 12 14 24 25 25 19 6 2 22 20 6 12 25 27 1
Mod 31 = 1
Cheers and hth. - Alf
  • 142,714
  • 15
  • 209
  • 331
0

I did not compile the psuedo code, but you can get the generate understanding of how to convert the number:

// Array for conversion of value to base-31 characters:
char base31Characters[] = 
{
    '0',
    '1',
    '2',
    ...
    'X',
    'Y'
};

void printUFINumber(__int128_t number)
{
    string result = "";
    while (number != 0)
    {
        var mod = number % 31;
        result = base31Characters[mod] + result;
        number = number / 31;
    }
    cout << number;
}
Yaniv Shaked
  • 698
  • 6
  • 12
  • Thank you for your 31 base convertion, but my problem is that I can't use `__int128_t` type – thibsc Jul 26 '18 at 15:43
  • I'm not sure if you want to go in this direction, but there are 128 bit implementations out there whenever the platform/compiler doesn't support them: https://github.com/calccrypto/uint128_t – Yaniv Shaked Jul 26 '18 at 15:58
  • I will use this solution only as a last resort, I keep this link ;) – thibsc Jul 26 '18 at 16:02