Arithmetic with base 256 number system

Question

I am working on a project right now where I need to perform calculations for numbers with base 256. I am reading bytes of a file and storing it in an array of uint8_t (a.k.a unsigned char or BYTE). The largest supported number data type doesn't satisfy the needs of my project. So the array of bytes is acting like a custom length/size data type (BigInt).

I now need to perform arithmetic on it like: -1, /2, %2.

For example this is how addition looks like to demonstrate how my numbers should work:

9   + 1 = (10)
99  + 1 = (100)
255 + 1 = (1)+(0)<<8
255 + 2 = (1)+(1)<<8

Note: First one's is 10 as in 10 is occupying 1 digit whereas the third one is 1 0 as in it's occupying 2 digits. Also I cant convert them to integers because I have to deal with huge numbers.

I am racking my brain trying to think of ways to implement this in C but to no avail.

My code so far:

#include<stdio.h>
#include<string.h>
#include<stdint.h>
#include<stdlib.h>

typedef uint8_t BYTE;
BYTE buffer[1000000000];

void n(){
    printf("\n");
}

int main()
{
    uint32_t  x;
    x = 0; 
    int _len,y;
    char * test    = "test.bin";
    FILE * inptr = fopen(test,"rb");
    uint32_t i=0;
    while(fread(&buffer[i],1,1,inptr));
}

`9 + 1` is not `10` in this system. You need `256` distinct "digits" for representing it. — Eugene Sh., Jul 20 '17 at 15:16
I do not understand, what your problem is. You can be sure that we understand how arithmetics of the non 10 bases works. You did not show any code. Ask when you have something ready. — 0___________, Jul 20 '17 at 15:18
see [Cant make value propagate through carry](https://stackoverflow.com/a/26603589/2521214) just instead of `DWORD` change to `BYTE` and update the bit-masks and shifts accordingly — Spektre, Jul 20 '17 at 15:19
Consider your arrays as polynomials (with x == 256). You can apply the same algorithms then as you would do with polynomial arithmetic. Only: you need to consider overflows! — Aconcagua, Jul 20 '17 at 15:19
@EugeneSh is right. think about what happen in base 16 (which should be more familiar to you) `9+1=A` — Davide Spataro, Jul 20 '17 at 15:20
He should add something on the beginning of his number to show whish number has what base for example @. Then @9@ + @1@ = @10@ or @255@ + @1@ = @1@@0@ — 0___________, Jul 20 '17 at 15:22
@DavideSpataro @EugeneSh. Consider `(9)+(1) = (10)`, `(255) + (2) = (1)(1)`, so your 256 digits are the values 0-255 placed in parentheses... — Aconcagua, Jul 20 '17 at 15:23
@Aconcagua Right. If that is what the OP meant, then I'll amend my comment. BTW, `(255) + (1) = (1)(0)` — Eugene Sh., Jul 20 '17 at 15:25
@Aconcagua yes, and I think that the space between `1` and `0` in `255+1= 1 0` was supposed to show that. Find it very confusing though. Parens would have been appreciated :D — Davide Spataro, Jul 20 '17 at 15:26
@EugeneSh. Intended to take first and last example, so actually would have been (2) instead of (1)... Luckily you spotted in time, so I could adjust my comment... — Aconcagua, Jul 20 '17 at 15:29
@EugeneSh. https://pastebin.com/V1wxVRuB Here's my code. I do need distinct digits, I tried displaying them as characters but only there are a lot of unprintable ascii characters. Also, the first 10 got stored as 00001010 whereas second 1 0 got stored as 00000001 00000000. — rjpj1998, Jul 20 '17 at 15:30
@rjpj1998 Yeah, as commented you representation is fine, just needed some clarification. — Eugene Sh., Jul 20 '17 at 15:31
@Spektre I am new at working with large integers and I don't understand what you are saying. — rjpj1998, Jul 20 '17 at 15:33
@EugeneSh. Could you help with me to where I went wrong that I am getting downvoted? — rjpj1998, Jul 20 '17 at 15:35
I would guess that the downvotes are caused by not presenting any of your work in the question and not stating clearly what the problem is. — Eugene Sh., Jul 20 '17 at 15:36
@DavideSpataro and EugeneSh. surely the 256 digit values required can only be 0 to 255, so the `9+1=10` was correct. OP did say that is not the same as `1 0`. — Weather Vane, Jul 20 '17 at 15:39
@WeatherVane The discussion above have clarified it. If you consider `10` as a "digit" and not a "two digit number", that would be fine. — Eugene Sh., Jul 20 '17 at 15:42
See [Arbitrary-precision arithmetic Explanation](https://stackoverflow.com/q/1218149/1468366). — MvG, Jul 20 '17 at 21:45
@spektre i really just want to perform some calculations on extremely large numbers. However i do plan on looking at the implementation of alu and the carry flag because they were plastered over all other answers when i tried googling my problem. — rjpj1998, Jul 21 '17 at 07:28
@stark i do infact need like to operate values like 2^1000000000. — rjpj1998, Jul 21 '17 at 07:30
@rjpj1998 ALU is the basic stuff you should build your operations on ... In pure C/C++ you do not have access to carry flag which make impossible to chain operations... but you can use 16 bit variables to do 8 bit addition and extract the 8th bit which is carry ... to simulate ALU. also `2^1000000000` is not the same as yours `256^1000000000` !!! you should use array size `N=1000000000/8` or even better dynamic arrays ... as your current number is taking 1GB of memory — Spektre, Jul 21 '17 at 07:31
@Spektre Can I not utilize gmp to perform my task? Also you are right it's actually 256^1000000000. — rjpj1998, Jul 21 '17 at 07:41
@Spektre I cant afford to work with strings. Using strings will make my program eat up much more memory. — rjpj1998, Jul 21 '17 at 07:51
However I can display them directly in hex using the hex specifier when needed. — rjpj1998, Jul 21 '17 at 07:51
I dont have to print it man. I just need to process it and write the processed info into another file. — rjpj1998, Jul 21 '17 at 07:57
I know that. First i plan to test my program with relatively small files like 1mb. — rjpj1998, Jul 21 '17 at 08:06
@Spektre I need to process quite a lot of big ints but not the way you think. I get smaller bigints from the first bigint. which then gives me smaller big ints. Till the point that they can be written using a single byte or sometimes a 4 byte integer. — rjpj1998, Jul 21 '17 at 08:28
@spektre i need to perform modulus by 2,division by 2 and subtraction by 1 — rjpj1998, Jul 21 '17 at 08:29
@Spektre This is exactly what I was after! I can verify this as an answer if you submit this comment as an answer. — rjpj1998, Jul 21 '17 at 08:40

Spektre · Accepted Answer · 2017-07-22T15:52:15.690

So you have power of 2 operations which can be easily converted to bit operations... no bigint lib is needed for such triviality. Let assume you got number

const int n=1000000000; // number size in bytes
BYTE x[n]; // number bytes let num[0] be LSW (least signifficant Word)

So the operations involved are:

mod: y = x%2 = x&1

this is O(1)
```
BYTE y = x[0]&1;
```
result is single bit so no need to store it as bigint

div: y = x/2 = x>>1

this is O(n) and the result is also bigint so

int i;
BYTE y[n]; // result
BYTE cy; // carry flag
for (cy=0,i=n-1;i>=0;i--) // process all words from MSW to LSW
 {
 y[i] = ((x[i]>>1)&0x7F) | cy; // bitshifted word + last carry
 cy = (x[i]<<7)&0x80; // carry is shifted out lsb of word shifted to msb position
 }

dec: y=x-1

this is O(n) and the result is also bigint so

int i;
BYTE y[n]; // result
BYTE cy; // carry flag
for (cy=1,i=0;(i<n)&&(cy);i++) // process all words from LSW to MSW
 {
 y[i] = (x[i]-cy)&0xFF; // y[i] = sbc x[i],0
 cy = (x[i]==0x00); // carry
 }

Hope I did not make some silly syntax error or something as I coded this directly into here...

Both O(n) operations can be done in-place you just need to buffer actual x[i] value or compute carry before x[i] change and buffer old carry

In your case I would use 32bit (DWORD) or 64bit (QWORD) instead of 8bit (BYTE) that will boost the speed as the ALU on most computers is 32 or 64 bit anyway

If you're interested in implementing more of the bigint stuff see:

[Edit1] dec for MSW first

int i;
BYTE y[n]; // result
BYTE cy; // carry flag
for (cy=1,i=n-1;(i>=0)&&(cy);i--) // process all words from LSW to MSW
 {
 y[i] = (x[i]-cy)&0xFF; // y[i] = sbc x[i],0
 cy = (x[i]==0x00); // carry
 }

and in-place:

int i;
BYTE cy; // carry flag
for (cy=1,i=n-1;(i>=0)&&(cy);i--) // process all words from LSW to MSW
 {
 x[i] = (x[i]-cy)&0xFF; // y[i] = sbc x[i],0
 if (x[i]==0xFF) cy=1; // carry
 }

Hey, I am having problem with subtraction. First for me x[0] is the most significant number. So I changed i=0 to i= len-1 and i=0. This is only working for 2 digit numbers. Could you help? — rjpj1998, Jul 22 '17 at 12:50
for(i =len-1 ;i>=0;i--) { if (buffer[i]!=0) { buffer[i] = buffer[i]-1; if (i<(len-1)) { for (j=i+1;j — rjpj1998, Jul 22 '17 at 13:06
@rjpj1998 hi I added the implementation of `y=x-1` with MSW first encoding as edit1 — Spektre, Jul 22 '17 at 15:53

Arithmetic with base 256 number system

1 Answers1

Linked