How to do binary addition of real numbers?

Question

Short story. I made a program that does addition for binary integers. I need to make it work for binary real numbers (e.g. 1010.1010(binary)=10.625(decimal) The input is given as a binary string. I made a lot of attempts and I couldn't find a simple way to do it. Please help create such a program.

Example: {input: 1010.1010(10.625 in decimal) 0.1(0.5 in decimal) output: 1011.001 (11.125 in decimal)} Code:

#include <stdio.h>
#include <string.h>

void bin_add(int c[400], int d[400])
{
    int car[400]; //carry
    int i = 199;
    car[i] = 0;
    while (i >= 0)
    {
        //find carry and shift it left
        //find the sum
        car[i - 1] = (c[i] & d[i]) | (c[i] & car[i]) | (d[i] & car[i]);
        c[i] = (c[i] ^ d[i]) ^ car[i];
        printf("car[i-1]=%d c[i]=%d\n", car[i - 1], c[i]);
        i--;
    }
    //    printf("\n");
}

int main()
{
    int l, l1, i;//l and l1 are lengths
    char a[200], b[200]; //a and b are the inputs
    int c[200], d[200]; //c and d are used for processing
    for (i = 0; i < 200; i++)
    {
        c[i] = 0;
        d[i] = 0;
    }
    gets(a);
    gets(b);
    l = strlen(a);
    l1 = strlen(b);

    for (int i = 0; i < l; i++)
    {
        c[200 - l + i] = a[i] - 48;
    }

    ////////////////////////////////////////////
    for (int i = 0; i < l1; i++)
    {
        d[200 - l1 + i] = b[i] - 48;
    }
    ////////////////////////////////
    bin_add(c, d);
    for (i = 0; i < 200; i++)
        printf("%d", c[i]);
    return 0;
}

First of all (even though unrelated to your problem): Never ***ever*** use `gets`! It's [a dangerous function](https://stackoverflow.com/questions/1694036/why-is-the-gets-function-so-dangerous-that-it-should-not-be-used) and have even been removed from the C standard. Use e.g. [`fgets`](https://en.cppreference.com/w/c/io/fgets) instead. — Some programmer dude, Nov 27 '18 at 14:58
You have to line up the binary points, then it is just the same as integer addition. — stark, Nov 27 '18 at 14:58
Besides that, here are a few other points (still unrelated to your problem): Don't use [*magic numbers*](https://en.wikipedia.org/wiki/Magic_number_(programming)), if by `48` you mean `'0'` then *say* so. Also try to use more descriptive variable names. And comments about what you're doing. I also recommend that you [learn how to debug your programs](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/). — Some programmer dude, Nov 27 '18 at 15:01
@Someprogrammerdude 48 is what I use to convert ASCI chars into real integers. can you help on creating the requested program, I ran out of ideas and I am short on time. — John Asimov, Nov 27 '18 at 15:08
*48 is what I use to convert ASCI chars into [...]* – What @Someprogrammerdude is telling you is, that `48` == `'0'` (and `49` == `'1'`). So you should be using `'0'` in your code instead of 48 because it works not only with ASCII but is also easier to read. — Swordfish, Nov 27 '18 at 15:14
didn't know that, thanks @Swordfish , I appreaciate the help, but would you mind showing me how to solve the problem itself? — John Asimov, Nov 27 '18 at 15:20
Start giving your variables meaningful names. `a`, `b`, `l`, `l1` etc. are not meaning ful. — Jabberwocky, Nov 27 '18 at 16:18
@Jabberwocky "l" and "l1" are the lengths of the given strings. "a" and "b" are the strings themselves. — John Asimov, Nov 27 '18 at 16:19
@JohnAsimov yes I know. Having meaningful names is more important for _you_. — Jabberwocky, Nov 27 '18 at 16:21
@Jabberwocky I added some explanation in the main code. Thanks for the previous advice, but would you be kind enough to help me create the desired program? — John Asimov, Nov 27 '18 at 16:21
@Jabberwocky I don't even know, would you be more motivated if I were to give a small bounty? — John Asimov, Nov 27 '18 at 16:30
Review `while (i >= 0) { car[i - 1] = ...`. Consider what happens when `i==0` and code attempts `car[0 - 1]` — chux - Reinstate Monica, Nov 27 '18 at 17:34

Nominal Animal · Answer 1 · 2018-11-28T02:55:09.540

What you really want to do, is handle each digit in order of increasing importance. To make that easier, you should implement the following functions:

/* Return the number of fractional bits in bs */
int bs_fractbits(const char *bs);

/* Return the number of integer bits in bs */
int bs_intbits(const char *bs);

/* Return the bit in bs corresponding to value 2**i,
   0 if outside the bit string */
int bs_bit(const char *bs, int i);

/* Return -1 if bs is negative,
           0 if bs is zero or NULL,
          +1 if bs is positive */
int bs_sign(const char *bs);

/* Return -1 if bs1 < bs2,
           0 if bs1 == bs2,
          +1 if bs1 > bs2. */
int bs_cmp(const char *bs1, const char *bs2);

To support negative values, you'll need to implement both addition and subtraction (of "unsigned" bit strings):

Addition: The result has as many fractional bits as the term that has most fractional bits, and possibly one more integer bit than the term that has most integer bits. Start at the least significant bit in either term, and work your way up to the most significant bit in either term, summing each bit, and keeping the "carry bit" along, just like you'd do by hand. If the carry is nonzero at end, you'll get that one additional bit.
Subtraction: Always subtract smaller from larger. If that changes the order of the terms, negate the result. The result has at most as many fractional bits as the term that has most fractional bits, and at most as many integer bits as the term that has most integer bits. This is just like addition, except you subtract the bits, and instead of "carry bit", you use a "borrow bit". Because you subtract smaller unsigned value from larger unsigned value, the "borrow bit" will be zero at end.
Multiplication: The integer part has the number of integer bits, and the number of fractional bits, as the terms have in total (summed). You can implement the operation as if multiplying two unsigned integer values, and just insert the bit at end. (So that the result has as many fractional bits as the input terms have in total.) This usually involves a double loop, just like in long multiplication by hand.

Note that the same logic also works if you use larger radix instead of 2. The "carry"/"borrow" is a digit, between zero and one less than the radix.

Personally, I'd be very tempted to use a structure to describe each digit string:

typedef struct {
    int     idigits;    /* Number of integral digits before point */
    int     fdigits;    /* Number of fractional digits after point */  
    int     size;       /* Number of chars dynamically allocated at data */
    char   *point;      /* Location of decimal point */
    char   *data;       /* Dynamically allocated buffer */
} digitstring;
#define  DIGITSTRING_INIT  { 0, 0, 0, NULL, NULL }

with an additional flag if negative digit strings are to be supported.

Digit D with numerical value D×Bⁱ, where B is the radix (number of unique digits used) and i being the position of said digit, is located at point[-i] if i < 0 (and -i <= fdigits), or at point[-i-1] if i >= 0 (and i < idigits). point[0] itself is where the decimal point is, if there is one.

For example, if we have string 0100.00, then idigits = 4, fdigits = 2, and the only nonzero digit is at position 2. (Position 0 is on the left side of the decimal point, and -1 on the right side.)

size and data fields allow reuse of the dynamically allocated buffer. Each declaration of a digitstring must be initialized, digitstring value = DIGITSTRING_INIT;, because there is no initialization function; this way you are less likely to leak memory (unless you forget to free a digitstring when no longer needed):

/* Free the specified digit string. */
static inline void  digitstring_free(digitstring *ds)
{
    if (ds) {
        if (ds->data)
            free(ds->data);
        ds->idigits = 0;
        ds->fdigits = 0;
        ds->size    = 0;
        ds->point   = NULL;
        ds->data    = NULL;
    }
}

To use the digit string as a C string, you use a helper function to obtain the pointer to the most significant digit in the digit string:

/* Return a pointer to a printable version of the digit string. */
static const char *digitstring_str(const digitstring *ds, const char *none)
{
    if (ds && ds->point)
        return (const char *)(ds->point - ds->idigits);
    else
        return none;
}

I've found that rather than crash, it is often useful to pass an extra parameter that is only used for the return value when the return value is otherwise undefined. For example, if you have an initialized digit string foo without any contents, then digitstring_str(&foo, "0") returns the string literal "0".

The main point of the digit string structure is to have accessor functions that get and set each individual digit:

/* Get the value of a specific digit. */
static inline unsigned int  digitstring_get(const digitstring *ds, const int position)
{
    if (ds) {
        if (position < 0) {
            if (-position <= ds->fdigits)
                return digit_to_value(ds->point[-position]);
            else
                return 0;
        } else {
            if (position < ds->idigits)
                return digit_to_value(ds->point[-position-1]);
            else
                return 0;
        }
    } else
        return 0;
}

/* Set the value of a specific digit. */
static inline void  digitstring_set(digitstring *ds, const int position, const unsigned int value)
{
    if (!ds) {
        fprintf(stderr, "digitstring_set(): NULL digitstring specified.\n");
        exit(EXIT_FAILURE);
    } else
    if (position < 0) {
        if (-position > ds->fdigits) {
            fprintf(stderr, "digitstring_set(): Digit position underflow (in fractional part).\n");
            exit(EXIT_FAILURE);
        }
        ds->point[-position] = value_to_digit(value);
    } else {
        if (position >= ds->idigits) {
            fprintf(stderr, "digitstring_set(): Digit position overflow (in integer part).\n");
            exit(EXIT_FAILURE);
        }
        ds->point[-position-1] = value_to_digit(value);
    }
}

Above, value_to_digit() is a helper function that converts a numerical value to the corresponding character, and digit_to_value() converts a character to the corresponding numerical value.

All operations (from parsing to arithmetic operators) really need a "constructor", that creates a new digit string with sufficient number of digits. (The number of digits is known beforehand for each operation, and depends only on the number of significant digits in the terms.) For this, I created a function that constructs a zero of desired size:

/* Clear the specified digit string to zero. */
static inline void  digitstring_zero(digitstring *ds, int idigits, int fdigits)
{
    int   size;
    char *data;

    if (!ds) {
        fprintf(stderr, "digitstring_zero(): No digitstring specified.\n");
        exit(EXIT_FAILURE);
    }

    /* Require at least one integral digit. */
    if (idigits < 1)
        idigits = 1;
    if (fdigits < 0)
        fdigits = 0;

    /* Total number of chars needed, including decimal point
       and string-terminating nul char. */
    size = idigits + 1 + fdigits + 1;

    /* Check if dynamically allocated buffer needs resizing. */
    if (ds->size < size) {
        if (ds->data)
            data = realloc(ds->data, size);
        else
            data = malloc(size);
        if (!data) {
            fprintf(stderr, "digitstring_zero(): Out of memory.\n");
            exit(EXIT_FAILURE);
        }
        ds->data = data;
        ds->size = size;
    } else {
        data = ds->data;
        size = ds->size;
    }

    /* Fill it with zeroes. */
    memset(data, value_to_digit(0), idigits + 1 + fdigits);

    /* Pad the unused space with nul chars, terminating the string. */
    memset(data + idigits + 1 + fdigits, '\0', size - idigits - 1 - fdigits);

    /* Assign the decimal point. */
    ds->point = data + idigits;

    /* If there are no decimals, no need for a decimal point either. */
    if (fdigits > 0)
        ds->point[0] = decimal_point;
    else
        ds->point[0] = '\0';

    /* After setting the desired digits, use digitstring_trim(). */
    ds->idigits = idigits;
    ds->fdigits = fdigits;
}

It will ensure the digit string has enough room for the specified number of digits, reallocating its dynamically allocated buffer if necessary, reusing it if already large enough.

The idea is that to implement an operation, you first find out the maximum number of integral and fractional digits the result can have. You use the above to create the result digit string, then digitstring_set() to set each digit to their respective values. You will typically operate in increasing digit significance, which means increasing digit "positions".

If we have additional helper functions int digits(const char *src), which returns the number of consecutive valid digit characters starting at src, and int decimal_points(const char *src), which returns 1 if src points to a decimal point, and 0 otherwise, we can parse input strings into digit strings using

/* Parse a string into a digit string, returning the pointer
   to the first unparsed character, or NULL if an error occurs. */
static const char *digitstring_parse(digitstring *ds, const char *src)
{
    const int    zero = value_to_digit(0);
    const char  *idigit, *fdigit;
    int          idigits, fdigits, fextra, n;

    /* Fail if nothing to parse. */
    if (!src)
        return NULL;

    /* Skip leading whitespace. */
    while (isspace((unsigned char)(*src)))
        src++;

    /* Fail if nothing to parse. */
    if (*src == '\0')
        return NULL;

    /* Scan integer digits. */
    idigit = src;
    src += digits(src);
    idigits = (int)(src - idigit);

    /* Decimal point? */
    fextra = 0;
    n = decimal_points(src);    
    if (n > 0) {
        src += n;
        /* Scan fractional digits. */
        fdigit = src;
        src += digits(src);
        fdigits = (int)(src - fdigit);
        if (fdigits < 1)
            fextra = 1;
    } else {
        fdigit = src;
        fdigits = 0;
    }

    /* No digits? */
    if (idigit == 0 && fdigit == 0)
        return NULL;

    /* Trim leading zeroes. */
    while (idigits > 1 && *idigit == zero) {
        idigits--;
        idigit++;
    }

    /* Trim trailing zeroes. */
    while (fdigits > 1 && fdigit[fdigits - 1] == zero)
        fdigits--;

    /* Create the necessary digit string, */
    digitstring_zero(ds, idigits, fdigits + fextra);

    /* copy the integer digits, if any, */        
    if (idigits > 0)
        memcpy(ds->point - idigits, idigit, idigits);

    /* and the fractional digits, if any. */
    if (fdigits > 0)
        memcpy(ds->point + 1, fdigit, fdigits);

    /* Return a pointer to the first unparsed character. */
    return src;
}

After updating its digits, one can call a helper function to remove any extra leading zeroes:

static inline void  digitstring_ltrim(digitstring *ds)
{
    if (ds && ds->point) {
        const int  zero = value_to_digit(0);

        while (ds->idigits > 1 && ds->point[-ds->idigits] == zero)
            ds->idigits--;
    }
}

Adding two (unsigned) digit strings, possibly reusing one of the terms, is now quite simple to implement:

static void digitstring_add(digitstring *to, const digitstring *src1, const digitstring *src2)
{
    digitstring  result = DIGITSTRING_INIT;
    unsigned int carry = 0;
    int          i, idigits, fdigits;

    if (!to || !src1 || !src2) {
        fprintf(stderr, "digitstring_add(): NULL digitstring specified.\n");
        exit(EXIT_FAILURE);
    }

    /* For addition, the result has as many digits
       as the longer source term. */
    idigits = (src1->idigits >= src2->idigits) ? src1->idigits : src2->idigits;
    fdigits = (src1->fdigits >= src2->fdigits) ? src1->fdigits : src2->fdigits;

    /* Result needs possibly one more integer digit,
       in case carry overflows. */
    digitstring_zero(&result, idigits + 1, fdigits);

    /* Addition loop, in order of increasing digit significance. */
    for (i = -fdigits; i < idigits; i++) {
        const unsigned int  sum = digitstring_get(src1, i)
                                + digitstring_get(src2, i)
                                + carry;
        digitstring_set(&result, i, sum % RADIX);
        carry = sum / RADIX;
    }
    digitstring_set(&result, idigits, carry);

    /* Trim leading zeroes. */
    digitstring_ltrim(&result);

    /* At this point, we can discard the target, even if it is actually
       one of the sources, and copy the result to it. */
    digitstring_free(to);
    *to = result;
}

where RADIX is the radix used (the number of unique digits, 2 for binary). Pay extra attention to the digit loop. -fdigits is the least significant position in the result, and idigits-1 the most significant position. We need the accessor functions, because the source terms might not contain those digits at all (they are logically zero then).

These functions have been tested to work on both binary and octal number strings. I like this implementation, because it omits the decimal point if all terms are integers (so you get 12 + 33 = 45), but (due to fextra in digitstring_parse()) if any of the terms have a decimal point, then the result will have at least one fractional digit (so 12. + 33 = 45.0).

Minor: The comment "Most significant nonzero integer digit" for `.idigits` would benefit rewording as "Most significant nonzero integer digit offset" as it is not about the digit, but its position. Same for ``.fdigits`. — chux - Reinstate Monica, Nov 27 '18 at 17:57
@chux I tried to implement this into an actual code and I fail every time. could you please write something functional so I could understand better. I managed to do the product and division. I have problems only with addition and subtraction. — John Asimov, Nov 27 '18 at 19:07
@JohnAsimov: The "trick" is that you need to align the terms so that the decimal points are in the same spot, and "logically pad" them with zeroes so they're the same size; and *then* do the addition. If you implement `bs_bit()`, you can use it instead of modifying the input bitstrings. Using `bs_intbits()` and `bs_fractbits()` tells you the bit numbers for your addition loop, as well as the buffer (and decimal point position) you need for the result. — Nominal Animal, Nov 27 '18 at 22:16

score 0 · Answer 2 · answered Nov 27 '18 at 16:35

There are two answers, depending upon whether you desire fixed- or floating- point arithmetic.

The first issue is reading the number. strtol() is your friend here:

char input[BUFFER_SIZE];
char * tmp;
long integral, fractional;
fgets(input, BUFFER_SIZE-1, stdin);
integral = strtol(input, &tmp, 2); /* Read BINARY integral part */
tmp++; /* Pass over the binary point. You may want to check that it is actually a dot */
fractional = strtol(tmp, null, 2); /* Read BINARY fractional part */

The next issue is figuring out how you will do the arithmetic. fractional must be bit-shifted an amount depending on how many digits past the point were provided and your desired precision. Fixed point arithmetic is simple: fractional <<= FRAC_BITS - strlen(tmp) then add the fractional parts together. Mask by ((1<<FRAC_BITS)-1) for the fractional part of the sum, shift the remaining bits and add them to the integral parts for the integral part of the sum. Floating-point is a little more finicky, but not too much harder.

I know I am asking much, but would you be so kind to create a full program? I DO understand the theory, but I don't have enough knowledge to implement it (yes it would be better for me If I tried to do it myself, but I am learning better from written-programs) — John Asimov, Nov 27 '18 at 16:42
StackOverflow is for getting advice, not for getting code written for you. Try writing it yourself using my advice, and if it's still not working, come back with a specific question. — Gold Dragon, Nov 27 '18 at 16:47

Swordfish · Accepted Answer · 2018-11-28T20:47:13.873

After all the beautiful ideas in Animals' answer I felt the strange urge, to present my own solution:

#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

#define MAX(x, y) ((x) > (y) ? (x) : (y))

size_t gpp(char const *s)
{
    char *n = strchr(s, '.');
    return n ? n - s + 1 : 0;
}

char* bin_add(char const *a, char const *b)
{
    char const *inp[] = { a, b };
    size_t ll[] = { strlen(a), strlen(b) };
    size_t pp[] = { gpp(a), gpp(b) };
    size_t OO[2], off[2];

    for (size_t i = 0; i < 2; ++i) {
        OO[i] = pp[i] ? pp[i] - 1 : ll[i];
        pp[i] = pp[i] ? ll[i] - pp[i] : 0;}

    for (size_t i = 0; i < 2; ++i)
        off[i] = OO[i] < OO[!i] ? OO[!i] - OO[i] : 0;

    size_t ML = MAX(OO[0], OO[1]) + MAX(pp[0], pp[1]) + (!!pp[0] || !!pp[1]);
    char *Ol = calloc(ML + 2, 1);
    if(!Ol) return Ol;

    char ops[2];
    int xc = 0;
    size_t lO = ML;
    unsigned cc[2] = { 0 };
    for (size_t i = ML; i; --i) {
        bool pt = false;
        for (size_t l = 0; l < 2; ++l) {
            ops[l] = i <= ll[l] + off[l] && i - off[l] - 1
                       <  ll[l] ? inp[l][i - off[l] - 1] : '0';
            if (ops[l] == '.') {
                if (cc[l]) {
                    free(Ol);
                    return NULL;
                }
                pt = true;
                ++cc[l];
            }
            ops[l] -= '0';
        }
        if (pt) {
            Ol[i] = '.';
            continue;
        }
        if ((Ol[i] = ops[0] + ops[1] + xc) > 1) {
            Ol[i] = 0;
            xc = 1;
        }
        else xc = 0;
        lO = (Ol[i] += '0') == '1' ? i : lO;
    }
    if((Ol[0] = '0' + xc) == '1') return Ol;
        for (size_t i = 0; i <= ML - lO + 1; ++i)
            Ol[i] = Ol[lO + i];

    return Ol;
}

int main(void)
{
    char a[81], b[81];

    while (scanf(" %80[0.1] %80[0.1]", a, b) & 1 << 1) {
        char *c = bin_add(a, b);
        if (!c && errno == ENOMEM) {
            fputs("Not enough memory :(\n\n", stderr);
            return EXIT_FAILURE;
        }
        else if (!c) {
            fputs("Input error :(\n\n", stderr);
            goto clear;
        }

        char* O[] = { a, b, c };
        size_t lO[3], Ol = 0;
        for (size_t i = 0; i < 3; ++i) {
            lO[i] = gpp(O[i]);
            lO[i] = lO[i] ? lO[i] : strlen(i[O]) + 1;
            Ol = lO[i] > Ol ? lO[i] : Ol;
        }

        putchar('\n');
        for (size_t i = 0; i < 3; ++i) {
            for (size_t l = 0; l < Ol - lO[i]; ++l, putchar(' '));
                puts(O[i]);
        }
        putchar('\n');

        free(c);
        clear :{ int c; while ((c = getchar()) != '\n' && c != EOF); }
    }
}

Sample Output:

11001001 .11001001

11001001
        .11001001
11001001.11001001

11001001 1010

11001001
    1010
11010011

111111 1

 111111
      1
1000000

1010101 010111001.0101110101010

  1010101
010111001.0101110101010
100001110.0101110101010

1001001.010111010101 10100101100.10010111101

    1001001.010111010101
10100101100.10010111101
10101110101.111000001111

. .

 .
 .
0

.. .
Input error :(

A
Press any key to continue . . .

I contemplated wheter I should ask for a review at codereview. But I think I schould rather not.

score -2 · Answer 4 · answered Nov 27 '18 at 15:16

-2

For real numbers, convert non-fraction and fraction part to decimal, do the addition and print it as binary. This will require function to convert a number to binary string. Just a note that real numbers are float numbers in C and they are represented in binary with mantessa form like 2e^3 which is 2 multiplied by exponent to the power of 3.

answered Nov 27 '18 at 15:16

anand

163
7

Would you mind helping me do that. I have written functions for the things you said I need, but they don't return a result, but rather string manipulation. – John Asimov Nov 27 '18 at 15:19

How to do binary addition of real numbers?

4 Answers4

Sample Output: