1

I would like to convert a very long (arbitrary length, possibly 1000 characters long; university assignment) string into binary. How should I approach this problem? I have thought about it for a while, but I just can't seem to think of anything viable.

The string will be passed to me as const char *str. I want to read the number, which will be in Base 10, and convert it into binary.

Should I read certain number of least significant numbers and store them in unsigned long long int, and then work from there? Is there a better solution? I don't know how the method I suggested would pan out. I wanted to know if there's a better/easier way to do it.

Thank you.

Smile
  • 31
  • 4
  • When you say `string`, do you actually mean "a string which represents a _number_ in a _specific base_"? – Captain Trojan Aug 20 '22 at 08:48
  • Yes, that's right, @CaptainTrojan. – Smile Aug 20 '22 at 09:26
  • 1
    [Long division](https://en.wikipedia.org/wiki/Long_division) by `2` (save the remainder for the binary representation) is pretty straight forward :) `"160000000000000000000" / 2 ==> "80000000000000000000"` – pmg Aug 20 '22 at 11:22
  • 2
    Can you convert a decimal number to binary with a pencil and paper? If so, just do the exact same thing in C. If not, you need to figure out that pencil and paper thing first. – n. m. could be an AI Aug 20 '22 at 11:29
  • 1
    It's all about implementing arbitrary precision multiplication by 10 (/ a fixed-precision number). You'll need that to read the number in (into a standard binary representation where a bit at a certain position corresponds to some corresponding power of two). Once it's read in, printing it in base 2 (or any power-of-two base) will be trivial (just print successive bitgroups where bitgroup size is lg2(the_power_of_two_base), i.e., 1 for 2, 2 for 4, 3 for 8, etc.; so for base two you just print successive bits, in reverse if you use a little endian representation for your bignum). – Petr Skocik Aug 20 '22 at 11:34
  • @sBot You say pure nonsense - have a look at https://stackoverflow.com/questions/117293/use-of-const-for-function-parameters – Aaron Aug 20 '22 at 12:17

3 Answers3

0

Assuming your input is too large for the biggest integer type, you have to convert it to a unlimited size integer. For this purpose you can use gmplib. If you are not allowed to use external libraries, you can use a different approach:

  • is your string divisible by two (look at the last digit)?
    • if yes, write 0 to left side of your output
    • else, write 1 to left side of your output
  • divide the string by 2 (every digit)
  • repeat while string is not filled with 0

I am going to edit this answer, as soon as I wrote the code.

Here you go:

#include<stdbool.h>
#include<stdlib.h>
#include<memory.h>
#include<stdio.h>

typedef struct char_queue {
    unsigned int len;
    unsigned int capacity;
    char* data;
} char_queue;

char_queue init_char_queue() {
    return (char_queue) {
        0,
        4096,
        malloc(4096)
    };
}

void enqueue(char_queue* queue, char val) {
    if (queue->len == queue->capacity) {
        char* new_queue_data = malloc(queue->capacity + 4096);
        memmove(new_queue_data, queue->data, queue->capacity);
        free(queue->data);
        queue->data = new_queue_data;
    }
    queue->len++;
    queue->data[queue->capacity - queue->len] = val;
}

char* queue_get_arr(char_queue* queue) {
    char* output = malloc(queue->len);
    memcpy(output, &queue->data[queue->capacity - queue->len], queue->len);
    return output;
}

void free_char_queue(char_queue* queue) {
    if (queue->data) free(queue->data);
}

void convert_to_digit_arr(char* input, unsigned int len) {
    for (unsigned int i = 0; i < len; i++) {
        input[i] = input[i] - '0'; // '5' - '0' = 5
    }
}

bool is_null(char* input, unsigned int len) {
    for (unsigned int i = 0; i < len; i++) {
        if (input[i] != 0) return false;
    }
    return true;
}

bool divisible_by_two(char* digit_arr, unsigned int len) {
    return digit_arr[len - 1] % 2 == 0;
}

void divide_by_two(char* digit_arr, unsigned int len) {
    for (unsigned int i = 0; i < len; i++) {
        bool is_odd = digit_arr[i] % 2 == 1;
        digit_arr[i] /= 2;
        if (is_odd && i + 1 < len) { // and is not last (right) digit
            digit_arr[i + 1] += 10;
        }
    }
}

int main(int argc, char** argv) {
    for (int i = 1; i < argc; i++) {
        unsigned int input_len = strlen(argv[i]);
        char* input = malloc(input_len + 1);
        strcpy(input, argv[i]);
        convert_to_digit_arr(input, input_len);
        char_queue queue = init_char_queue();
        enqueue(&queue, 0); // null terminator to use the queue content as a string
        while (!is_null(input, input_len)) {
            enqueue(&queue, divisible_by_two(input, input_len) ? '0' : '1');
            divide_by_two(input, input_len);
        }
        free(input);
        char* output = queue_get_arr(&queue);
        printf("%s\n", output);
        free(output);
        free_char_queue(&queue);
    }
}

This is not the fastest approach, but it is very simple. Also feel free to optimize it.

Tenobaal
  • 637
  • 1
  • 16
0

How do I convert a really long string (as decimal characters) to binary?

Let us look at printing this.

print2(s)

If the decimal string is at least "2",
__ Divide the decimal string by 2 and notice its remainder.
__ Recursively call print2(s)
__ Print the remainder.
Else print the string.


Example code:

#include  <stdio.h>

unsigned decimal_string_divide(char *dividend, unsigned divisor) {
  // Remove a potential leading '0'
  if (*dividend == '0') {
    memmove(dividend, dividend+1, strlen(dividend));
  }

  // "divide", like we learned in grade school.
  unsigned remainder = 0;
  while (*dividend) {
    unsigned sum = remainder*10 + (*dividend - '0');
    remainder = sum%divisor;
    *dividend = sum/divisor + '0';
    dividend++;
  }

  return remainder;
}

void decimal_string_print_binary(char *dividend) {
  //printf("<%s>\n", dividend); fflush(stdout);
  if (dividend[0]) {
    // If at least 2 digits or at least "2"
    if (dividend[1] || (dividend[0] >= '2')) {
      unsigned bit = decimal_string_divide(dividend, 2);
      decimal_string_print_binary(dividend);
      printf("%c", bit + '0');
    } else {
      printf("%c", *dividend);
    }
  }
}

void decimal_string_print_2(const char *dividend) {
  printf("%-25s", dividend);
  size_t sz = strlen(dividend) + 1;
  char buf[sz];  // Use a VLA or allocate memory
  strcpy(buf, dividend);
  decimal_string_print_binary(buf);
  printf("\n");
}

Test

int main(void) {
  decimal_string_print_2("0");
  decimal_string_print_2("1");
  decimal_string_print_2("42");
  decimal_string_print_2("8675309");
  decimal_string_print_2("18446744073709551615");
}

Output

0                        0
1                        1
42                       101010
8675309                  100001000101111111101101
18446744073709551615     1111111111111111111111111111111111111111111111111111111111111111

To instead convert the string from decimal form into a binary string, allocate sufficient buffer (about log10(2) times string length) and instead of printing above, save to the buffer. Left for OP to do.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
-4

I am suggesting a better approach. whereby any arguments passed to a function that is not intended to be mutated, be received as a "const", and a local pointer be used to access the data of this "const".

IE:

void to_binary(const char *str) {

   char *ptr = str;

   ...

Then use ptr.

I know that in this case, my argument is purely trivial and academic, but it is a good practice to get used to and may save you many headaches in the future.

Also, when dealing with binary data, use "unsigned char", to ensure that no type conversions are used. You will need bit 7 if the data is not ASCII or alike.

sBot
  • 105
  • 1
  • 7
Aaron
  • 3
  • 1
  • 2
    This (1) does not answer the question and (2) is pure nonsense. – n. m. could be an AI Aug 20 '22 at 11:26
  • With regards to your point (2), have a look at: https://stackoverflow.com/questions/117293/use-of-const-for-function-parameters with 502 up votes. – Aaron Aug 20 '22 at 12:21
  • 1
    `char *ptr = str;` will give you a warning. "*warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]*" gcc 9.4.0 – Tenobaal Aug 20 '22 at 12:59
  • Yes it will - that's the point I was trying to get across! If you meant for *str to NOT be changed, then set it as const so that the compiler will warn you if you try to modify it. – Aaron Aug 20 '22 at 13:33
  • 1
    I recommend to learn the difference between `const char *` and `char * const`, it is essential for understanding the material you have linked. – n. m. could be an AI Aug 20 '22 at 13:44