3

The program takes as input an arbitrarily large unsigned integer, expressed as a string in base 10. The output is another string that expresses the same integer in base 16.

For example, the input is "1234567890987654321234567890987654321234567890987654321", and the output shall be "CE3B5A137DD015278E09864703E4FF9952FF6B62C1CB1"

The faster the algorithm the better.

It is very easy if the input is limited to a 32-bit or 64-bit integer; for example, the following code can do the conversion:

#include <stdlib.h>  /* malloc */

#define MAX_BUFFER 16
char hex[] = "0123456789ABCDEF";

char* dec2hex(unsigned input) {
    char buff[MAX_BUFFER];
    int i = 0, j = 0;
    char* output;

    if (input == 0) {
        buff[0] = hex[0];
        i = 1;
    } else {
        while (input) {
            buff[i++] = hex[input % 16];
            input = input / 16;
        }
    }

    output = malloc((i + 1) * sizeof(char));
    if (!output) 
        return NULL;

    while (i > 0) {
        output[j++] = buff[--i];        
    }
    output[j] = '\0';

    return output;
}

The real challenging part is the "arbitrary large" unsigned integer. I have googled, but most of the results only cover conversion within 32-bit or 64-bit integers. I have had no luck so far.

Can anyone give a hint or a link that I can read up on?

Thanks in advance.

Edit: This is an interview question I encountered recently. Can anyone briefly explain how to solve this problem? I know there is the GMP library and I have used it before; however, as an interview question, it requires not using an external library.

yinyueyouge

8 Answers

14
  1. Allocate an array of integers whose number of elements equals the length of the input string. Initialize the array to all 0s.

    This array of integers will store values in base 16.

  2. Add the decimal digits from the input string to the end of the array, one at a time. For each digit, walk the array from the last element to the first: multiply the existing value by 10 and add the carryover, store the new value mod 16 in the array, and set the new carryover to the new value div 16.

    carryover = digit;
    for (i = (nElements-1); i >= 0; i--)
    {
        newVal = (array[i] * 10) + carryover;
        array[i] = newVal % 16;
        carryover = newVal / 16;
    }
    
  3. Print the array, starting at the 0th entry and skipping leading 0s.
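
Read as arithmetic, the steps above evaluate the decimal number by Horner's rule, with the running value held as base-16 digits. The input "255" below is chosen only for illustration; d_1 ... d_n stand for the decimal digits of the input, most significant first:

$$N = (\cdots((d_1 \cdot 10 + d_2) \cdot 10 + d_3) \cdots) \cdot 10 + d_n$$

$$\texttt{"255"}: \quad 2 = \mathtt{0x2}, \qquad 2 \cdot 10 + 5 = 25 = \mathtt{0x19}, \qquad 25 \cdot 10 + 5 = 255 = \mathtt{0xFF}$$

Each multiply-and-add is carried out digit by digit on the array, which is why the per-element value is reduced mod 16 and the quotient is carried to the next more significant element.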


Here's some code that will work. There are probably a few optimizations that could be made, but this should suffice as a quick and dirty solution:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "sys/types.h"

char HexChar [16] = { '0', '1', '2', '3', '4', '5', '6', '7',
                      '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };

static int * initHexArray (char * pDecStr, int * pnElements);

static void addDecValue (int * pMyArray, int nElements, int value);
static void printHexArray (int * pHexArray, int nElements);

static void
addDecValue (int * pHexArray, int nElements, int value)
{
    int carryover = value;
    int tmp = 0;
    int i;

    /* start at the bottom of the array and work towards the top
     *
     * multiply the existing array value by 10, then add new value.
     * keep the remainder mod 16 in place and carry the quotient over
     * as you work back towards the top of the array
     */
    for (i = (nElements-1); (i >= 0); i--)
    {
        tmp = (pHexArray[i] * 10) + carryover;
        pHexArray[i] = tmp % 16;
        carryover = tmp / 16;
    }
}

static int *
initHexArray (char * pDecStr, int * pnElements)
{
    int * pArray = NULL;
    int lenDecStr = strlen (pDecStr);
    int i;

    /* allocate an array of integer values to store intermediate results.
     * one element per input digit is enough: going from base 10 to
     * base 16 never results in a larger number of digits (small values
     * such as "9" or "99" use exactly the same number)
     */

    pArray = (int *) calloc (lenDecStr,  sizeof (int));
    if (pArray == NULL)
    {
        *pnElements = 0;
        return (NULL);
    }

    for (i = 0; i < lenDecStr; i++)
    {
        addDecValue (pArray, lenDecStr, pDecStr[i] - '0');
    }

    *pnElements = lenDecStr;

    return (pArray);
}

static void
printHexArray (int * pHexArray, int nElements)
{
    int start = 0;
    int i;

    /* skip all the leading 0s */
    while ((pHexArray[start] == 0) && (start < (nElements-1)))
    {
        start++;
    }

    for (i = start; i < nElements; i++)
    {
        printf ("%c", HexChar[pHexArray[i]]);
    }

    printf ("\n");
}

int
main (int argc, char * argv[])
{
    int i;
    int * pMyArray = NULL;
    int nElements;

    if (argc < 2 || argv[1][0] == '\0')
    {
        printf ("Usage: %s decimalString\n", argv[0]);
        return (-1);
    }

    pMyArray = initHexArray (argv[1], &nElements);
    if (pMyArray == NULL)
    {
        printf ("Out of memory\n");
        return (-1);
    }

    printHexArray (pMyArray, nElements);

    free (pMyArray);

    return (0);
}
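
Since the question asks for speed, one possible optimization is to work on larger units than a single hex digit. The sketch below is an assumption on my part, not part of the code above: it stores the number as 32-bit limbs and consumes up to nine decimal digits per pass, so the inner loop runs roughly 70 times less often. The name dec2hex_limbs is hypothetical, and the input is assumed to be a non-empty string of decimal digits.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd uppercase hex string for the
 * decimal string dec, or NULL if allocation fails. The caller frees it.
 * Assumes dec is non-empty and contains only '0'..'9'. */
char *dec2hex_limbs(const char *dec)
{
    assert(dec != NULL && dec[0] != '\0');

    size_t n = strlen(dec);
    size_t nlimbs = n / 9 + 1;   /* 9 decimal digits add fewer than 30 bits */
    size_t used = 1, pos = 0, i;
    uint32_t *limb = calloc(nlimbs, sizeof *limb);  /* least significant first */
    char *out = malloc(nlimbs * 8 + 1);
    int w;

    if (limb == NULL || out == NULL) {
        free(limb);
        free(out);
        return NULL;
    }

    while (pos < n) {
        size_t len = (n - pos < 9) ? n - pos : 9;
        uint64_t mul = 1, carry = 0;

        /* Parse the next chunk of up to 9 digits into carry; mul = 10^len. */
        for (i = 0; i < len; i++) {
            carry = carry * 10 + (uint64_t)(dec[pos + i] - '0');
            mul *= 10;
        }
        pos += len;

        /* number = number * mul + chunk, one limb at a time. */
        for (i = 0; i < used; i++) {
            uint64_t t = (uint64_t)limb[i] * mul + carry;
            limb[i] = (uint32_t)t;
            carry = t >> 32;
        }
        if (carry != 0)
            limb[used++] = (uint32_t)carry;
    }

    /* Top limb without padding, remaining limbs as exactly 8 hex digits. */
    w = sprintf(out, "%" PRIX32, limb[used - 1]);
    for (i = used - 1; i-- > 0; )
        w += sprintf(out + w, "%08" PRIX32, limb[i]);

    free(limb);
    return out;
}
```

The work is still quadratic in the input length, only with a much smaller constant; calling dec2hex_limbs on the question's example input should give the same string as the program above.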
hajikelist
  • Nice solution. One "optimization" relative to memory use would be to use bytes (char or unsigned char) for each digit instead of full integers. – Tall Jeff May 14 '09 at 12:35
  • Most interviewers that have asked me this sort of question wanted me to come up with a destructive solution with no extra allocation (after I presented a solution such as this). Is that possible for this question? – Merlyn Morgan-Graham May 01 '11 at 23:05
  • Our natural instinct is to store numbers in base 10 because that's the way our minds work. I like the way you switched that up to store in base 16 instead. – Mark Ransom Apr 18 '22 at 15:48
  • Does this solution have a particular algorithm name? Is it one of those algorithms that belong to a particular class or type of algorithms? If so, what do you call them? – 0xdeadbeef May 06 '22 at 12:00
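
On the question raised in the comments about a destructive solution with no extra allocation: one possible approach, offered as a sketch rather than as anything the answer itself proposes, is to divide the decimal string by 16 repeatedly in place. Each division frees at least one leading position (a single decimal digit is always less than 16), and that position can hold the remainder as a hex character, so one byte per digit is enough. The name dec2hex_inplace is hypothetical, and the input is assumed to be a writable, non-empty string of decimal digits.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: converts the decimal digit string in s to
 * uppercase hex, overwriting s. Assumes s is writable, non-empty and
 * contains only '0'..'9'. */
void dec2hex_inplace(char *s)
{
    static const char hexdigits[] = "0123456789ABCDEF";
    size_t n, k = 0, i, j;
    int nonzero;

    assert(s != NULL && s[0] != '\0');
    n = strlen(s);

    /* Invariant: s[0..k-1] holds hex digits, least significant first;
     * s[k..n-1] holds the remaining quotient in decimal. */
    do {
        int rem = 0;
        nonzero = 0;

        /* Long division of the decimal number in s[k..n-1] by 16. */
        for (i = k; i < n; i++) {
            int cur = rem * 10 + (s[i] - '0');   /* at most 15 * 10 + 9 */
            s[i] = (char)('0' + cur / 16);
            rem = cur % 16;
            if (s[i] != '0')
                nonzero = 1;
        }

        /* s[k] is now '0', so that slot can store the next hex digit. */
        s[k++] = hexdigits[rem];
    } while (nonzero);

    /* The hex digits were produced in reverse order. */
    for (i = 0, j = k - 1; i < j; i++, j--) {
        char t = s[i];
        s[i] = s[j];
        s[j] = t;
    }
    s[k] = '\0';
}
```

The result is never longer than the input, so it always fits in the original buffer. The cost is still quadratic in the number of digits, the same as the array-based version.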
4

I have written an article that describes a simple solution in Python which can be used to transform a series of digits from and to arbitrary number bases. I originally implemented the solution in C, and I didn't want a dependency on an external library. I think you should be able to rewrite the very easy Python code in C or whatever you like.

Here is the Python code:

import math

def incNumberByValue(digits, base, value):
   # The initial overflow is the 'value' to add to the number.
   overflow = value
   # Traverse list of digits in reverse order.
   for i in reversed(range(len(digits))):
      # If there is no overflow we can stop overflow propagation to next higher digit(s).
      if not overflow:
         return
      total = digits[i] + overflow
      digits[i] = total % base
      overflow = total // base

def multNumberByValue(digits, base, value):
   overflow = 0
   # Traverse list of digits in reverse order.
   for i in reversed(range(len(digits))):
      tmp = (digits[i] * value) + overflow
      digits[i] = tmp % base
      overflow = tmp // base

def convertNumber(srcDigits, srcBase, destDigits, destBase):
   for srcDigit in srcDigits:
      multNumberByValue(destDigits, destBase, srcBase)
      incNumberByValue(destDigits, destBase, srcDigit)

def withoutLeadingZeros(digits):
   for i in range(len(digits)):
      if digits[i] != 0:
         break
   return digits[i:]

def convertNumberExt(srcDigits, srcBase, destBase):
   # Generate a list of zeros which is long enough to hold the destination number.
   destDigits = [0] * int(math.ceil(len(srcDigits)*math.log(srcBase)/math.log(destBase)))
   # Do conversion.
   convertNumber(srcDigits, srcBase, destDigits, destBase)
   # Return result (without leading zeros).
   return withoutLeadingZeros(destDigits)


# Example: Convert base 10 to base 16
base10 = [int(c) for c in '1234567890987654321234567890987654321234567890987654321']
base16 = convertNumberExt(base10, 10, 16)
# Output list of base 16 digits as HEX string.
hexDigits = '0123456789ABCDEF'
print(''.join(hexDigits[n] for n in base16))
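
For readers who want the same idea in C, here is a rough sketch of how the conversion might be ported. It is not the original C implementation mentioned above: the names mul_add and convert_digits are hypothetical, the multiply and add passes are fused into one loop, and the caller is assumed to supply a destination array that is long enough.

```c
#include <assert.h>
#include <stddef.h>

/* dest = dest * mult + add, where dest holds n digits in the given
 * base with dest[0] as the most significant digit. */
static void mul_add(int *dest, size_t n, int base, int mult, int add)
{
    int carry = add;
    size_t i;

    for (i = n; i-- > 0; ) {
        int t = dest[i] * mult + carry;
        dest[i] = t % base;
        carry = t / base;
    }
    assert(carry == 0);   /* otherwise dest was too short */
}

/* Converts nsrc digits in src_base (most significant first) into
 * ndest digits in dest_base; unused leading positions stay 0. */
void convert_digits(const int *src, size_t nsrc, int src_base,
                    int *dest, size_t ndest, int dest_base)
{
    size_t i;

    for (i = 0; i < ndest; i++)
        dest[i] = 0;
    for (i = 0; i < nsrc; i++)
        mul_add(dest, ndest, dest_base, src_base, src[i]);
}
```

For a base 10 to base 16 conversion, a destination of the same length as the source is always sufficient; for other base pairs the length can be computed with the logarithm formula used in convertNumberExt.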
Jonny Dee
2

The real challenging part is the "arbitrary large" unsigned integer.

Have you tried using GNU MP Bignum library?

dirkgently
  • yeah i know gmp and used it before; can this problem be solved without using gmp? Or, can you briefly tell how the gmp is designed since the code base is quite large – yinyueyouge May 14 '09 at 05:13
1

You can try this C99 base_convert function, which accepts input of arbitrary length and any bases between 2 and 62:

#include <stdlib.h>
#include <string.h>

static char *base_convert(const char * str, const int base_in, const int base_out) {
    static const char *alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    size_t a, b, c = 1, d;
    char *s = malloc(c + 1);
    if (!s) return NULL;
    strcpy(s, "0");  /* strcpy_s is an optional C11 Annex K function, not C99 */
    for (; *str; ++str) {
        for (a = (char*)memchr(alphabet, *str, base_in) - alphabet, b = c; b;) {
            d = ((char *)memchr(alphabet, s[--b], base_out) - alphabet) * base_in + a;
            s[b] = alphabet[d % base_out];
            a = d / base_out;
        }
        for (; a; s = realloc(s, ++c + 1), memmove(s + 1, s, c), *s = alphabet[a % base_out], a /= base_out);
    }
    return s;
}

Example usage:

#include <stdio.h>

int main() {
    char * res = base_convert("12345678909876543212345678909876"
                              "54321234567890987654321", 10, 16);
    puts(res);
    free(res);

    // print CE3B5A137DD015278E09864703E4FF9952FF6B62C1CB1
}

Example output :

'11100100100011101011001001110110001101001001100010100001111011110011000010'
 from base 2 to base 58 is 'BaseConvert62'.

'NdN2mbALtnCHH' from base 60 to base 59 is 'StackOverflow'.

Tested with your example and Fibonacci(1500000).

Thank You.

Community
Michel
1

Here's a BigInt library:

http://www.codeproject.com/KB/cs/BigInt.aspx?msg=3038072#xx3038072xx

No idea if it works, but it's the first one I found with Google. It appears to have functions to parse and format big integers, so they may support different bases too.

Edit: Ahh, you're using C, my mistake. But you may be able to pick up ideas from the code, or someone using .NET may have the same question, so I'll leave this here.

Daniel Earwicker
1

Unix dc is able to do base conversions on arbitrarily large integers. The OpenBSD source code is available here.

mouviciel
0

Python:

>>> input = "1234567890987654321234567890987654321234567890987654321"
>>> output = hex(int(input))[2:].upper()
>>> print(output)
CE3B5A137DD015278E09864703E4FF9952FF6B62C1CB1
Stan Graves
  • The OP stated this was an interview question. When faced with a "dumb" interview question, I (almost) always respond with a "dumb" answer. It tends to start the next part of the conversation. The OP chose C as the language...but never mentioned that was "required" in the answer. Most interview questions that are "algorithmic" in nature tend to be implementation independent. The question's use of "arbitrarily large" on the input implies (to me) that the solution will be based on text/char manipulation...so I picked a language with reasonable text manipulation builtins. – Stan Graves Aug 11 '13 at 03:46
  • In my opinion, the spirit of the question and the tags given imply that an acceptable solution will not make use of an arbitrarily sized numeric type, whether it is built in, or a library. The OP's question also makes this restriction clear. If your suggestion is to rebel against "dumb" interview questions, then that's a comment rather than an answer. If your suggestion is to start with a valid, simple, real-world solution, then work through more restrictive solutions (e.g. you need this utility to work on a microcontroller w/ just a C runtime, and no libraries), then this is only step 1 of 3. – Merlyn Morgan-Graham Aug 12 '13 at 20:23
  • @MerlynMorgan-Graham At the same point, the question didn't really ask for any particular language. – ArtOfWarfare Jan 27 '14 at 06:08
  • @ArtOfWarfare: It says C in the tags. The statement "however as an interview question it requires not using external library" implies the level of abstraction that they're looking for, which when talking about C is effectively pointer arithmetic on processor bytes or words. To further pronounce my ability to read the mind of the interviewer (/tongue-in-cheek), they ask questions like this specifically to make sure you "get" pointers. – Merlyn Morgan-Graham Jan 27 '14 at 18:24
  • @MerlynMorgan-Graham - I agree you're right about what the OP needed, but there's no need to down vote this answer. Many people, like myself, will arrive at this when searching for an algorithm for converting bases. I wasn't interested in the details of the language so much as to see ideas for how to do this when you're not sure how long the number will be. A searcher who is using Python will likely find this useful - regrettably, my project is in Obj-C, not Python. – ArtOfWarfare Jan 27 '14 at 18:29
  • @ArtOfWarfare: I was more aggressive back then. I fully agree with you now, and actually went to go remove my downvote before your last comment here :) It's unfortunately locked in until Stan edits it. – Merlyn Morgan-Graham Jan 27 '14 at 18:30
0

Here is the above-mentioned algorithm implemented in JavaScript:

function addDecValue(hexArray, value) {
  let carryover = value;
  for (let i = (hexArray.length - 1); i >= 0; i--) {
    let rawDigit = ((hexArray[i] || 0) * 10) + carryover;
    hexArray[i] = rawDigit % 16;
    carryover = Math.floor(rawDigit / 16);
  }
}
    
function toHexArray(decimalString) {
  let hexArray = new Array(decimalString.length);
  for (let i = 0; i < decimalString.length; i++) {
    addDecValue(hexArray, Number(decimalString.charAt(i)));
  }
  return hexArray;
}

function toHexString(hexArray) {
  const hexDigits = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
  let result = '';
  for (let i = 0; i < hexArray.length; i++) {
    if (result === '' && hexArray[i] === 0) continue;
    result += hexDigits[hexArray[i]];
  }
  return result || '0';  // an all-zero array (e.g. input '0') would otherwise give ''
}
    
toHexString(toHexArray('1234567890987654321234567890987654321234567890987654321'));
0xdeadbeef