0

I am writing a program which convert decimal number to roman number. I use 4 arrays thousands, hundreds, tens, units to store digits in roman number, then copy each digit to res array, I use str pointer to track where the string in res begin. When I test with input 128, it prints out CXXVIIIIX which must be CXXVIII. I have tried to debug and got the result when tmp=8 is strlen(units[tmp-1]) = 6, which means strlen also counts IX. And for some case like 3888, the program prints out trash value.

This is my code

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <stdbool.h>
#include <string.h>
#include <windows.h>

int main(){
    int n=128;

    char thousands[][4]={"M","MM","MMM"};
    char hundreds[][5]={"C","CC","CCC","CD", "D", "DC", "DCC", "DCCC", "CM"};
    char tens[][5]={"X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
    char units[][5]={"I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};

    char res[16];
    res[15]='\0';    //add nul to the last character of res array
    char *str=res+14;   //str tracks where the string start
    int tmp;
    
    
    //store roman digits in res array in reverse order, 
    //start from units->tens->hundreds->thousands
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(units[tmp-1]);  //str steps back several address to store new digits
            strncpy(str, units[tmp-1], strlen(units[tmp-1]));   //use strncpy to copy without '\0'
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(tens[tmp-1]);
            strncpy(str, tens[tmp-1], strlen(tens[tmp-1]));
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(hundreds[tmp-1]);
            strncpy(str, hundreds[tmp-1], strlen(hundreds[tmp-1]));
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(thousands[tmp-1]);
            strncpy(str, thousands[tmp-1], strlen(thousands[tmp-1]));
            
        }
        n/=10;
    }
    
    printf("%s", str);
    return 0;
}

So why does this happen and how to fix it?

Any help would be appreciated.

Becker
  • 147
  • 7
  • 6
    The string `"LXXX"` needs *five* characters to include the null-terminator! You have the same problem in the other arrays as well. – Some programmer dude Jan 10 '22 at 10:23
  • There's also a case where [`strncpy`](https://en.cppreference.com/w/c/string/byte/strncpy) will not add the null-terminator. If you don't have a null-terminator in your string, it can't be used by the standard C string functions as they will go out of bounds and give you *undefined behavior*. – Some programmer dude Jan 10 '22 at 10:25
  • @Someprogrammerdude I have change the boundary of `units`, `tens`, `hundreds` to `5`, `thousands` to `4` and it solve for `n=128`, but when `n=3888` it still prints out trash value. How do I solve it? – Becker Jan 10 '22 at 10:31
  • @Someprogrammerdude I use `strncpy` to not copy the '\0' to `res`, and I also add '\0' at `res[15]` at the beginning. So is it good enough? – Becker Jan 10 '22 at 10:35
  • 3
    `strncpy` is a dangerous function that shouldn't be used for any purpose. See [Is strcpy dangerous and what should be used instead?](https://software.codidact.com/posts/281518) – Lundin Jan 10 '22 at 10:55
  • 1
    @Becker Please [edit] your question and copy&paste your modified code instead of describing what you changed. You have to add `'\0'` immediately after the last valid character. If your string has less than 15 characters, `res[15]='\0'`; is not correct. You could use `res[15]={0};` to fill all elements of `res`. – Bodo Jan 10 '22 at 10:55
  • correction: `char res[16] = {0};` would initialize the whole array with 0. – Bodo Jan 10 '22 at 11:08

2 Answers2

2

After fixing the array size for all string literals, there is an off-by-one error and an uninitialized char array element.

  • char res[16]; would be enough for "MMMDCCCLXXXVIII" with a trailing '\0'.
  • With res[15]='\0'; you store the string termination in the last element.
  • char *str=res+14; sets the pointer to the uninitialized character before this.
  • str-=strlen(something) moves the pointer according to the length of the string to insert. The following strncpy will not overwrite the uninitialized character.

The result will always contain an uninitialized trailing character which may or may not be visible.

A result of the maximum length ("MMMDCCCLXXXVIII") will begin one character before the first array element because of the trailing uninitialized character. You would have a result string "MMMDCCCLXXXVIII*" where * is uninitialized. The second M is the first array element of res, the first M would be one before the first element.

Example to demonstrate the off-by-one error:

Based on the original (wrong) code from the question.

Note that this is not the full code but a reduced extract intended to show how the variable values change with input value n=3888. (To make the code shorter, the if (tmp!=0) guards from the original code are missing here.)

In the comments that show variable values, * is used to indicate an uninitialized character.

int n=3888;

char res[16];
// * indicates an uninitialized character
res[15]='\0';  // "***************\0"

// The following initialization is wrong.
char *str=res+14; // str = &res[14] (the last uninitialized character)
int tmp;

// The following block leaves the last character of res uninitialized.
tmp=n%10; // tmp = 8
str-=strlen(units[tmp-1]); // "VIII" (strlen = 4) -> str = &res[10]
strncpy(str, units[tmp-1], strlen(units[tmp-1])); // "**********VIII*\0"
n/=10; // n = 388

// Here, str points to the first character written by the previous block.
tmp=n%10; // tmp = 8
str-=strlen(tens[tmp-1]); // "LXXX" (strlen = 4) -> str = &res[6]
strncpy(str, tens[tmp-1], strlen(tens[tmp-1])); // ******LXXXVIII*\0"
n/=10; // n = 38

/* ...*/

// final result would be

// str = &res[-1]
// str -> "MMMDCCCLXXXVIII*\0"
// res =   "MMDCCCLXXXVIII*\0"

// The last strncpy tries to write a character 'M' one 
// element before the beginning of the array res
// corresponding to strncpy(res[-1], "MMM", 3)

As a fix I propose

char res[16] = {0}; // initialize the whole array with 0
char *str=res+(sizeof(res)-1);
Bodo
  • 9,287
  • 1
  • 13
  • 29
1

... and how to fix it?

  • Consider writing units in 1,000s, then 100s, tens, ones order.

  • Add a "zero" to each unit list.

  • Roll into a helper function.

  • Change types from character arrays to arrays of pointers.

Example

void print_roman(int n) {
  static const char *thousands[] = {"", "M", "MM", "MMM"};
  static const char *hundreds[] = {"", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"};
  static const char *tens[] = {"", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
  static const char *ones[] = {"", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};
  static const char **units[] = {thousands, hundreds, tens, ones};
  int units_n = sizeof units / sizeof units[0];

  assert(n > 0 && n < 4000);
  char res[16];
  char *p = res;
  int multiplier = 1000;
  for (int i = 0; i < units_n; i++) {
    strcpy(p, units[i][n / multiplier]);
    p += strlen(p);
    n %= multiplier;
    multiplier /= 10;
  }
  printf("<%s>\n", res);
}

int main() {
  print_roman(3888);
  print_roman(3999);
  print_roman(1010);
  print_roman(42);
}

Output

<MMMDCCCLXXXVIII>
<MMMCMXCIX>
<MX>
<XLII>
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256