
I was programming in CLion 2021.1.2 and got some unrecognized characters in the output of a program that uses getchar() and strings. The goal of the program is to copy its input, replace each run of one or more consecutive blanks (i.e. ' ') with a single blank, and print the result. The output contained unrecognized characters shown as diamond-boxed question marks, and I don't understand why. Below are my code and two sample input-output pairs for reference:

My Code:

#include <stdio.h>

int main() {

  int c, i = 0;
  char s[100], g; // i am restricting the length of the string to 100

  while ((c = getchar()) != EOF) {
      if (i == 0)
      {
          i++;
          g = (char) c;
          s[0] = (char) c;
          continue;
      }
      if ((c == ' ' ) && (g == ' ')) 
      {
          continue;
      }

      s[i] = (char) c;
      g = (char) c;
      i++;
  }
  printf("%s\n", s);
  return 0;
}

Input 1:

Hello, This is me.     Welcome
Hi      Hello hello
Just Kidding   This is me
123  456 789 111^D

Output 1:

Hello, This is me. Welcome
Hi Hello hello
Just Kidding This is me
��������������������������������������B

Input 2:

123 456   789 abc
\n \t 123 145 *&$&)$@
1234567805018308513
^D

Output 2:

123 456 789 abc
\n \t 123 145 *&$&)$@
1234567805018308513
����������������������������������������������:

The ^D in the input indicates my use of Ctrl+D for EOF to be read by getchar().
As expected, the extra blanks have been removed from the input while returning the output, but these unrecognized characters also get printed, which confuses me.
The number of unrecognized characters seems to change from run to run, and the last character (the one right after all the diamond-boxed question marks) is a recognized character, but it is also unwanted.

I have a few questions regarding this:

  1. Why is this happening? Could it have anything to do with the limit of 100 that I placed on the length of the string?
  2. Is this something to do with the IDE, or my algorithm?
  3. How exactly does the copying of values take place? Are additional characters present in the functioning of string addition?
  4. Can this problem be rectified with the help of another method or function?

Thank you, any help regarding this would be appreciated.

rev0
  • Strings in C need to be NUL terminated. Add `s[i] = '\0';` before the `printf`. – kaylum Jun 11 '21 at 03:54
  • Are you sure you should be using `Ctrl+D`? Refer to https://stackoverflow.com/questions/4358728/end-of-file-eof-in-c/4358765#4358765 – Kitswas Jun 11 '21 at 04:01
  • @kaylum Thank you, I did that, but I still get one unrecognized character in the output, irrespective of the length of the input. The additional recognized character as mentioned in my question has been removed too, when the string is NULL Terminated. – rev0 Jun 11 '21 at 04:14
  • @PalLaden `Ctrl+D` seems to be the only key pair that responds to my input and yes, it works well by ending input right at that point. In the link you have provided, the answer to that question says that it's `Ctrl+D` for Unix-based systems and `Ctrl+Z` for Windows systems. I use a Windows system and `Ctrl+Z` does nothing. Maybe it's something to do with the IDE? – rev0 Jun 11 '21 at 04:17

1 Answer

As @kaylum pointed out, you must terminate your string with '\0' before printing it, because printf's %s keeps reading until it finds one. It is also good practice to give your variables meaningful names. The continue statements are not needed when an else (or a single combined condition) works equally well. And since the string has a fixed maximum length, you should do a bounds check. Perhaps you want something like this:

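#include <stdio.h>

int main(void)
{
    int chr, idx = 0;
    char str[100], last_chr = '\0';

    /* stop at EOF, or when only the slot for the terminator is left */
    while (idx < 99  &&  (chr = getchar()) != EOF) {
        /* copy unless this is a blank directly after a blank */
        if (last_chr != ' '  ||  chr != ' ')
            last_chr = str[idx++] = (char) chr;
    }
    str[idx] = '\0';   /* terminate the string before printing it */
    printf("%s\n", str);
    return 0;
}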

Note that initializing last_chr (your old g) to a non-space value eliminates the need for another test in your loop.

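BTW, the diamond-question-mark is the Unicode replacement character (U+FFFD). It is displayed when the bytes being printed do not form a valid character in the console's encoding; here those bytes are most likely the uninitialized contents of s after the end of your input.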

SGeorgiades
  • Thank you, this was well explained and it solved my problem, I've understood it. I think I made an error by incrementing i by 1 (using `i++`) and then using `str[idx] = '\0'` after trying out what @kaylum commented earlier. Also, thanks for explaining the diamond-question-mark character. – rev0 Jun 11 '21 at 05:37