1

I came across this strange loop behaviour in C. Here is the code:

char arr[SIZE];
fgets(arr, SIZE, stdin);
char brr[SIZE];
int i;
for(i=0; arr[i] != '\0'; i++){
    if(arr[i] != ' ') brr[i] = (arr[i] + shift - 'a')%26 + 'a';
    else brr[i] = ' ';
}
brr[i-1] = '\0';

The code is for Caesar cipher. It takes some text from the user and stores it in arr, then encodes it by shifting the characters, and this encoded version is put in brr. So, if user enters 'car' and shift is 2, brr is 'ect'.

The problem is in the last line. Normally, it should be brr[i] = '\0', but for some reason the loop will increase I once more after the condition is wrong. I have tried different combinations and it seems like the problem is somehow related to fgets.

Also, the problem is not specific to for loop, while behaves exactly the same.

Edit: Thanks for all the answers. I see now that I should have taken '\n' into account.

ahmet
  • 13
  • 4
  • 3
    This code does not account for the `'\n'` that `fgets()` also puts into the buffer. It attempts to "encipher" the newline character as if it were an alphabetic character. – Fe2O3 Jan 27 '23 at 20:23
  • You want `brr[i] = '\0'`. The loop ends when `arr[i] == 0`, and `i` does not increment at that point. – Kaz Jan 27 '23 at 20:24
  • 1
    "_The problem is in the last line_" In fact, the problem is presuming the plaintext is only lowercase alpha and SP. There's no provision for UPPERCASE, punctuation or digits, for instance... – Fe2O3 Jan 27 '23 at 20:33
  • @Fe2O3 I was trying to point where the problem shows itself, but you are right, I should had put it more clearly. – ahmet Jan 27 '23 at 22:09

1 Answers1

1

somehow related to fgets.

Always test the return values of input functions.

// fgets(arr, SIZE, stdin);
if (fgets(arr, SIZE, stdin)) {
  // process arr[]
  ...

if user enters 'car' and shift is 2,

This is unlikely to only typed 'car' (3 characters). Certainly the user typed c a r Enter (4 characters) which filled arr[SIZE] with "car\n".

Rather than encode the 4 characters with if(arr[i] != ' '), test if arr[i] is a letter. Research isalpha().

Code could exit the for() loop early if '\n' is detected.

// for(i=0; arr[i] != '\0'; i++){
for(i=0; arr[i] != '\0' && arr[i] != '\n'; i++){
  ...
}
// brr[i-1] = '\0';
brr[i] = '\0';

Advanced

brr[i-1] = '\0'; is certainly a problem if i==0, possible if the first character read is a null character.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • Thanks for this great answer. 'isalpha()' is also a really necessary part. But advanced part is automatically solved when `&& arr[i] != '\n'` is added, isn't it? – ahmet Jan 27 '23 at 23:04
  • @ahmet Some insight in [short-circuit evaluation](https://stackoverflow.com/questions/45848858/what-is-short-circuit-evaluation-in-c) will do a lot of good. It's just a regular `for` loop that breaks as soon as a condition evaluates `false`. – Ted Lyngmo Jan 28 '23 at 01:16
  • @ahmet `brr[i-1] = '\0';` remains a problem. Why would you want to remove the `char` before the `'\n'` when `arr[i] != '\n'` is true? Further, are you assuming the buffer from `fgets()` always has a `'\n'` in it? – chux - Reinstate Monica Jan 28 '23 at 03:23
  • @chux-ReinstateMonica I meant that now that i has the correct value when it leaves the loop, I change `brr[i-1] = '\0'` to `brr[i] = '\0'`. So it no longer would try to reach `brr[i-1]` if the first element is null. On `fgets()`, well, I will always have either `\0` or `\n` by the definiton of `fgets()`, won't I? – ahmet Jan 28 '23 at 11:21
  • @ahmet When `fgets()` returns non-`NULL`, the buffer will always have a `'\0'`. It might have `'\n'` too. It often will. – chux - Reinstate Monica Jan 28 '23 at 11:23
  • @chux-ReinstateMonica Then, I am good to go as long as I check the return by `fgets()` as you mentioned in the original answer. Thanks for all your efforts, chux and Ted. – ahmet Jan 28 '23 at 11:52