Tabs converter C

Question

I am trying to write a program that converts every run of 4 spaces to a tab, but it has a bug I can't find.

Here is the code:

#include <stdio.h>
#define TABVALUE 4
#define ARRAYSIZE 3
int c, d, s;
int savedChars[ARRAYSIZE];

void emptyArray(int *a);

int main(void) {
    c = s = d = 0;
    while ((c = getchar()) != EOF) {
        if (c == ' ') {
            s = 0;
            s++;
            for (int j = 0; j < TABVALUE - 1; j++) {
                d = getchar();
                if (d != EOF) {
                    savedChars[j] = d;
                    savedChars[j + 1] = '\0';
                    if (d == ' ') {
                        s++;
                    } else {
                        break;
                    }
                }
            }
            if (s == TABVALUE) {
                emptyArray(savedChars);
                putchar('\t');
                s = 0;
            } else {
                putchar(c);
                for (int i = 0; i < 3; i++) {
                    if (savedChars[i] != '\0') {
                        putchar(savedChars[i]);
                    }
                }
            }
        } else {
            if (c != EOF) putchar(c);
        }
    }
    return 0;
}

void emptyArray(int *a) {
    for (int i = 0; i < ARRAYSIZE; i++) {//The bug is not in this function, i guess
        a[i] = '\0';
    }
}

Input:

2 spaces then character d 4 spaces then character d

Output:

2 spaces then character d 1 space then character d

But when I just add putchar(s) or anything like printf("a") before:

if (s == TABVALUE) {
                emptyArray(savedChars);
                putchar('\t');
                s = 0;
            }

then the output is this:

a 2 spaces d a tab d

Why is tab working when I print something? I am really confused...

`void emptyArray(int *a){ ... sizeof( a )...` And what is `sizeof(int *)`? — Andrew Henle, Oct 02 '16 at 15:22
Your `emptyArray` function would not work as `sizeof(a)` would always return 4 or 8 depending on the bit width of your OS. — alk, Oct 02 '16 at 15:23
Possible duplicate of [c, finding length of array inside a function](http://stackoverflow.com/questions/17590226/c-finding-length-of-array-inside-a-function) — Andrew Henle, Oct 02 '16 at 15:23
So just replace `emptyArray(savedChars);` with `memset(savedChars, 0, sizeof savedChars);` — alk, Oct 02 '16 at 15:24
@AndrewHenle How can it be a duplicate? I didn't ask you to write the entire code for me, just to find the bug in **my code**. Since it's my code, it can't be a duplicate. — Silidrone, Oct 02 '16 at 15:26
But I am still getting the same error, even if I change the emptyArray function. — Silidrone, Oct 02 '16 at 15:29
Please *add* the change as update to your question. Do not delete any parts of the original code when doing so. — alk, Oct 02 '16 at 15:31
[Read this](http://stackoverflow.com/help/on-topic): Questions *seeking debugging help ("why isn't this code working?")* must include the desired behavior, a specific problem or error and the shortest code necessary to reproduce it in the question itself. Questions without a clear problem statement are not useful to other readers. See: [How to create a Minimal, Complete, and Verifiable example.](http://stackoverflow.com/help/mcve) — Andrew Henle, Oct 02 '16 at 15:31
So I can't post simple code with a bug and ask someone to find it? I did that before on Stack Overflow with code more complicated than this, and no one said this to me. They just found the bug. — Silidrone, Oct 02 '16 at 15:34

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

Now you know that JetBrains doesn't always help, you should learn about tools like od -c or xxd -g 1 or some similar byte dump program so that you can see exactly what's in the output and avoid running into problems that turn out to be in your environment rather than the program you show on SO.

If I have some input text in input.txt, do I run it like this?
./a.out<input.txt | od -cb

Yes. You might get output like this:

$ ./a.out <input.txt | od -cb
0000000            d  \t   d
          040 040 144 011 144
0000005
$

The leading 0000000 is the byte offset in the file. This is followed by two spaces (not very visible), a letter d, a tab and another d with no newline. The trailing 0000005 indicates that the total length of the file is 5 bytes. If you have a file with kilobytes or megabytes of data, the offset can be important. If you're trying to understand the binary structure of a file, the offsets can be important. For a 5-byte file, they're not very relevant.

The second line of triple digits is the octal representation of each character shown in the previous line (triggered by the -b option). Thus 040 is 32 decimal, or 0x20 hexadecimal, and is the code for a space; 144 corresponds to d, and 011 or 9 decimal corresponds to tab \t.

The output from xxd is loosely equivalent:

$ ./a.out < input.txt | xxd -g 1
00000000: 20 20 64 09 64                                     d.d
$

It shows the hex codes instead of the octal codes, and prints two spaces, a d, a dot (because the tab is not a graphic character) and another d.

Tools like these can be very useful for telling you about what data is in a file. There are other tools that can help, including sed l (equivalent to sed -e 'l') on all platforms, and cat -v on some platforms. Different tools have different emphases.

score 0 · Answer 2 · answered Oct 02 '16 at 15:58

It has something to do with JetBrains. I was looking at output.txt (the file where I saved the output) in JetBrains, and it showed a space. But when I opened it in plain Notepad, it was a tab. That's the only problem; I think there is no problem in the code.

Silidrone

Now you know that JetBrains (whatever that is) doesn't always help, you should learn about tools like `od -c` or `xxd -g 1` or some similar byte dump program so that you can see exactly what's in the output and avoid running into problems that turn out to be in your environment rather than the program you show on SO. Stuff happens; don't worry too much about it this time, but please learn for the future. – Jonathan Leffler Oct 02 '16 at 17:28
OK, so I have some input text in input.txt. Do I do it like this: `./a.out<input.txt | od -cb`? – Silidrone Oct 03 '16 at 15:39
I get weirdish things: `0000000 d \t d 040 040 144 011 144 0000005` – Silidrone Oct 03 '16 at 15:39
I didn't have the `-b` option to `od` in mind, but if that's what you want to use, it is fine. SO Markdown replaces multiple spaces with one space in a comment. Your file seems to contain 5 bytes; space, space, d, tab, d (no newline at the end). Using a tool like that shows you the bytes in your 'file' (output, whatever) and can help you understand what is going on. – Jonathan Leffler Oct 03 '16 at 15:49
Why does it print `0000000` at the start? Why does it print `040 040 144 011 144`? The only thing I think I understand is the `0000005` at the end: it's the number of bytes in the file. Could you explain the rest? – Silidrone Oct 03 '16 at 15:55
The 0000000 at the front is the byte offset into the file. The 0000005 at the end is the offset of the end. If you have a file with kilobytes or megabytes of data, the offset can be important. If you're trying to understand the binary structure of a file, the offsets can be important. For a 5-byte file, they're not very relevant. Try doing `od -cb a.out`, for example. – Jonathan Leffler Oct 03 '16 at 15:57
Thanks for the explanation. I guess the numbers `040 040 144 011 144` are the octal ASCII codes: 040 for space, 144 for d and 011 for tab. So it turns out my code works fine :) Really helpful comment! You can even post it as an answer if you want :) – Silidrone Oct 03 '16 at 16:10
