0

I have written a simple program that reads input data from a file. I tried to skip spaces (32) and newlines (10), but the output is much shorter than expected.

File program

#include <stdio.h>
int main() 
{
    FILE *f;
    char c;
    unsigned char buf[512],i=0;
    printf("in f_read\n");
    f = fopen("S:\\db\\result-enc.txt","rb");
    while( fscanf( f, "%c", &c ) > 0){
        if(c==32 || c==10)
           ;
        else
           buf[i++]=c;
    }
    buf[i]='\0';
    printf("%s\n",buf);
    return 0;
}

data

61 AF AF BF 26 00 66 A6 E6 4B E1 C8 68 20 21 38
AE FD 4C DF 40 39 08 32 82 29 B0 1D FE 17 ED 96
C3 66 6D 4A 12 1C E1 05 84 FF A8 85 C3 87 78 28
8D 43 10 F5 C7 BD 68 F6 11 08 68 DC FF 96 D8 C6
AC 7F 2F 1E 09 EF 80 33 ED 1D 91 CE D7 D8 92 41
58 9D 4F AA C4 9E 28 DD 53 BE E6 69 EC 08 86 3F
41 CB B5 48 1A 60 07 26 0B D5 1D E0 2F A4 B1 2E
23 EC 78 D8 F9 0C E9 FC 61 BD D8 B3 B4 09 CF 9A

Output

409CF9A

What I wanted to see was the whole data as a string stored in buf.

Mohit Jain
mrigendra

2 Answers

1

I assume that i overflows because more than 255 characters are neither newline nor space. (If i is 255, i++ will yield 0 on machines where a char has 8 bits.)

The string "ends" then at the first 0 character after that (or at end of file, whichever comes first) because you do not filter 0 characters out the way you skip newline and space when you copy characters to buf. (It "ends" there as far as printf's %s conversion is concerned, even though many characters may have been written past that position in the buffer.)
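To illustrate where `%s` stops, here is a small sketch; `printed_length` is a hypothetical helper, not something from the question. It reports how many of the bytes stored in a buffer would be printed before the first `'\0'` cuts the output short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: of the n bytes stored in buf, how many would
   printf("%s", buf) print? Printing stops at the first '\0'.
   If none of the n bytes is '\0', all n would be printed (and printf
   would keep reading past them, which is a bug in its own right). */
size_t printed_length(const unsigned char *buf, size_t n)
{
    const unsigned char *nul = memchr(buf, '\0', n);

    return nul ? (size_t)(nul - buf) : n;
}
```

For a buffer holding the bytes `'6' '1' 0 'A' 'F'`, the result is 2: everything after the embedded zero is invisible to `%s`.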

Make i an int or size_t, and do not copy characters that are '\0' (or better: follow Klas' advice from a comment on your post and use the portable and exhaustive isprint(), plus isspace(), if you are so inclined). Check for buf's bounds. Simply use getc() for char-by-char reads, not fscanf().
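Put together, the advice could look roughly like the following sketch. The function name `read_stripped` and its signature are assumptions for illustration; it takes an already opened `FILE *` so the caller decides which file to read and can check `fopen()` for failure.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copies every byte from f that is neither
   whitespace nor '\0' into buf, never writing more than cap bytes
   (cap must be at least 1), and terminates the result with '\0'.
   Returns the number of bytes stored, not counting the terminator. */
size_t read_stripped(FILE *f, char *buf, size_t cap)
{
    size_t i = 0;   /* wide enough for any buffer size */
    int c;          /* int, so that EOF can be told apart from a byte */

    while ((c = getc(f)) != EOF) {
        if (c == '\0' || isspace(c))
            continue;           /* skips ' ', '\n', '\r', '\t', ... */
        if (i + 1 >= cap)
            break;              /* keep one slot for the terminator */
        buf[i++] = (char)c;
    }
    buf[i] = '\0';
    return i;
}
```

A caller would open the file, test the result against `NULL`, and then call `read_stripped(f, buf, sizeof buf)`. Stopping at the bounds check truncates silently; returning an error instead would be an equally reasonable choice.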

As far as the output is concerned: printf("%s", ...) is not an appropriate way to display "binary" data (i.e. byte values below 32, and probably values above 127) because these byte sequences will be interpreted by your terminal (form feeds, backspaces overwriting earlier output, escape sequences changing the terminal settings, etc.). Either you filter those out, or you simply display hex data, the same way you showed us the original data.
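A hex display along those lines might be sketched as follows. `hex_dump` is a hypothetical helper, and the 16-bytes-per-line layout is only chosen to mirror the data shown in the question.

```c
#include <stdio.h>

/* Hypothetical helper: writes the n bytes in buf to out as two-digit
   uppercase hex values, 16 per line, separated by single spaces. */
void hex_dump(FILE *out, const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* newline after every 16th byte and after the last one */
        const char *sep = (i % 16 == 15 || i + 1 == n) ? "\n" : " ";

        fprintf(out, "%02X%s", (unsigned)buf[i], sep);
    }
}
```

Calling `hex_dump(stdout, buf, count)` with the number of bytes actually stored avoids relying on a terminating `'\0'`, so embedded zero bytes and control characters show up as plain numbers.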

Peter - Reinstate Monica
  • I mended the program. In the output I still see newline characters. – mrigendra Feb 23 '16 at 11:38
  • That could be carriage return (13, `\r`). Just use something like `printf("%02x ", (int)buf[i])` for printing hex numbers. I hope I got the format right for printing a leading zero for small values... – Peter - Reinstate Monica Feb 23 '16 at 11:46
  • What is the difference in newline and carriage return? – mrigendra Feb 23 '16 at 11:51
  • A carriage return should position the cursor at the beginning of a line, but not advance it. Terminals were modeled after teletypewriters, and those after typewriters. Advancing a line and returning to the beginning of a line were distinct operations. Because they often occur together they were elegantly combined through the chrome lever to the right (you may remember the Jerry Lewis typewriter sketch). When Unix was conceived on a [9kB machine](https://en.wikipedia.org/wiki/PDP-7), Thompson et al. decided that one could save valuable memory by omitting the `\r` before `\n`. – Peter - Reinstate Monica Feb 23 '16 at 12:06
  • That is probably the second most severe Unix design error, right after [naming the file create function `creat()`,](https://books.google.de/books?id=poFQAAAAMAAJ&q=%22spell+creat+with+an+e%22&dq=%22spell+creat+with+an+e%22) and probably for similar reasons ;-). – Peter - Reinstate Monica Feb 23 '16 at 12:18
1
  1. Here:

    unsigned char buf[512], i=0;
    

    i should be an int (or size_t), not an unsigned char.

  2. Do not copy '\0' characters; otherwise the printed string will always end at the first '\0' found.

Spikatrix
4pie0
  • As for point 2, the OP doesn't copy `\0`. He does it only after the loop as it is required. – Spikatrix Feb 23 '16 at 12:28
  • @CoolGuy I thought he did, and the short output seems to support that (unless it's just EOF, but the chances are 1/20 or so). I think `%c` reads a 0 *sans problème*, and it's not explicitly skipped either. (And the last `'\0'` is not a copy, btw.) – Peter - Reinstate Monica Feb 23 '16 at 12:30
  • @PeterA.Schneider Isn't the short output because of the overflow of `i`? – Spikatrix Feb 23 '16 at 12:33
  • @CoolGuy We can't know (because I couldn't find the output in the data, so some of the data is not revealed to us, unless I am missing it). But that EOF happens within the first 8 bytes of a 512 byte buffer is fairly unlikely, assuming an even probability for some range of data lengths. – Peter - Reinstate Monica Feb 23 '16 at 12:40