
I am currently reading a file and calculating the frequency of each 1-byte value (0 to 255). I want to do the same for 2-byte values (0 to 65535).

Simplified version of what I have:

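#include <stdio.h>

int main(void)
{
    enum { LENGTH = 256 };  /* must be at least 256: one counter per byte value */
    long long values[LENGTH] = {0};
    unsigned char buffer[LENGTH];
    size_t i, nread;

    FILE *fileptr = fopen("text.txt", "rb");
    if (fileptr == NULL) {
        perror("text.txt");
        return 1;
    }

    /* Count how often each byte value occurs. */
    while ((nread = fread(buffer, 1, sizeof buffer, fileptr)) > 0) {
        for (i = 0; i < nread; i++) {
            values[buffer[i]]++;
        }
    }

    fclose(fileptr);

    for (i = 0; i < LENGTH; i++) {
        printf("%zu: %lld\n", i, values[i]);
    }
    return 0;
}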

What I am getting now:

0: 21

1: 27

...

255: 19

What I want:

0: 4

1: 2

...

65535: 3
Roberto Caboni
TostaMista

1 Answer


To start with, a clarification: your current code does not count frequencies over the 2-byte range. An unsigned char is generally 1 byte (8 bits), and the results you are getting agree with that: 8 bits cover the values 0 to 2^8 - 1, that is 0 to 255.
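The arithmetic above can be checked directly. This is a minimal sketch, assuming 8-bit bytes; `distinct_values` and `largest_value` are hypothetical helper names used only for illustration.

```c
/* Hypothetical helpers illustrating the ranges discussed above.
 * Both assume bits < 32 so the shift stays within unsigned long. */

/* How many different values fit in the given number of bits. */
unsigned long distinct_values(unsigned bits)
{
    return 1UL << bits;
}

/* The largest unsigned value that fits in the given number of bits. */
unsigned long largest_value(unsigned bits)
{
    return distinct_values(bits) - 1UL;
}
```

Here `distinct_values(8)` is 256 and `distinct_values(16)` is 65536, which is why a histogram over 2-byte values needs 65536 counters instead of 256.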

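To get the frequencies over the 16-bit range, read the file two bytes at a time into a uint16_t (declared in <stdint.h>; u_int16_t is a non-standard BSD name that not every system provides). The code goes something like this:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* Open the file in binary mode. */
    FILE *fp = fopen("text.txt", "rb");
    if (fp == NULL) {
        perror("text.txt");
        return 1;
    }

    /* One counter for each possible 16-bit value. */
    long long *freq = calloc(65536, sizeof *freq);
    if (freq == NULL) {
        fclose(fp);
        return 1;
    }

    uint16_t value;  /* interpreted in the host's byte order */

    /* fread returns the number of complete 2-byte items read, so the loop
       stops at EOF, on error, or when only a single byte remains. */
    while (fread(&value, sizeof value, 1, fp) == 1) {
        freq[value]++;
    }

    for (int i = 0; i < 65536; i++) {
        printf("%d : %lld\n", i, freq[i]);
    }

    free(freq);
    fclose(fp);
    return 0;
}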
m0hithreddy
  • Thanks for the clarification, I will test it tomorrow. – TostaMista May 19 '20 at 23:15
  • Thanks, it's working. I had to change u_int16_t to __uint16_t. – TostaMista May 20 '20 at 09:41
  • But please be aware of the endianness of the machine. If you expect the bytes 0x00 0x01 in your file to mean 1, the above code gives 1 only on big-endian systems; a little-endian system interprets the same bytes as 256. In that case, convert each 16-bit value with ntohs() from arpa/inet.h (htonl() is the 32-bit variant), which turns big-endian data into host byte order. See https://stackoverflow.com/questions/12791864/c-program-to-check-little-vs-big-endian for more details. – m0hithreddy May 20 '20 at 10:32
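
One way to sidestep the host's byte order altogether is to read raw bytes and assemble each 16-bit value by hand. The sketch below assumes the file stores values big-endian (most significant byte first) and that the pairs do not overlap. `count_pairs_be` and its parameters are hypothetical names chosen for illustration, not part of the answer above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts non-overlapping 2-byte values in buf,
 * treating each pair as big-endian (the first byte is the high byte).
 * freq must point to 65536 counters; a trailing odd byte is ignored.
 * Returns the number of pairs counted. */
size_t count_pairs_be(const unsigned char *buf, size_t len, uint64_t *freq)
{
    size_t pairs = 0;
    for (size_t i = 0; i + 1 < len; i += 2) {
        /* Shift-and-or gives the same result on any host byte order. */
        uint16_t value = (uint16_t)(((unsigned)buf[i] << 8) | buf[i + 1]);
        freq[value]++;
        pairs++;
    }
    return pairs;
}
```

If the buffer is filled by fread with an even buffer size, a pair should only be cut short at the end of the file, since fread returns fewer bytes than requested only at EOF or on error. For little-endian data, swapping the roles of `buf[i]` and `buf[i + 1]` would be the corresponding change.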