Changing C code to work line by line

Question

#include <mhash.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
 {
        int i;
        MHASH td;
        unsigned char buffer;
        unsigned char *hash;

        td = mhash_init(MHASH_WHIRLPOOL);

        if (td == MHASH_FAILED) exit(1);

        while (fread(&buffer, 1, 1, stdin) == 1) {
             mhash(td, &buffer, 1);
        }

        hash = mhash_end(td);

        for (i = 0; i < mhash_get_block_size(MHASH_WHIRLPOOL); i++) {
            printf("%.2x", hash[i]);
        }
        printf("\n");

        exit(0);
 }

Hi, I have the above code from the mhash example page. I need to change it so that it keeps reading from stdin and calculates the hash line by line, instead of waiting for EOF.

cat textfile | whirlpool_line_hash

My understanding is that I keep the while loop (which runs until EOF) and compute and print the hash whenever I receive a byte with the value 10 (0x0a). After printing, mhash needs to be reset, right? I am not fluent in C at all, but I need a fast program, so I want to do it in C. I already fail at comparing the pointer to an integer ;-) Can someone please help?

What about [this](http://cboard.cprogramming.com/c-programming/108383-using-fread-stdin.html)? It shows a technique for checking against \n. You could also add a link for mhash to make it easier for others to answer. – nayana, May 09 '16 at 10:23

score 1 · Answer 1 · answered May 09 '16 at 10:25

got it done ;-)

#include <mhash.h>  // mhash
#include <stdio.h>
#include <stdlib.h> // exit, free

int main(void)
{
        int i;
        MHASH td;
        unsigned char buffer;
        unsigned char *hash;

        td = mhash_init(MHASH_WHIRLPOOL);

        if (td == MHASH_FAILED) exit(1);

        while (fread(&buffer, 1, 1, stdin) == 1) { // read from stdin until EOF
                if (buffer != '\n') {
                        mhash(td, &buffer, 1); // do not hash the line break itself
                } else { // received a line break: finish this line's hash
                        hash = mhash_end(td);

                        for (i = 0; i < mhash_get_block_size(MHASH_WHIRLPOOL); i++) {
                                printf("%.2x", hash[i]);
                        }
                        printf("\n");
                        free(hash); // the digest buffer is dynamically allocated

                        td = mhash_init(MHASH_WHIRLPOOL); // fresh context for the next line
                        if (td == MHASH_FAILED) exit(1);
                }
        }

        exit(0);
}

score 1 · Accepted Answer · answered May 09 '16 at 10:30

It works, with one small caveat: the hash buffer returned by mhash_end is dynamically allocated, so you should free it when you are done with it:

free(hash);

You can use fgets, which reads at most one line per call. In terms of performance, reading one char at a time and feeding it to the hash is probably not the best approach. Instead, you can read an entire line and pass it to the hash function in a single update call. Try this:

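char line[4096];
size_t len;
while (fgets(line, sizeof line, stdin) != NULL) { // read at most one line each time
    len = strlen(line);                           // needs <string.h>
    if (len > 0 && line[len - 1] == '\n') {
        mhash(td, line, len - 1);  // strip the newline, then finish this line
        hash = mhash_end(td);
        for (i = 0; i < mhash_get_block_size(MHASH_WHIRLPOOL); i++) {
            printf("%.2x", hash[i]);
        }
        printf("\n");
        free(hash);
        td = mhash_init(MHASH_WHIRLPOOL);  // fresh context for the next line
    } else {
        // no newline yet (line longer than the buffer, or an unterminated
        // last line): keep accumulating into the current hash
        mhash(td, line, len);
    }
}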

answered May 09 '16 at 10:30

fluter


Hi, I did some tests today against an 18 million entry password list. The fgets variant takes twice as long as the fread variant, using md5 as the mhash hash. Strange... And stranger still, the original Perl implementation I was using is twice as fast as the faster fread-mhash combination. – Sebastian Heyn May 09 '16 at 18:48
@SebastianHeyn The `fread` version does not take the newline character into account while `fgets` does, so do you need to strip the newlines? If not, this could be changed to perform better. – fluter May 09 '16 at 23:44
The newline is used to separate each password from the next one, so I guess I need to take it into account. I have also tried making the C program read directly from a file instead of using cat to pipe the data. It is just as slow. – Sebastian Heyn May 10 '16 at 09:07
@SebastianHeyn If the newline does not need to be stripped, you can try to read in a block each time, like `while(fread(line, 1, sizeof line, stdin) > 0) { ... }`. I think this will be much faster; `fgets` is slow because it stops at each newline. – fluter May 10 '16 at 09:11
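
The block-read idea from the last comment can be combined with per-line digests by scanning each block for newlines with `memchr`. The following is only a sketch under stated assumptions: `line_hash`, `line_hash_init`, `line_hash_update` and `hash_lines` are hypothetical names, and 64-bit FNV-1a is used as a stand-in for mhash so that the snippet compiles without the library. It is not Whirlpool or MD5, and nothing here was measured for speed. With mhash, the three marked spots would presumably map onto `mhash_init`, `mhash` and `mhash_end`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the mhash context: 64-bit FNV-1a (an assumption made for
   illustration only, NOT a cryptographic hash). */
typedef struct { uint64_t state; } line_hash;

/* Reset the context; plays the role of mhash_init(). */
static void line_hash_init(line_hash *h)
{
    h->state = 0xcbf29ce484222325ULL; /* FNV-1a offset basis */
}

/* Feed n bytes into the context; plays the role of mhash(). */
static void line_hash_update(line_hash *h, const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        h->state ^= p[i];
        h->state *= 0x100000001b3ULL; /* FNV-1a prime */
    }
}

/* Read `in` in large blocks and write one hex digest per line to `out`.
   The newline itself is never hashed. Returns the number of digests. */
static size_t hash_lines(FILE *in, FILE *out)
{
    unsigned char block[65536];
    line_hash h;
    size_t got, count = 0;
    int pending = 0; /* bytes hashed since the last digest was printed? */

    line_hash_init(&h);
    while ((got = fread(block, 1, sizeof block, in)) > 0) {
        unsigned char *start = block;
        unsigned char *end = block + got;
        unsigned char *nl;

        /* Emit a digest for every complete line inside this block. */
        while ((nl = memchr(start, '\n', (size_t)(end - start))) != NULL) {
            line_hash_update(&h, start, (size_t)(nl - start));
            /* With mhash, this is where mhash_end() and free() would go. */
            fprintf(out, "%016llx\n", (unsigned long long)h.state);
            count++;
            line_hash_init(&h);
            pending = 0;
            start = nl + 1;
        }
        /* The tail of the block belongs to a line that continues later. */
        if (start < end) {
            line_hash_update(&h, start, (size_t)(end - start));
            pending = 1;
        }
    }
    /* Design choice: also emit a final line that has no trailing newline. */
    if (pending) {
        fprintf(out, "%016llx\n", (unsigned long long)h.state);
        count++;
    }
    return count;
}
```

Calling `hash_lines(stdin, stdout)` from `main` would give the same command-line behaviour as the programs above. A line that spans two blocks is handled because the context is only reset after a newline has been seen, never at a block boundary.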
