1

Consider the following code that loads a dataset of records into a buffer and creates a Record object for each record. A record constitutes one or more columns and this information is uncovered at run-time. However, in this particular example, I have set the number of columns to 3.

typedef unsigned int uint;

typedef struct
{
        uint *data;

} Record;

Record *createNewRecord (short num_cols);

int main(int argc, char *argv[])
{
        time_t start_time, end_time;
        int num_cols = 3;
        char *relation;
        FILE *stream;
        int offset;

        char *filename = "file.txt";
        stream = fopen(filename, "r");
        fseek(stream, 0, SEEK_END);
        long fsize = ftell(stream);
        fseek(stream, 0, SEEK_SET);

        if(!(relation = (char*) malloc(sizeof(char) * (fsize + 1))))
        printf((char*)"Could not allocate buffer");

        fread(relation, sizeof(char), fsize, stream);
        relation[fsize] = '\0';
        fclose(stream);

        char *start_ptr = relation;
        char *end_ptr = (relation + fsize);

        while (start_ptr < end_ptr)
        {
                Record *new_record = createNewRecord(num_cols);

                for(short i = 0; i < num_cols; i++)
                {
                        sscanf(start_ptr, " %u %n",
                        &(new_record->data[i]), &offset);

                        start_ptr += offset;
                }
}

Record *createNewRecord (short num_cols)
{
        Record *r;

        if(!(r       = (Record *) malloc(sizeof(Record)))    ||
           !(r->data = (uint *) malloc(sizeof(uint) * num_cols)))
        {
                printf(("Failed to create new a record\n");
        }

        return r;
}

This code is highly inefficient. My dataset contains around 31 million records (~1 GB) and this code processes only ~200 records per minute. The reason I load the dataset into a buffer is because I'll later have multiple threads process the records in this buffer and hence I want to avoid files accesses. Moreover, I have a 48 GB RAM, so the dataset in memory should not be a problem. Any ideas on how can to speed things up??

SOLUTION: the sscanf function was actually extremely slow and inefficient.. When I switched to strtoul, the job finishes in less than a minute. Malloc-ing ~ 3 million structs of type Record took only few seconds.

NewToAndroid
  • 581
  • 7
  • 25
  • 1
    When you profiled it, what was the slowest section of code? – indiv Sep 24 '14 at 20:41
  • Reading the dataset took only a 1 or 2 seconds. The rest of the time was all spent in the while loop. Also, the process never finished, i always had to kill it. – NewToAndroid Sep 24 '14 at 20:44
  • Platform & Compiler? – IdeaHat Sep 24 '14 at 20:47
  • Also, where doesn new_record end up? How is it stored? – IdeaHat Sep 24 '14 at 20:49
  • 1
    Try testing if `sscanf` returned 1, that is it properly read an unsigned int. Oh, and remove blanks (spaces) from the format string - scanf function should automatically skip leading blanks before the number to be read and stop on blank after it, so you don't need to tell it where the blank chars are. – CiaPan Sep 24 '14 at 20:54
  • 1
    The code seems to lack one closing brace... And even after adding the missing brace, it would crash in case of the first `malloc` failure. – CiaPan Sep 24 '14 at 20:57
  • did you try mmap()? It can improve file read speed significantly. also if num_cols are constant try reading into(sscanf/read} an array of num_cols directly. – askmish Sep 24 '14 at 21:00

2 Answers2

1

Confident that a lurking non-numeric data exist in the file.

int offset;
...
sscanf(start_ptr, " %u %n", &(new_record->data[i]), &offset);
start_ptr += offset;

Notice that if the file begins with non-numeric input, offset is never set and if it had the value of 0, start_ptr += offset; would never increment.

If a non-numeric data exist later in the file like "3x", offset will get the value of 1, and cause the while loop to proceed slowly for it will never get an updated value.

Best to check results of fread(), ftell() and sscanf() for unexpected return values and act accordingly.

Further: long fsizemay be too small a size. Look to using fgetpos() and fsetpos().

Note: to save processing time, consider using strtoul() as it is certainly faster than sscanf(" %u %n"). Again - check for errant results.

BTW: If code needs to uses sscanf(), use sscanf("%u%n"), a tad faster and for your code and the same functionality.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
0

I'm not an optimization professional but I think some tips should help.

First of all, I suggest you use filename and num_cols as macros because they tend to be faster as literals when I don't see you changing their values in code.

Seond, using a struct for storing only one member is generally not recommended, but if you want to use it with functions you should only pass pointers. Since I see you're using malloc to store a struct and again for storing the only member then I suppose that is the reason why it is too slow. You're using twice the memory you need. This might not be the case with some compilers, however. Practically, using a struct with only one member is pointless. If you want to ensure that the integer you get (in your case) is specifically a record, you can typedef it.

You should also make end_pointer and fsize const for some optimization.

Now, as for functionality, have a look at memory mapping io.

Community
  • 1
  • 1
Amr Ayman
  • 1,129
  • 1
  • 8
  • 24
  • I see your point about malloc, but the malloc was certainly not the problem. Around 80 million malloc of the struct took no more than 2 to 3 seconds. – NewToAndroid Sep 30 '14 at 18:51