I'm working on a project for a class and I could use some guidance. I need to parse a character array into constituent parts - the specifications of which I am given - but I am unsure how to do so in C.
I have been given a file and each page of the file is read into a buffer as a character array like so:
typedef struct page_t {
    char reserved[PAGESIZE];
} page_t;
I have been given the following specifications about the pages read:
- Each page starts with a 2-byte gap offset, followed by key-value records, a gap at the indicated offset, and lastly an 8-byte address at the end pointing to the next page
- The key-value records have the following form: an 8-byte unsigned integer key followed by a value, where the first 4 bytes of the value are an unsigned integer indicating the length of the string part, followed by a string of that length (so the total length of the value portion is length+4)
- There can be multiple key-value records in the file but the sum of all key-value records will not exceed 4086 bytes and the gap is always at the end of the file prior to the address of the next page
Since I have not been given any more explanation about the format of the page read in, and I need to parse the char array, I was wondering if I could use the strtoul function to read 8 bytes of the array at a time to find the correct key (and to skip over a key's value if it is not the key I am trying to match). I asked my TA about it and the answer I got was:
You can use functions that convert character (byte) arrays to numbers. Consider making a toy example program that converts a structure to a character array and back to see if scan/atoi/strtoll... have the expected behavior. If the functions do not work you can also consider reading iteratively. You may find them useful to extract the key/value size. The value as a string should work!
So I tried making a short program that converted a struct to an array and back and tried using strtoul on the string, but I'm not sure that I'm doing it correctly.
So my tester program looks like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
typedef struct record_test {
    uint64_t key;
    uint32_t val_size;
    char value[255];
} record_test;
int main( int argc, char ** argv ) {
    record_test record = {1234, 13, "asdfghjklqwer"};
    char page[4096];

    // print what is in record
    printf("Here's the record itself:\n");
    printf("key: %llu\n", record.key);
    printf("val_size: %u\n", record.val_size);
    printf("record: %s\n", record.value);

    memcpy(page, &record, sizeof(record_test));

    // print what is in page
    printf("Here's what's in the page:\n");
    printf("page: %s\n", page);

    // check page contents with pointer
    record_test* revert;
    revert = (record_test*)page;
    printf("Here's the reverted record using pointers:\n");
    printf("key: %llu\n", revert->key);
    printf("val_size: %u\n", revert->val_size);
    printf("record: %s\n", revert->value);

    // reading what is in page using strtoul
    char* endKey;
    char* value;
    printf("reading using strtoul:\n");
    printf("key: %lu\n", strtoul(page, &endKey, 8));
    printf("val size: %d\n", (int)strtoul(endKey, &value, 4));
    printf("value: %s\n", value);
}
And these are the results I'm getting from it when I use printf to follow it:
Here's the record itself:
key: 1234
val_size: 13
record: asdfghjklqwer
Here's what's in the page:
page: ?
Here's the reverted record using pointers:
key: 1234
val_size: 13
record: asdfghjklqwer
reading using strtoul:
key: 0
val size: 0
value: ?
So based on the pointer that I used to recast the struct, the character array does have the right information in it, but for whatever reason the character array itself shows ? when I try to print it, and similarly the printf statements showing what strtoul reads show 0 for the integers. I'm not sure what's going on here: why am I getting ? when that character isn't even in the value string? Can someone tell me where I am going wrong, or whether I can even use this function at all? Should I be trying to iterate through the character array using bitwise operations to read it instead?
Any help would be great! Thank you!