I want to dump some information from a parsed ttf file into an XML file. There are several tables in ttf, e.g. cmap, head, hhea. I've defined the structures of those tables, for example:

class Font_Header{
public:
    FIXED   table_version_number;
    FIXED   font_revision;
    ULONG   checksum_adjustment;
// some other field...
    SHORT   index_to_loc_format;
    SHORT   glygh_data_format;
// some member functions...
};

Now I want to write a function named dump_info to dump out the memory layout of this structure.

void Font_Header::dump_info(FILE *fp, size_t indent){
    INDENT(fp, indent); fprintf(fp, "<head>\n");
    ++indent;
    INDENT(fp, indent); fprintf(fp, "<tableVersion value=\"0x%08x\"/>\n", table_version_number);
// some other lines...
    INDENT(fp, indent); fprintf(fp, "<glyphDataFormat value=\"%d\"/>\n", glygh_data_format);
    --indent;
    INDENT(fp, indent); fprintf(fp, "</head>\n");
}

My questions are:

  1. Is there a better solution to achieve this goal? I've written N lines to define the structure, and now I have to write another N lines for dump_info. This is not cool. What I'd like is something like:

    foreach field in fields
        dump(indent);
        dumpLn("<$1 value=\"$2\">", field.name, field.value);
        // Fields of different type are dumped in different format!
    end
    
  2. How can I handle indentation more elegantly? I defined the following macro

    #define INDENT(fp, indent) for(size_t i = 0; i < (indent); ++i) fprintf((fp), "\t")
    

and put this macro at the start of each line... I wonder if there is a more elegant way to do this.

lzl124631x

2 Answers

The following code is my attempt. I store the information about the class's fields in a Notation array. I assume ULL, i.e. unsigned long long, is the largest type (8 bytes), and always read 8 bytes from the address of each field. One thing needs special care: the format in each Notation entry must be correct. For example, if I want to print a short value (only 2 bytes are needed) but set the format to "%d" (which fetches 4 bytes), I will get a wrong answer.

Two problems are still haunting me:

  1. The method is not portable. Different platforms might give ULL a different size.
  2. Since I always read 8 bytes from memory, then, efficiency aside, this might read past the end of the object and cause a memory access violation.

Well, an update: I added a mask to Notation so that the right value is obtained even if the format is not properly assigned.

See the OS/2 table in ttf. It is composed of almost 40 fields. Now I've written 40 lines to define the table, 40 lines to read the table and 40 lines to dump its info! Oh, god. Maybe in the future I'll have to add another 40 lines, and another 40, and another 40... Kill me if I cannot automate this task.


#include <stdio.h>

class X{
public:
  char      a;
  short     b;
  int       c;
  double    d;
  X(char a, short b, int c, double d) : a(a), b(b), c(c), d(d) {}
};

typedef unsigned long long ULL;
#define FIELD(c, x)             (((c*)0)->x)
#define OFFSET(c, x)            ((size_t)&FIELD(c, x))
#define SIZE(c, x)              (sizeof(FIELD(c, x)))
#define MASK(c, x)              (((ULL)~0) >> ((sizeof(ULL) - SIZE(c, x)) << 3))
#define NOTATION(c, x, s)       { SIZE(c, x), OFFSET(c, x), #x, s, MASK(c, x) }
#define PTR(c, f)               ((void*)((size_t)&c + f->offset))
#define VALUE(c, f)             (f->mask & *(ULL*)PTR(c, f))
struct Notation{
  size_t        size;
  size_t        offset;
  const char    *name;
  const char    *format;
  ULL           mask;
};

Notation X_field[] = {
  NOTATION(X, a, "%c"),
  NOTATION(X, b, "%d"),  // The right 'format' of short should be %hd. I intentionally set it wrong.
  NOTATION(X, c, "%d"),
  NOTATION(X, d, "%lf")
};

#define PRINT(x, f, s) \
  printf("name: %s, size: %zu, ptr: %p, value: " s "\n", #f, sizeof(x.f), (void*)&x.f, x.f)

int main(){
  X x('z', 3, 2, 1.5);
  printf("--------------------MANUAL--------------------\n");
  PRINT(x, a, "%c");
  PRINT(x, b, "%hd");
  PRINT(x, c, "%d");
  PRINT(x, d, "%lf");
  printf("--------------------MASK--------------------\n");
  printf("0x%016hhx, %hhu\n", (char)~0, (char)~0);
  printf("0x%016hx, %hu\n", (short)~0, (short)~0);
  printf("0x%016x, %u\n", (int)~0, (int)~0);
  printf("0x%016llx, %llu\n", (ULL)~0, (ULL)~0);
  printf("--------------------FIELD--------------------\n");
  Notation *field = NULL;
  int i = 0;
  for(i = 0, field = X_field; i < 4; ++i, ++field){
    printf("size: %zu, offset: %zu, name: %s, format: %s, mask: 0x%016llx\n",
      field->size, field->offset, field->name, field->format, field->mask);
  }

  printf("--------------------AUTO--------------------\n");
  for(i = 0, field = X_field; i < 4; ++i, ++field){
    printf("name: %s, size: %zu, ptr: %p, value: ", field->name, field->size, PTR(x, field));
    printf(field->format, VALUE(x, field));
    printf("\n");
  }
  return 0;
}

Output:

--------------------MANUAL--------------------
name: a, size: 1, ptr: 0x22ac18, value: z
name: b, size: 2, ptr: 0x22ac1a, value: 3
name: c, size: 4, ptr: 0x22ac1c, value: 2
name: d, size: 8, ptr: 0x22ac20, value: 1.500000
--------------------MASK--------------------
0x00000000000000ff, 255
0x000000000000ffff, 65535
0x00000000ffffffff, 4294967295
0xffffffffffffffff, 18446744073709551615
--------------------FIELD--------------------
size: 1, offset: 0, name: a, format: %c, mask: 0x00000000000000ff
size: 2, offset: 2, name: b, format: %d, mask: 0x000000000000ffff
size: 4, offset: 4, name: c, format: %d, mask: 0x00000000ffffffff
size: 8, offset: 8, name: d, format: %lf, mask: 0xffffffffffffffff
--------------------AUTO--------------------
name: a, size: 1, ptr: 0x22ac18, value: z
name: b, size: 2, ptr: 0x22ac1a, value: 3
name: c, size: 4, ptr: 0x22ac1c, value: 2
name: d, size: 8, ptr: 0x22ac20, value: 1.500000
lzl124631x

If you're only doing a few cases, just do it by hand and live with the duplication.

If you're just doing this for a lot of simple structs, I'd consider a simple parser to convert the structs in given header files into source for dumping the structure.

If I wanted to be able to customize the XML output more (which seems likely), I'd generate both the struct and the struct printing function from some metadata file.

If you're not doing this for just simple structs, then I'd wrap my data portion in a simple struct and do as above.

For example, you could use something like this (I'd probably use XML or JSON for the metadata, as they can be easier to parse, depending on what language you want to write your generator in):

FontHeaderData.crazymeta

struct_name : Font_Header_data
dumper_name : Font_Header_data_to_xml
xml_root_node : head
FIXED table_version_number tableVersionNumber 0x%08x
...

Then your main file would look like this:

Font_Header.h

#include "Font_Header_data.h"
#include "Font_Header_data_to_xml.h"

class Font_Header {
    Font_Header_data data;
public:
    void dump_info(FILE *fp, size_t indent){
        // Pass the wrapped data to the generated dumper.
        Font_Header_data_to_xml(data, fp, indent);
    }
};

Where Font_Header_data.h, Font_Header_data_to_xml.h and Font_Header_data_to_xml.c were generated by your parser.

Remember to hook your generation of the files into your build process, with the correct dependencies, so they get rebuilt at the right times.

The "fun" part is writing the metadata to .c and .h converter.

In the past I've done something similar to keep large numbers of MySQL tables in sync with their C++ counterparts, and to generate the right C API INSERT/UPDATE commands to bridge between the two. While it was a lot of work to get right at the start, it certainly saved me many hours of pain later.

Michael Anderson