I am following the documentation on apple.com.

I managed to read the 'cmap' encoding subtables. I am certain that platformID and platformSpecificID are correct, but the offset looks suspicious. Here is the data:

array(3) {
  [0]=>
  array(3) {
    ["platform_id"]=>
    int(0)
    ["specific_id"]=>
    int(3)
    ["offset"]=>
    int(532)
  }
  [1]=>
  array(3) {
    ["platform_id"]=>
    int(1)
    ["specific_id"]=>
    int(0)
    ["offset"]=>
    int(28)
  }
  [2]=>
  array(3) {
    ["platform_id"]=>
    int(3)
    ["specific_id"]=>
    int(1)
    ["offset"]=>
    int(532)
  }
}

The offset for two of the subtables is the same, 532. Can anyone explain this to me? And is this offset relative to the current position or to the beginning of the file?

Part 2

Ok. So I managed to get to the format tables using this:

private function parseCmapTable($table)
{
    $this->position         = $table['offset'];

    // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
    // General table information

    $data   = array
    (
        'version'           => $this->getUint16(),
        'number_subtables'  => $this->getUint16(),
    );

    $sub_tables = array();

    for($i = 0; $i < $data['number_subtables']; $i++)
    {

        // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
        // The 'cmap' encoding subtables

        $sub_tables[]   = array
        (
            'platform_id'       => $this->getUint16(),
            'specific_id'       => $this->getUint16(),
            'offset'            => $this->getUint32(),
        );

    }

    // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
    // The 'cmap' formats

    $formats                = array();

    foreach($sub_tables as $t)
    {
        // http://stackoverflow.com/questions/5322019/character-to-glyph-mapping-table/5322267#5322267

        $this->position = $table['offset'] + $t['offset'];

        $format = array
        (
            'format'                    => $this->getUint16(),
            'length'                    => $this->getUint16(),
            'language'                  => $this->getUint16(),
        );

        if($format['format'] == 4)
        {
            $format     += array
            (
                'seg_count_X2'                  => $this->getUint16(),
                'search_range'                  => $this->getUint16(),
                'entry_selector'                => $this->getUint16(),
                'range_shift'                   => $this->getUint16(),
                'end_code[segCount]'            => $this->getUint16(),
                'reserved_pad'                  => $this->getUint16(),
                'start_code[segCount]'          => $this->getUint16(),
                'id_delta[segCount]'            => $this->getUint16(),
                'id_range_offset[segCount]'     => $this->getUint16(),
                'glyph_index_array[variable]'   => $this->getUint16(),
            );

            $backup = $format;

            $format['seg_count_X2']     = $backup['seg_count_X2']*2;
            $format['search_range']     = 2 * (2 * floor(log($backup['seg_count_X2'], 2)));
            $format['entry_selector']   = log($backup['search_range']/2, 2);
            $format['range_shift']      = (2 * $backup['seg_count_X2']) - $backup['search_range'];
        }

        $formats[$t['offset']]  = $format;
    }

    die(var_dump( $sub_tables, $formats ));
}

The output:

array(3) {
  [0]=>
  array(3) {
    ["platform_id"]=>
    int(0)
    ["specific_id"]=>
    int(3)
    ["offset"]=>
    int(532)
  }
  [1]=>
  array(3) {
    ["platform_id"]=>
    int(1)
    ["specific_id"]=>
    int(0)
    ["offset"]=>
    int(28)
  }
  [2]=>
  array(3) {
    ["platform_id"]=>
    int(3)
    ["specific_id"]=>
    int(1)
    ["offset"]=>
    int(532)
  }
}
array(2) {
  [532]=>
  array(13) {
    ["format"]=>
    int(4)
    ["length"]=>
    int(658)
    ["language"]=>
    int(0)
    ["seg_count_X2"]=>
    int(192)
    ["search_range"]=>
    float(24)
    ["entry_selector"]=>
    float(5)
    ["range_shift"]=>
    int(128)
    ["end_code[segCount]"]=>
    int(48)
    ["reserved_pad"]=>
    int(58)
    ["start_code[segCount]"]=>
    int(64)
    ["id_delta[segCount]"]=>
    int(69)
    ["id_range_offset[segCount]"]=>
    int(70)
    ["glyph_index_array[variable]"]=>
    int(90)
  }
  [28]=>
  array(3) {
    ["format"]=>
    int(6)
    ["length"]=>
    int(504)
    ["language"]=>
    int(0)
  }
}

Now, how do I get from here to the Unicode code points of the characters? I tried reading the documentation, but it is too vague for a novice.

http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
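
For reference, the format 4 layout in the document linked above stores endCode, startCode, idDelta and idRangeOffset as arrays of segCount entries each (segCount being segCountX2 / 2) rather than as single values, and searchRange, entrySelector and rangeShift are already stored in the file. Below is a minimal sketch of reading those arrays and mapping a character code to a glyph index. It assumes the subtable bytes are already in a string; `parseFormat4` and `glyphForCode` are hypothetical names, and the sample bytes are made up for illustration:

```php
<?php
// Hypothetical helper: decode a format 4 subtable held in a binary string.
function parseFormat4(string $bytes): array
{
    // Every field is a big-endian uint16; re-index unpack()'s 1-based result from 0.
    $w = array_values(unpack('n*', $bytes));
    $w = array_slice($w, 0, intdiv($w[1], 2));   // $w[1] is the subtable length in bytes
    $segCount = intdiv($w[3], 2);                // $w[3] is segCountX2

    // Words 4..6 (searchRange, entrySelector, rangeShift) are stored in the
    // file and are not needed for a simple linear lookup.
    $pos = 7;
    $endCode = array_slice($w, $pos, $segCount);
    $pos += $segCount + 1;                       // +1 skips reservedPad
    $startCode = array_slice($w, $pos, $segCount);
    $pos += $segCount;
    $idDelta = array_slice($w, $pos, $segCount);
    $pos += $segCount;
    $idRangeOffset = array_slice($w, $pos, $segCount);
    $pos += $segCount;

    return array(
        'segCount'      => $segCount,
        'endCode'       => $endCode,
        'startCode'     => $startCode,
        'idDelta'       => $idDelta,
        'idRangeOffset' => $idRangeOffset,
        'glyphIds'      => array_slice($w, $pos), // whatever remains of the subtable
    );
}

// Hypothetical helper: map one character code to a glyph index (0 = missing glyph).
function glyphForCode(array $t, int $code): int
{
    for ($i = 0; $i < $t['segCount']; $i++) {
        if ($t['endCode'][$i] < $code) {
            continue;                            // segments are sorted by endCode
        }
        if ($t['startCode'][$i] > $code) {
            return 0;                            // code falls in a gap between segments
        }
        if ($t['idRangeOffset'][$i] === 0) {
            return ($code + $t['idDelta'][$i]) & 0xFFFF;
        }
        // idRangeOffset is a byte offset counted from its own slot in the
        // idRangeOffset array, so convert it to an index into glyphIds.
        $index = intdiv($t['idRangeOffset'][$i], 2)
               + ($code - $t['startCode'][$i])
               - ($t['segCount'] - $i);
        $glyph = $t['glyphIds'][$index] ?? 0;
        return ($glyph === 0) ? 0 : (($glyph + $t['idDelta'][$i]) & 0xFFFF);
    }
    return 0;
}

// Made-up subtable: one segment mapping 'A'..'C' (65..67) to glyphs 1..3,
// followed by the mandatory final 0xFFFF segment.
$sample = pack('n*', 4, 32, 0, 4, 4, 1, 0, 67, 0xFFFF, 0, 65, 0xFFFF, 65472, 1, 0, 0);
$table  = parseFormat4($sample);

foreach (array(65, 66, 67, 68) as $code) {
    printf("U+%04X -> glyph %d\n", $code, glyphForCode($table, $code));
}
```

In this sketch the character codes covered by the subtable are simply the ranges startCode[i] to endCode[i]; for the Unicode and Windows Unicode encodings those codes are Unicode values.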

Gajus

1 Answer

The offset is from the beginning of the 'cmap' table, not from the current position or the start of the file. What your data is saying is that the Mac subtable (platformId 1) starts at offset 28, while the Unicode (platformId 0) and Windows (platformId 3) mappings share a single subtable that starts at byte offset 532.
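
As a rough illustration of the point above, here is a minimal sketch that turns the encoding records into absolute file positions and groups the records that point at the same subtable, so a shared subtable is only parsed once. `resolveCmapSubtables` is a hypothetical name, and the 'cmap' table offset of 1000 is a made-up value; the real one comes from the font's table directory:

```php
<?php
// Hypothetical helper: group encoding records by the subtable they point to.
// $cmapOffset is the file offset of the 'cmap' table itself.
function resolveCmapSubtables(array $records, int $cmapOffset): array
{
    $byPosition = array();

    foreach ($records as $r) {
        // Subtable offsets are relative to the start of the 'cmap' table.
        $absolute = $cmapOffset + $r['offset'];
        $byPosition[$absolute][] = array($r['platform_id'], $r['specific_id']);
    }

    ksort($byPosition);
    return $byPosition;
}

// The three encoding records from the question.
$records = array(
    array('platform_id' => 0, 'specific_id' => 3, 'offset' => 532),
    array('platform_id' => 1, 'specific_id' => 0, 'offset' => 28),
    array('platform_id' => 3, 'specific_id' => 1, 'offset' => 532),
);

// 1000 is an assumed 'cmap' table offset, used only for illustration.
$resolved = resolveCmapSubtables($records, 1000);

foreach ($resolved as $position => $encodings) {
    printf("subtable at file offset %d is used by %d encoding(s)\n", $position, count($encodings));
}
```

Keying by position mirrors what the question's own `$formats[$t['offset']]` already does implicitly: two records with the same offset end up as one entry.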

Gabe
  • Thank you Gabe. You seem to know this stuff. Can you take a look at the part 2 of this question? – Gajus Mar 16 '11 at 12:29
  • @Guy: Rather than turning this question into a completely different one, please ask a second question and post links from each to the other. – Gabe Mar 16 '11 at 13:24