background:

Using fontTools, I want to change a character like " ل " (U+0644) to its initial form " ﻟ " (U+FEDF). I can do this in 4 steps:

  1. Using fontTools, save the font data as XML and then parse it

    font = TTFont(fontPath)
    font.saveXML("tempfont.xml")

  2. Find the glyph name associated with U+0644 in the cmap table (suppose the name is "isolam")

  3. In the GSUB table, find the lookup for the "init" feature, find the entry whose "in" attribute is "isolam", and read its "out" attribute (suppose it is "initlam")

  4. Finally, search for the name "initlam" in the cmap table and get its code point

This process is very slow. I think that is because the XML file is written to disk and then read back, and because there is a lot of iterating through the XML.

question:

Instead of saving an XML file, I am now trying to work with the TTFont object directly, but I have a problem reading code points from the cmap table.

from fontTools.ttLib import TTFont

font = TTFont(fontPath)
cmap = font['cmap'].tables

# There are 3 cmap subtables for different platforms in the font I am using;
# for now I'm using cmap[2], which has platformID = 3 (Windows).
print(cmap[2].data)

But the result looks like gibberish. It is very long, so I'll show just part of it:

b'\x00`\x00@\x00\x05\x00\x00!\x00+\x00/\x009\x00:\x00>\x00[\x00]\x00{\x00}\x00\xab\x00\xbb\

I expected it to return a dictionary with code points as keys and glyph names as values, or maybe a list of tuples.

So how can I access the cmap data in an understandable format?

Or how can I get the name of a glyph given its code point, and vice versa?

HKhoshdel
  • It's written in byte code. You can use str(cmap[2].data) to get readable text. – Alex K. Aug 27 '18 at 10:33
  • Unfortunately, that does not change anything; it just changes the type of the variable. Maybe you meant `chr(cmap[2].data)`, but that gives an error saying it needs an int but got bytes. If I write `for i in cmap[2].data: print(i)` it gets a little better and you can see some characters, but it still does not make any sense. – HKhoshdel Aug 27 '18 at 11:02

1 Answer

To get a mapping from each code point in the cmap table to its glyph name, you can do something like this:

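from fontTools.ttLib import TTFont

font = TTFont(fontPath)
ch_to_name = {}  # key: code point as an uppercase hex string, value: glyph name

cmap = font["cmap"]
# getBestCmap() returns a dict of {integer code point: glyph name}
for ch, name in cmap.getBestCmap().items():
    ch_to_name["{:04X}".format(ch)] = name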
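
For the "vice versa" part of the question and the "init" substitution itself, here is a rough sketch, not a definitive implementation. The helper names (`load_font`, `glyph_name_for`, `codepoints_for`, `single_substitutions`, `initial_form`) are made up for illustration. It assumes the font implements "init" with single-substitution lookups (GSUB lookup type 1, possibly wrapped in an extension lookup, type 7), and it merges the lookups of every script and language that registers the feature, which may be too coarse for some fonts.

```python
def load_font(font_path):
    """Open a font file. fontTools is imported here so the helpers below
    can also be used on plain dicts without it."""
    from fontTools.ttLib import TTFont
    return TTFont(font_path)


def glyph_name_for(cmap, codepoint):
    """Return the glyph name mapped to an integer code point, or None."""
    return cmap.get(codepoint)


def codepoints_for(cmap, glyph_name):
    """Return every code point mapped to glyph_name, sorted (may be empty)."""
    return sorted(cp for cp, name in cmap.items() if name == glyph_name)


def single_substitutions(font, feature_tag):
    """Collect {input glyph: output glyph} from the single-substitution
    lookups referenced by one GSUB feature tag (all scripts merged)."""
    mapping = {}
    if "GSUB" not in font:
        return mapping
    gsub = font["GSUB"].table
    if gsub.FeatureList is None or gsub.LookupList is None:
        return mapping
    for record in gsub.FeatureList.FeatureRecord:
        if record.FeatureTag != feature_tag:
            continue
        for index in record.Feature.LookupListIndex:
            lookup = gsub.LookupList.Lookup[index]
            for subtable in lookup.SubTable:
                lookup_type = lookup.LookupType
                if lookup_type == 7:
                    # Extension lookup: unwrap the real subtable.
                    lookup_type = subtable.ExtensionLookupType
                    subtable = subtable.ExtSubTable
                if lookup_type == 1:
                    # Single substitution: a plain name-to-name dict.
                    mapping.update(subtable.mapping)
    return mapping


def initial_form(font, codepoint):
    """Return (initial-form glyph name, its code points) for a code point.
    The name is None if the cmap or the 'init' feature has no entry."""
    cmap = font["cmap"].getBestCmap() or {}
    name = glyph_name_for(cmap, codepoint)
    init_name = single_substitutions(font, "init").get(name)
    return init_name, codepoints_for(cmap, init_name)
```

With a real font you might call `initial_form(load_font(fontPath), 0x0644)`. The list of code points can come back empty: many fonts leave positional forms unencoded, so a glyph like "initlam" may have no entry in the cmap at all.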
Jaymon