There are already questions about this:

I have many different fonts. Many of them are ASCII-only, and I need to check which fonts contain several accented characters (Latin, Unicode code points; the texts are encoded as UTF-8), such as: áäčďéěíĺľňóôöőŕřšťúůüűýž

I mainly have:

  • TrueType fonts (with the extension .ttf)
  • TrueType collections (extension .ttc)
  • OpenType fonts (.otf)

What is the usual (correct) way to do this with Perl? (It is the only language I know a bit, and the questions above are for C.) I am asking before I start installing every CPAN module that contains "font" :).

I'm on OS X (if that matters) and can install any MacPorts package if it helps.

cajwine
  • Have a look at my answer to [this question](http://stackoverflow.com/questions/15896493/how-can-one-find-the-unicode-codepoints-that-a-font-has-glyphs-for-on-a-debian/15905540#15905540). – nwellnhof Jun 03 '13 at 22:20
  • @nwellnhof Unfortunately, I'm unable to install `Font::FreeType` on OS X; compiling FreeType.xs throws an error. Fortunately, the `Font::TTF` module suggested by @mob installed cleanly. Thank you anyway, it's good to know that there is another solution. – cajwine Jun 08 '13 at 15:35
  • Not Perl, but for SEO's sake, this Python script works great: http://unix.stackexchange.com/a/268286/26952 – Skippy le Grand Gourou Feb 07 '17 at 13:27

1 Answer

For .ttf files, you can use Font::TTF and related modules:

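    use Font::TTF::Font;
    my $font = Font::TTF::Font->open( "C:/Windows/Fonts/ariali.ttf" )
        or die "Cannot open font file";
    # reverse() returns a list indexed by glyph ID, so drop glyphs that have no code point
    my @supported_codepoints = sort { $a <=> $b } grep { defined } $font->{cmap}->reverse;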

I'm getting out of my depth, but there's also a Font::TTF::Ttc module in the Font::TTF distribution that you could poke around in and see if you can extract more information about supported code points.

(Font::TTF suggestion came from here)

mob
  • `Font::TTF` is a nice module, but it requires some understanding of the TrueType and OpenType formats. BTW, `$font->{cmap}->ms_table` returns a hash that maps Unicode code points to glyph IDs. This hash should be easier to use than the array returned by `->reverse`. – nwellnhof Jun 04 '13 at 16:16
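
The `ms_table` suggestion above could be applied to the original question roughly as follows. This is a minimal sketch, not a tested tool: it assumes `Font::TTF` is installed, that `ms_table` returns a hash reference keyed by code point (as the comment describes), and that glyph ID 0 means "no glyph". The helper names `missing_chars` and `unicode_cmap` are made up for illustration. It only opens single-font files; `.ttc` collections would need `Font::TTF::Ttc`, which is not covered here, and `.otf` behaviour is not verified.

```perl
use strict;
use warnings;
use utf8;    # the accented characters below are literal UTF-8 in this source

# Characters from the question that a usable font must cover.
my $required = 'áäčďéěíĺľňóôöőŕřšťúůüűýž';

# Return the characters of $chars that have no glyph in the given cmap hash
# (code point => glyph ID). Glyph ID 0 is treated as "missing".
sub missing_chars {
    my ($cmap, $chars) = @_;
    return grep { !$cmap->{ ord($_) } } split //, $chars;
}

# Load the Unicode cmap of one font file via Font::TTF (assumed installed).
# Returns a hash reference, or undef if the font or its cmap cannot be read.
sub unicode_cmap {
    my ($path) = @_;
    require Font::TTF::Font;
    my $font = Font::TTF::Font->open($path) or return;
    my $cmap = $font->{cmap} or return;
    return $cmap->ms_table;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $path (@ARGV) {
    my $cmap = unicode_cmap($path);
    if (!$cmap) {
        print "$path: no readable Unicode cmap\n";
        next;
    }
    my @missing = missing_chars($cmap, $required);
    print "$path: ", (@missing ? "missing @missing" : "all characters present"), "\n";
}
```

Saved under a name of your choice (say, the hypothetical `check_fonts.pl`), it would be run as `perl check_fonts.pl *.ttf`. Keeping the lookup in `missing_chars` separate from the file loading means the check itself works on any plain hash of code points, whichever module produced it.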