I want to write Persian (Farsi) text on an image in C++, preferably with OpenCV.

I tried cv::putText, which writes ????...? instead of the text. I also tested cv::addText, but it threw a "not implemented" exception. In addition, I checked the OpenCV docs here, but that route seems problematic: when I install OpenCV 4.8 via "VC package manager", there is no freetype class in it.

Finally, I found a solution using this link, but it draws the Persian string one character at a time (the characters are disconnected and in reverse order).

#include "opencv2/opencv.hpp"
#include "../UTL/Util.h"

#include "ft2build.h"
#include FT_FREETYPE_H
FT_Library  library;
FT_Face     face;

using namespace cv;
using namespace std;
//-----------------------------------------------------------------------
void my_draw_bitmap(Mat& img, FT_Bitmap* bitmap, int x, int y, Scalar color)
{
    Scalar src_col, dst_col;
    for (int i = 0; i < bitmap->rows; i++)
    {
        for (int j = 0; j < bitmap->width; j++)
        {
            unsigned char val = bitmap->buffer[j + i * bitmap->pitch];
            float mix = (float)val / 255.0;
            if (val != 0)
            {
                src_col = Scalar(img.at<Vec3b>(i + y, j + x));
                dst_col = mix * color + (1.0 - mix) * src_col;
                img.at<Vec3b>(i + y, j + x) = Vec3b(dst_col[0], dst_col[1], dst_col[2]);
            }
        }
    }
}
//-----------------------------------------------------------------------
float PrintString(Mat& img, std::wstring str, int x, int y, Scalar color)
{
    FT_Bool       use_kerning = 0;
    FT_UInt       previous = 0;
    use_kerning = FT_HAS_KERNING(face);
    float prev_yadv = 0;
    float posx = 0;
    float posy = 0;
    float dx = 0;
    for (int k = 0; k < str.length(); k++)
    {
        int glyph_index = FT_Get_Char_Index(face, str.c_str()[k]);
        FT_GlyphSlot  slot = face->glyph;  // a small shortcut 
        if (k > 0) { dx = slot->advance.x / 64; }
        FT_Load_Glyph(face, glyph_index, FT_LOAD_DEFAULT);
        FT_Render_Glyph(slot, FT_RENDER_MODE_NORMAL);
        prev_yadv = slot->metrics.vertAdvance / 64;
        if (use_kerning && previous && glyph_index)
        {
            FT_Vector  delta;
            FT_Get_Kerning(face, previous, glyph_index, FT_KERNING_DEFAULT, &delta);
            posx += (delta.x / 64);
        }
        posx += (dx);
        my_draw_bitmap(img, &slot->bitmap, posx + x + slot->bitmap_left, y - slot->bitmap_top + posy, color);
        previous = glyph_index;
    }
    return prev_yadv;
}
//-----------------------------------------------------------------------
void PrintText(Mat& img, std::wstring str, int x, int y, Scalar color)
{
    float posy = 0;
    for (int pos = str.find_first_of(L'\n'); pos != wstring::npos; pos = str.find_first_of(L'\n'))
    {
        std::wstring substr = str.substr(0, pos);
        str.erase(0, pos + 1);
        posy += PrintString(img, substr, x, y + posy, color);
    }
    PrintString(img, str, x, y + posy, color);
}
//-----------------------------------------------------------------------
int main(int argc, char* argv[])
{
    FT_Init_FreeType(&library);
    auto path_to_font_file = "arial.ttf";
    FT_New_Face(library, path_to_font_file, 0, &face);
    FT_Set_Pixel_Sizes(face, 36, 0);
    FT_Select_Charmap(face, FT_Encoding::FT_ENCODING_UNICODE);
    
    Mat im(100, 300, CV_8UC3, Scalar(0,0,0));
    wstring str = L"خلیج فارس";

    PrintText(im, str, 100, 50, Scalar(255, 255, 255));
    cv::imshow("win", im);
    cv::waitKey(0);
    return 0;
}

But the result is:

[image: bad result]

whereas it should be:

[image]

How can I fix the problem in the code above? Any other approach would also be appreciated.

Christoph Rackwitz
Babak.Abad
  • Freetype alone won't do this. See [this](https://stackoverflow.com/questions/49110006/arabic-joined-up-text-in-freetype) for an explanation why, and for a direction to go. – n. m. could be an AI Aug 19 '23 at 22:55

2 Answers

This improves on my previous answer. Since you appear to be on Windows, my strongest recommendation is to let the Win32 libraries draw the string to an in-memory bitmap, then copy the 32-bit ARGB pixel values of that bitmap onto your OpenCV image.

The sample below draws the text string to a Gdiplus::Bitmap object and saves it to a PNG file. Your code probably wouldn't save a PNG; instead it would use the Bitmap object to get at the raw bytes and blit them onto your image:

int main()
{   
    Gdiplus::GdiplusStartupInput gdiplusStartupInput = {};
    ULONG_PTR gdiplusToken = {};
    GdiplusStartup(&gdiplusToken, &gdiplusStartupInput, NULL);

    Gdiplus::Bitmap bmp(300, 100, PixelFormat32bppARGB);
    Gdiplus::Graphics graphics(&bmp);
    graphics.Clear(Gdiplus::Color(0, 0, 0, 0)); // transparent

    Gdiplus::SolidBrush brush(Gdiplus::Color(255, 255, 255, 255)); // white
    Gdiplus::PointF origin(0, 0);

    Gdiplus::Font font(L"Arial", 50);

    const wchar_t* pwsz = L"خلیج فارس";
    graphics.DrawString(pwsz, -1, &font, origin, &brush);

    GUID guidBmp = {};
    GetGdiplusEncoderClsid(L"image/png", &guidBmp);

    bmp.Save(L"D:/output.png", &guidBmp);

    return 0;
}

Here's what it produces. The black background is an extra layer I added in Paint.net; the real background is transparent. You can easily tweak the foreground and background colors in the code above:

enter image description here

I hardcoded the font size and the bitmap dimensions. In practice you would most likely measure the string first, for example by calling DrawText with the DT_CALCRECT flag (or, staying within GDI+, Graphics::MeasureString), so that the Bitmap is sized exactly to hold your string.
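For the transfer step itself, here is a minimal sketch of blending 32-bit ARGB pixels onto a 3-channel BGR image. It uses only raw pointers so that it depends on neither library. The assumptions are that `src` points at pixels obtained from something like Gdiplus::Bitmap::LockBits with PixelFormat32bppARGB (laid out in memory as B, G, R, A, not premultiplied, with a positive stride) and that `dst` is the data pointer of a CV_8UC3 cv::Mat. The function name blend_bgra_over_bgr and its parameters are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: alpha-blend a non-premultiplied BGRA image over a BGR
// image of the same width and height. Strides are in bytes per row.
void blend_bgra_over_bgr(const std::uint8_t* src, std::ptrdiff_t src_stride,
                         std::uint8_t* dst, std::ptrdiff_t dst_stride,
                         int width, int height)
{
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* s = src + y * src_stride;  // 4 bytes per pixel
        std::uint8_t* d = dst + y * dst_stride;        // 3 bytes per pixel
        for (int x = 0; x < width; ++x, s += 4, d += 3) {
            const unsigned a = s[3];                   // alpha, 0..255
            for (int c = 0; c < 3; ++c) {
                // Rounded integer form of: src * a/255 + dst * (1 - a/255)
                d[c] = static_cast<std::uint8_t>(
                    (s[c] * a + d[c] * (255u - a) + 127u) / 255u);
            }
        }
    }
}
```

With OpenCV the call would look roughly like blend_bgra_over_bgr(scan0, stride, im.data, im.step, im.cols, im.rows), where scan0 and stride stand for the Scan0 and Stride values that LockBits reports.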

The rest of the code consists of the required header files and a helper function that I reused from a previous answer.

#include <windows.h>
#include <gdiplus.h>
#include <iostream>
#include <vector>

HRESULT GetGdiplusEncoderClsid(const std::wstring& format, GUID* pGuid)
{
    HRESULT hr = S_OK;
    UINT  nEncoders = 0;          // number of image encoders
    UINT  nSize = 0;              // size of the image encoder array in bytes
    std::vector<BYTE> spData;
    Gdiplus::ImageCodecInfo* pImageCodecInfo = NULL;
    Gdiplus::Status status;
    bool found = false;

    if (format.empty() || !pGuid)
    {
        hr = E_INVALIDARG;
    }

    if (SUCCEEDED(hr))
    {
        *pGuid = GUID_NULL;
        status = Gdiplus::GetImageEncodersSize(&nEncoders, &nSize);

        if ((status != Gdiplus::Ok) || (nSize == 0))
        {
            hr = E_FAIL;
        }
    }

    if (SUCCEEDED(hr))
    {

        spData.resize(nSize);
        pImageCodecInfo = (Gdiplus::ImageCodecInfo*)&spData.front();
        status = Gdiplus::GetImageEncoders(nEncoders, nSize, pImageCodecInfo);

        if (status != Gdiplus::Ok)
        {
            hr = E_FAIL;
        }
    }

    if (SUCCEEDED(hr))
    {
        for (UINT j = 0; j < nEncoders && !found; j++)
        {
            if (pImageCodecInfo[j].MimeType == format)
            {
                *pGuid = pImageCodecInfo[j].Clsid;
                found = true;
            }
        }

        hr = found ? S_OK : E_FAIL;
    }

    return hr;
}
selbie
  • It works perfectly; however, you need to add this line before `main`: #pragma comment( lib, "Gdiplus.lib" ) – Babak.Abad Aug 20 '23 at 17:28

TL;DR: A right-to-left string is being printed left to right.

Persian is written right to left (RTL). Inspecting L"خلیج فارس" in the debugger shows:

str[0] = خ
str[1] = ل
str[2] = ی
str[3] = ج
str[4] = <space>
str[5] = ف
str[6] = ا
str[7] = ر
str[8] = س
str[9] = <nul>

The characters are stored in reading order with that خ character stored first, but intended to be rendered on the far right. Similarly, the س character is stored last and intended to be printed on the far left.
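If a debugger is not at hand, the same storage order can be checked by printing the code points. This is a small sketch, not part of the original program; dump_code_points is a hypothetical helper, and it takes a std::u32string rather than a std::wstring so that each element is one full code point on every platform.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format each code point of s, in storage order,
// as "U+XXXX" separated by single spaces.
std::string dump_code_points(const std::u32string& s)
{
    std::string out;
    char buf[16];
    for (char32_t c : s) {
        std::snprintf(buf, sizeof buf, "U+%04X", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

For the string above, the first entry is U+062E (خ) and the last is U+0633 (س), which matches the debugger listing: storage order is reading order, not visual order.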

If you look closely at the rendered characters, you can see they are essentially being printed LTR (left to right), when they should be printed RTL (right to left).

Your code draws the characters in storage order, increasing the x position at each step as if on an LTR canvas:

my_draw_bitmap(img, &slot->bitmap, posx + x + slot->bitmap_left, y - slot->bitmap_top + posy, color);

A quick hack would be to reverse the string before passing it to PrintString. I have no idea what the implications of that are for other font/kerning behavior in RTL languages, and it gets messy really quickly if a string contains both LTR and RTL characters.
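Another way to picture the same hack is to keep the string in storage order and move the pen leftward instead. The sketch below is only an illustration under simplifying assumptions: rtl_pen_positions is a hypothetical helper, the advances are assumed to be whole pixels (for example slot->advance.x / 64 per glyph), and kerning is ignored. Like plain reversal, it does nothing about joining the letters.

```cpp
#include <vector>

// Hypothetical helper: given each glyph's horizontal advance in pixels, in
// storage order, return the x coordinate of each glyph's left edge when the
// line is laid out right to left and ends at right_edge.
std::vector<int> rtl_pen_positions(const std::vector<int>& advances, int right_edge)
{
    std::vector<int> xs;
    xs.reserve(advances.size());
    int pen = right_edge;
    for (int adv : advances) {
        pen -= adv;         // step left by this glyph's advance first
        xs.push_back(pen);  // the glyph then occupies [pen, pen + adv)
    }
    return xs;
}
```

The first stored character ends up furthest to the right, which is where خ belongs in the example string.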

A smarter way would be to have PrintString scan the string for blocks of RTL characters (those in the UTF-16 range 0x0590 to 0x08FF) and print each such segment in reverse, possibly with kerning and padding applied on the left instead of the right. Entire libraries, such as HarfBuzz, exist to solve this problem, and it wouldn't surprise me if there are more nuances to rendering than what I've described.
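A rough sketch of that segment scan follows. It is a naive illustration, not the Unicode Bidirectional Algorithm, and it does no Arabic shaping. The helper names is_rtl and reverse_rtl_runs are hypothetical, the 0x0590 to 0x08FF range test is the simplification described above, and the rule that a space between two RTL characters stays inside the run is my own assumption so that word order flips along with letter order.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Assumption: anything in U+0590..U+08FF is treated as a right-to-left character.
bool is_rtl(char32_t c) { return c >= 0x0590 && c <= 0x08FF; }

// Reverse each maximal RTL run in place so that a left-to-right drawing loop
// lays the run out visually right to left. Spaces between two RTL characters
// are kept inside the run; leading and trailing spaces are left outside it.
std::u32string reverse_rtl_runs(std::u32string s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        if (!is_rtl(s[i])) { ++i; continue; }
        std::size_t end = i + 1;  // one past the last RTL character seen so far
        for (std::size_t j = end;
             j < s.size() && (is_rtl(s[j]) || s[j] == U' '); ++j) {
            if (is_rtl(s[j])) end = j + 1;
        }
        std::reverse(s.begin() + i, s.begin() + end);
        i = end;
    }
    return s;
}
```

Text outside the RTL range passes through unchanged, so a mixed string keeps its Latin parts in place while each Persian run is flipped as a unit.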

On Windows, you could probably use GDI+ to draw the string correctly to a Bitmap object with DrawString, then transfer that to OpenCV to render.

selbie
  • "it wouldn't surprise me if there are more nuances" Yes there are nuances with rendering the Arabic script. Which is another way to say that simply switching the order of characters won't work at all. – n. m. could be an AI Aug 19 '23 at 23:27