
I am trying to plot a histogram of the most frequent words in an Arabic text, but I can't figure out how to do it. All I get is the separated characters, not the joined word.

Here is an example of what I get:

[Image: bar plot whose Arabic labels are rendered as disconnected characters]

import seaborn as sns
import pandas as pd

res = {
 'الذكاء': 8,
 'الاصطناعي': 9,
 'هو': 2,
 'سلوك': 1,
 'وخصائص': 1,
 'معينة': 1,
 'تتسم': 1
}

df = pd.DataFrame(res.items(), columns=['word', 'count'])

sns.set(style="whitegrid")
ax = sns.barplot(x="count", y="word", data=df)

As the image above shows, the letters are drawn separately. I expect them to be joined into words, exactly as they are written in the dictionary.

saul
  • Did you try [this](https://stackoverflow.com/questions/15421746/matplotlib-writing-right-to-left-text-hebrew-arabic-etc) and [this (similar)](https://stackoverflow.com/questions/18772950/right-to-left-support-in-python-networkx-and-matplotlib)? – Sheldore May 24 '19 at 12:42
  • @Sheldore I have seen this answer before but couldn't find a way to apply it to my problem. It plots a full text, whereas in my case it's a histogram with Arabic labels – saul May 24 '19 at 12:45
  • Yes, but you can use the same approach to generate a list of compatible tick labels with the `bidi` package – Sheldore May 24 '19 at 12:49
  • I can't run your code because something weird happens with my cursor controls when I try to modify your dictionary `res`. Things start to type from right to left – Sheldore May 24 '19 at 12:51
  • @Sheldore I just tried building a new dictionary with the same keys passed through `arabic_reshaper.reshape`, but I still get the same output. – saul May 24 '19 at 13:00
  • @saul have you had a chance to test [my answer](https://stackoverflow.com/a/68221670/16343464)? – mozway Aug 04 '21 at 13:30

1 Answer


This seems to work with `arabic_reshaper` and `bidi`, as pointed out by @Sheldore: reshape each word so that its letters join, then reorder it for display.

import seaborn as sns
import pandas as pd
import arabic_reshaper
from bidi.algorithm import get_display

res = {
 'الذكاء': 8,
 'الاصطناعي': 9,
 'هو': 2,
 'سلوك': 1,
 'وخصائص': 1,
 'معينة': 1,
 'تتسم': 1
}

# Reshape each word so its letters join, then reorder it for display
res2 = {get_display(arabic_reshaper.reshape(k)): v for k, v in res.items()}

df = pd.DataFrame(res2.items(), columns=['word', 'count'])

sns.set(style="whitegrid")
ax = sns.barplot(x="count", y="word", data=df)

[Image: bar plot with Arabic labels]
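Following the tick-label idea from the comments, here is a minimal sketch that leaves the original words untouched and converts them only for display. It uses plain matplotlib instead of seaborn. `display_labels` and `plot_word_counts` are hypothetical helper names, not part of any library. The third-party imports are kept inside the plotting function so the helper can be reused on its own.

```python
def display_labels(words, shape, reorder):
    """Return display-ready labels: shape each word, then reorder it.

    `shape` and `reorder` are any str -> str callables; for Arabic they are
    assumed to be arabic_reshaper.reshape and bidi.algorithm.get_display.
    """
    return [reorder(shape(word)) for word in words]


def plot_word_counts(counts):
    """Horizontal bar plot of word counts with display-ready tick labels."""
    # Imported here so display_labels works without the plotting stack.
    import arabic_reshaper
    import matplotlib.pyplot as plt
    from bidi.algorithm import get_display

    words = list(counts)
    labels = display_labels(words, arabic_reshaper.reshape, get_display)
    positions = list(range(len(words)))

    fig, ax = plt.subplots()
    ax.barh(positions, [counts[word] for word in words])
    # Only the tick labels are converted; the data keeps the original words.
    ax.set_yticks(positions)
    ax.set_yticklabels(labels)
    ax.invert_yaxis()  # first word at the top
    ax.set_xlabel("count")
    return ax
```

Calling `plot_word_counts(res)` with the dictionary above should give a plot comparable to the seaborn one, while `res` itself stays usable for counting or searching.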

mozway
  • Don't forget to install the packages using `pip install arabic-reshaper python-bidi` – Amr Keleg Apr 05 '22 at 08:26
  • arabic-reshaper and python-bidi are hacks. python-bidi attempts to reorder text into visual rather than logical order, and arabic-reshaper swaps the Arabic codepoints for deprecated Arabic presentation forms. A more robust approach is to change the matplotlib backend to mplcairo, which uses raqm (which in turn uses fribidi (or SheenBidi) and harfbuzz) for complex text rendering and the bidirectional algorithm. – Andj Mar 06 '23 at 10:26
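
To illustrate the point about presentation forms, here is a small standard-library check. It assumes that reshaping the word 'هو' from the question yields HEH INITIAL FORM followed by WAW FINAL FORM. Those codepoints are typed by hand here, not produced by running arabic_reshaper, and `codepoint_names` is a hypothetical helper.

```python
import unicodedata

# Logical text: the ordinary Arabic letters HEH + WAW (the word from the question).
logical = "\u0647\u0648"
# Assumed reshaped text: presentation forms from the Arabic Presentation Forms-B block.
shaped = "\uFEEB\uFEEE"


def codepoint_names(text):
    """Return the Unicode character name of each codepoint in text."""
    return [unicodedata.name(ch) for ch in text]


print(codepoint_names(logical))
print(codepoint_names(shaped))

# The shaped string is a different sequence of codepoints, so it no longer
# compares equal to the original word.
print(shaped == logical)

# NFKC normalization maps the presentation forms back to the base letters.
restored = unicodedata.normalize("NFKC", shaped)
print(restored == logical)
```

So reshaped strings are best kept for display only. If the same labels are needed later for lookups or comparisons, either keep the original keys around or normalize the labels back with NFKC.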