How would I remove the Arabic prefix "ال" from an Arabic string?

Question

I have tried things like this, but there is no change between the input and output:

def remove_al(text):
    if text.startswith('ال'):
        text.replace('ال','')
    return text

Does this answer your question? [Removing first x characters from string?](https://stackoverflow.com/questions/11806559/removing-first-x-characters-from-string) — armiro, Apr 15 '20 at 06:32

score 2 · Answer 1 · answered Apr 15 '20 at 05:51

text.replace returns the updated string but does not modify text itself, so the result has to be assigned back:

text = text.replace(...)

Note that in Python strings are "immutable"; there's no way to change even a single character of a string; you can only create a new string with the value you want.
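As a minimal sketch of that fix applied to the function from the question: the count argument of 1 is an addition here, not part of the answer above. It limits the replacement to the first match, which is the prefix because startswith has already confirmed it.

```python
def remove_al(text):
    # str.replace returns a new string, so the result must be kept.
    # count=1 (an addition in this sketch) replaces only the first match,
    # which is the prefix because startswith() has already confirmed it.
    if text.startswith('ال'):
        text = text.replace('ال', '', 1)
    return text


print(remove_al('الكتاب'))  # prefix removed
print(remove_al('كتاب'))    # no prefix, returned unchanged
```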

Ala Tarighati · Answer 2 · 2020-04-15T06:31:22.697

1

If you want to remove only the prefix ال, and not every occurrence of ال in the string, I would suggest:

def remove_prefix_al(text):
    if text.startswith('ال'):
        return text[2:]
    return text

Simply calling text.replace('ال','') removes every occurrence of ال, not just the prefix:

Example

text = 'الاستقلال'
text.replace('ال','')

Output:

'استقل'
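The two behaviours can be compared side by side. This is a sketch only: remove_prefix and its prefix parameter are names introduced here, generalising the slicing approach so that the slice length follows the prefix instead of being hard-coded as 2.

```python
def remove_prefix(text, prefix='ال'):
    # Slice off the prefix only when the string actually starts with it.
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


word = 'الاستقلال'
only_prefix = remove_prefix(word)        # leading ال removed, final ال kept
every_match = word.replace('ال', '')     # both occurrences of ال removed

print(only_prefix)
print(every_match)
```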

edited Apr 15 '20 at 06:31

answered Apr 15 '20 at 06:25

Ala Tarighati


score 0 · Answer 3 · answered Apr 15 '20 at 06:36

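I would recommend the method str.lstrip instead of rolling your own in this case, with one caveat: lstrip treats its argument as a set of characters rather than as a prefix, so it strips every leading ا or ل. It is only safe when the stem does not itself begin with one of those letters.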

Example text (alrashid) in Arabic: 'الرَشِيد'

text = 'الرَشِيد'
clean_text = text.lstrip('ال')
print(clean_text)

Note that even though Arabic reads from right to left, lstrip strips the start of the string (which is visually on the right).
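The character-set behaviour of lstrip matters for words whose stem begins with ل. The sketch below contrasts it with str.removeprefix, which is not mentioned in the answer and assumes Python 3.9 or later; it removes the exact substring at most once.

```python
word = 'اللغة'  # ال followed by a stem that itself starts with ل

# lstrip removes every leading character that is in the set {ا, ل},
# so it also consumes the first letter of the stem.
stripped = word.lstrip('ال')

# removeprefix (Python 3.9+) removes the exact prefix, at most once.
unprefixed = word.removeprefix('ال')

print(stripped)    # two characters left: the stem has lost its ل
print(unprefixed)  # three characters left: the full stem
```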

Also, as user 6502 noted, the issue in your code is that Python strings are immutable: replace returns a new string, and because that result was never assigned, the function returned its input unchanged.

score 0 · Answer 4 · answered May 23 '20 at 14:09

As a prefix, "ال" is complex enough in Arabic that you will need a regular expression to separate it accurately from the stem and from other prefixes. The following code isolates "ال" in most words:

import re

text = 'والشعر كالليل أسود'

words = text.split()

for word in words:
    alx = re.search(r'''^
                            ([وف])?
                            ([بك])?
                            (لل)?
                            (ال)?
                            (.*)$''', word, re.X)
    # Keep only the groups that actually matched.
    groups = [g for g in alx.groups() if g]
    print(word, groups)

Running that (in Jupyter) prints each word followed by the list of prefixes and the stem it was split into.
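For reuse, the same pattern can be wrapped in a helper. This is a sketch: split_prefixes and PREFIX_PATTERN are names introduced here, and the grammatical labels in the comments are one reading of what each group is meant to catch.

```python
import re

# Same pattern as in the loop above, compiled once in verbose mode.
PREFIX_PATTERN = re.compile(r'''^
    ([وف])?   # conjunction: و or ف
    ([بك])?   # preposition: ب or ك
    (لل)?     # ل followed by the article, written لل
    (ال)?     # the definite article itself
    (.*)$     # whatever remains (the stem)
''', re.X)


def split_prefixes(word):
    """Return the matched prefixes followed by the remainder of the word."""
    match = PREFIX_PATTERN.match(word)
    return [g for g in match.groups() if g]


for word in 'والشعر كالليل أسود'.split():
    print(word, split_prefixes(word))
# والشعر ['و', 'ال', 'شعر']
# كالليل ['ك', 'ال', 'ليل']
# أسود ['أسود']
```

Because every prefix group is optional, a word with none of these prefixes comes back as a single-element list holding the word itself.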
