Is it possible to get all the letters (lowercase, uppercase, etc.) of the alphabet of a given language (Turkish, Polish, Russian, etc.) as a Python list? Is there a module that does this?
2 Answers
Your question ties into a larger problem: how the alphabets of particular languages are stored and represented in a computer, and (eventually) how they can be retrieved in Python.
I suggest you read:
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
- Unicode & Character Encodings in Python: A Painless Guide
The short answer is "yes", but it depends on what you count as the alphabet of a language (e.g. some languages have their own punctuation characters; do you consider them part of the alphabet in your application?). What do you need it for? If it is for language detection, there is a duplicate question. As it stands, your question is generic; without details and (ideally) a code snippet, it will be difficult to answer it to your satisfaction.
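To make the "letters versus punctuation" distinction concrete, here is a minimal sketch using the standard `unicodedata` module. The function name `split_letters_and_others` and the sample string are made up for illustration, and treating every Unicode general category that starts with "L" as a letter is an assumption that may not match what your application calls an alphabet.

```python
import unicodedata


def split_letters_and_others(text):
    """Partition the distinct characters of text into letters and non-letters.

    Hypothetical helper for illustration only.
    """
    # Normalize first so that e.g. "g" + combining breve becomes the single
    # code point "ğ" instead of a letter followed by a combining mark.
    text = unicodedata.normalize("NFC", text)
    letters, others = set(), set()
    for ch in set(text):
        # General categories Lu, Ll, Lt, Lm and Lo all start with "L".
        if unicodedata.category(ch).startswith("L"):
            letters.add(ch)
        else:
            others.add(ch)
    return letters, others


letters, others = split_letters_and_others("Çağrı, İstanbul'a gitti!")
print(sorted(letters))
print(sorted(others))
```

This only tells you whether a character is a letter in some script, not which language it belongs to; that part still has to come from a per-language list.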

-
Hi, thanks for the reply. Yes, I would like to. Let me explain my problem. I have a string that has been marked as a certain language, such as Turkish. The string could contain characters other than Turkish ones. I want to keep all the Turkish words and remove the rest. So what I could do is get all the unique characters from the string, set aside the Turkish ones, and remove the remaining non-Turkish characters from the string. For that I require all possible words in Turkish. Thanks & Best Regards, Michael – Alain Michael Janith Schroter Apr 13 '20 at 07:21
-
This is a completely different question. I suggest you post it along with the code that you tried. – sophros Apr 13 '20 at 07:49
-
"All possible words in Turkish" is different from "all words that can be formed using the Turkish alphabet", besides both being a very large number of words. – devio Apr 13 '20 at 09:14
If I have understood your question, you want a list with all letters from an alphabet. A possible solution may be:
- get a string with the full alphabet you need
- use set() to turn the string into a collection of unique, unordered elements.
Then you can use the set for many operations, as explained in section 5.4 (Sets) of the tutorial on docs.python.org:
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a                                  # unique letters in a
{'a', 'r', 'b', 'c', 'd'}
>>> a - b                              # letters in a but not in b
{'r', 'd', 'b'}
>>> a | b                              # letters in a or b or both
{'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
>>> a & b                              # letters in both a and b
{'a', 'c'}
>>> a ^ b                              # letters in a or b but not both
{'r', 'd', 'b', 'm', 'z', 'l'}
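Applied to the Turkish case from the comments, the two steps above might look like the sketch below. The alphabet strings are typed out by hand (an assumption: the 29-letter modern alphabet, without circumflexed letters such as â), and `keep_alphabet_words` is a hypothetical helper name, not something from a library.

```python
import unicodedata

# Assumption: the 29-letter modern Turkish alphabet, typed out by hand.
TURKISH_LOWER = "abcçdefgğhıijklmnoöprsştuüvyz"
TURKISH_UPPER = "ABCÇDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ"
# Normalize to NFC so every letter is a single code point in the set.
TURKISH_LETTERS = set(unicodedata.normalize("NFC", TURKISH_LOWER + TURKISH_UPPER))


def keep_alphabet_words(text, alphabet=TURKISH_LETTERS):
    """Keep only whitespace-separated words whose letters are all in alphabet.

    Hypothetical helper for illustration only.
    """
    text = unicodedata.normalize("NFC", text)
    kept = []
    for word in text.split():
        # Ignore punctuation attached to the start or end of the word.
        core = word.strip(".,;:!?\"'()")
        # set(core) <= alphabet is a subset test: every character of the
        # word must be a member of the alphabet set.
        if core and set(core) <= alphabet:
            kept.append(word)
    return " ".join(kept)


print(keep_alphabet_words("Merhaba dünya, hello world! Привет güzel"))
```

Note that "hello" survives the filter because all of its letters also occur in the Turkish alphabet, while "world" is dropped only because of the "w". A character-set check can remove foreign scripts, but it cannot tell Turkish words from non-Turkish words written with the same letters; that would need a dictionary or a language-detection tool.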
