Questions tagged [transliteration]

Transliteration refers to the process of mapping letters or glyphs from one script or character set to another.

Transliteration is the conversion of letters from one alphabet to another, for example from Greek to Latin. It can also be a simplification within a single alphabet, such as omitting the diacritics found in that alphabet or replacing special characters with a sequence of characters without diacritics.
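
As a rough illustration of both senses described above, here is a minimal Python sketch. It assumes that diacritics can be dropped by decomposing each character (Unicode NFD) and discarding the combining marks, and it uses a tiny hand-written Greek-to-Latin table. The names `strip_diacritics`, `transliterate_greek`, and `GREEK_TO_LATIN` are hypothetical and exist only for this example; a real romanization scheme covers far more letters and rules.

```python
import unicodedata

# Hypothetical, deliberately incomplete table used only for this sketch.
GREEK_TO_LATIN = str.maketrans({
    "λ": "l",
    "ο": "o",
    "γ": "g",
    "σ": "s",
    "ς": "s",  # final sigma
})


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterate_greek(text: str) -> str:
    """Strip accents, then map the few Greek letters in the table to Latin."""
    return strip_diacritics(text).translate(GREEK_TO_LATIN)


print(strip_diacritics("Sjögren"))   # Sjogren
print(transliterate_greek("λόγος"))  # logos
```

Characters that have no canonical decomposition, such as ɲ or the dotless ı, pass through this approach unchanged and would need an explicit mapping table.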

257 questions
91
votes
12 answers

Remove diacritical marks (ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ) from Unicode chars

I am looking at an algorithm that can map between characters with diacritics (tilde, circumflex, caret, umlaut, caron) and their "simple" character. For example: ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ --> n á --> a ä --> a ấ --> a ṏ -->…
flybywire
  • 261,858
  • 191
  • 397
  • 503
55
votes
6 answers

Character Translation using Python (like the tr command)

Is there a way to do character translation / transliteration (kind of like the tr command) using Python? Some examples in Perl would be: my $string = "some fields"; $string =~ tr/dies/eaid/; print $string; # domi failed $string = 'the cat sat on…
hhafez
  • 38,949
  • 39
  • 113
  • 143
46
votes
11 answers

How do you map-replace characters in Javascript similar to the 'tr' function in Perl?

I've been trying to figure out how to map a set of characters in a string to another set similar to the tr function in Perl. I found this site that shows equivalent functions in JS and Perl, but sadly no tr equivalent. the tr (transliteration)…
qodeninja
  • 10,946
  • 30
  • 98
  • 152
36
votes
14 answers

Cyrillic transliteration in PHP

How to transliterate Cyrillic characters into Latin letters? E.g. Главная страница -> Glavnaja stranica This Transliteration PHP Extension would do this very well, but I can't install it on my server. It would be best to have the same…
Sfisioza
  • 3,830
  • 6
  • 42
  • 57
26
votes
5 answers

Romanization of Unicode text

I am looking for a way to transliterate Unicode letter characters from any language into accented Latin letters. The intent is to allow foreigners to gain insight into the pronunciation of names and words written in any non-Latin…
Anthony Faull
  • 17,549
  • 5
  • 55
  • 73
25
votes
10 answers

How to transliterate Cyrillic to Latin text

I have a method which turns any Latin text (e.g. English, French, German, Polish) into its slug form, e.g. Alpha Bravo Charlie => alpha-bravo-charlie But it can't work for Cyrillic text (e.g. Russian), so what I'm wanting to do is transliterate the…
ckknight
  • 5,953
  • 4
  • 26
  • 23
25
votes
2 answers

Convert accented characters into ascii character

What is the optimal way to remove German (or French) accents from a vector of 16 million string variables, e.g., 'Sjögren's syndrome' into 'Sjogren's syndrome'? Conversion of a single character into a single character is better than transliteration…
userJT
  • 11,486
  • 20
  • 77
  • 88
24
votes
6 answers

Transliterate any convertible utf8 char into ascii equivalent

Is there any good solution out there that does this transliteration in a good manner? I've tried using iconv(), but it is very annoying and it does not behave as one might expect. Using //TRANSLIT will try to replace what it can, leaving everything…
Ivan Hušnjak
  • 3,493
  • 3
  • 20
  • 30
23
votes
4 answers

use string.translate in Python to transliterate Cyrillic?

I'm getting UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-51: ordinal not in range(128) exception trying to use string.maketrans in Python. I'm kinda discouraged with this kind of error in following code (gist): # -*-…
Nemoden
  • 8,816
  • 6
  • 41
  • 65
22
votes
4 answers

Python and character normalization

Hello, I retrieve text-based UTF-8 data from a foreign source which contains special chars such as u"ıöüç", while I want to normalize them to English, such as "ıöüç" -> "iouc". What would be the best way to achieve this?
Hellnar
  • 62,315
  • 79
  • 204
  • 279
22
votes
8 answers

transliterating cyrillic to latin with javascript function

I made this function: function transliterate(word){ var answer = ""; A = new Array(); …
kyng
  • 457
  • 1
  • 6
  • 13
19
votes
10 answers

PHP Transliteration

Are there any solutions that will convert all foreign characters to A-z equivalents? I have searched extensively on Google and could not find a solution or even a list of characters and equivalents. The reason is I want to display A-z only URLs,…
user137621
16
votes
3 answers

icu4j cyrillic to latin

I'm trying to get Cyrillic words to be in Latin so I can have them in URLs. I use icu4j transliterator, but it still gives weird characters like this: Vilʹândimaa. It should be more like viljandimaa. When I copy that url these letters turn to %..…
ivar
  • 819
  • 4
  • 12
  • 19
16
votes
1 answer

Transliteration from Cyrillic to Latin ICU4j java

I need to do something rather simple but without hash mapping hard coding. I have a String s and it is in Cyrillic I need some sort of example on how to turn it into Latin characters using a custom filter of a sort (to give a purely Latin example as…
Boris Gaganelov
  • 327
  • 1
  • 3
  • 16
14
votes
2 answers

Perl 6: Backslashes in transliteration (tr///)

I noticed while experimenting with tr///, that it doesn't seem to translate backslashes, even when escaped. For example, say TR"\^/v"." given 'v^/\\'; say TR"\\^/v"." given 'v^/\\'; say TR"\ ^/v"." given 'v^/\\'; All of them output ...\ rather than…
Jo King
  • 590
  • 3
  • 17