Is it possible to convert a Cyrillic string to English (Latin) in C#? For example, I need to convert "Петролеум" to "Petroleum". Plus, I forgot to mention that if I have a Cyrillic string it needs to stay like that, so can I somehow check for that?
Hey, just found out something that might be important for you too. If you are transliterating official text (like addresses of advert clients or whatever), you need to check whether there is a special law for transliteration with a transliteration table included. Bulgaria, for instance, has such a law, and any misuse could lead to legal issues. Apart from the table, the law may describe exceptions to the rule that you need to follow too. For example, България is Bulgaria, not Balgariya. – vlood Sep 14 '10 at 09:26
10 Answers
I'm not familiar with Cyrillic, but if it's just a 1-to-1 mapping of Cyrillic characters to Latin characters that you're after, you can use a dictionary of character pairs and map each character individually:
var map = new Dictionary<char, string>
{
    { 'П', "P" },
    { 'е', "e" },
    { 'т', "t" },
    { 'р', "r" },
    ...
};
// Note: map[c] throws KeyNotFoundException for any character missing from the table.
var result = string.Concat("Петролеум".Select(c => map[c]));
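A minimal sketch of the same dictionary idea, assuming you want unmapped characters (Latin letters, digits, punctuation) passed through unchanged instead of throwing. The class name `SafeTransliterator` and the deliberately partial table are illustrative assumptions, not part of the answer above; the values are strings so that one Cyrillic letter can map to several Latin letters.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SafeTransliterator
{
    // Partial, illustrative table: only the letters needed for "Петролеум".
    // A real table would cover the whole alphabet of the target language.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { 'П', "P" }, { 'е', "e" }, { 'т', "t" }, { 'р', "r" },
        { 'о', "o" }, { 'л', "l" }, { 'у', "u" }, { 'м', "m" },
    };

    public static string Transliterate(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            // TryGetValue avoids the KeyNotFoundException that map[c] would throw.
            if (Map.TryGetValue(c, out string latin))
                sb.Append(latin);
            else
                sb.Append(c); // not in the table: keep the character as-is
        }
        return sb.ToString();
    }
}
```

With this table, `SafeTransliterator.Transliterate("Петролеум")` returns "Petroleum", and text that is already Latin comes back unchanged.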

I was trying to avoid that, but thanks :) I was wondering whether there is some cleaner way built into .NET or C#. – Pece Jun 20 '10 at 13:47
@Pece: I'm not aware of a built-in method that does this... BTW, if performance is a concern, use a char[] or StringBuilder instead of LINQ. – dtb Jun 20 '10 at 13:57
It is not a char-to-char mapping. You need multiple Latin characters for some Cyrillic characters. – PauliL Jun 20 '10 at 13:58
This method is very fast:
// Latin equivalents for U+0430..U+044F (lowercase а..я), indexed by offset from U+0430
static string[] CyrilicToLatinL =
    "a,b,v,g,d,e,zh,z,i,j,k,l,m,n,o,p,r,s,t,u,f,kh,c,ch,sh,sch,j,y,j,e,yu,ya".Split(',');
// Latin equivalents for U+0410..U+042F (uppercase А..Я), indexed by offset from U+0410
static string[] CyrilicToLatinU =
    "A,B,V,G,D,E,Zh,Z,I,J,K,L,M,N,O,P,R,S,T,U,F,Kh,C,Ch,Sh,Sch,J,Y,J,E,Yu,Ya".Split(',');
public static string CyrilicToLatin(string s)
{
    var sb = new StringBuilder((int)(s.Length * 1.5));
    foreach (char c in s)
    {
        if (c >= '\x430' && c <= '\x44f') sb.Append(CyrilicToLatinL[c - '\x430']);
        else if (c >= '\x410' && c <= '\x42f') sb.Append(CyrilicToLatinU[c - '\x410']);
        else if (c == '\x401') sb.Append("Yo"); // Ё
        else if (c == '\x451') sb.Append("yo"); // ё
        else sb.Append(c); // everything else is left unchanged
    }
    return sb.ToString();
}
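The question also asks how to check whether a string is Cyrillic at all. The same kind of range test can answer that. The sketch below assumes that "Cyrillic" means the basic Unicode Cyrillic block, U+0400 to U+04FF; the names `ScriptCheck`, `IsCyrillic` and `ContainsCyrillic` are hypothetical helpers, not a built-in .NET API.

```csharp
using System;

public static class ScriptCheck
{
    // Assumption: only the basic Cyrillic block (U+0400..U+04FF) counts.
    // Cyrillic Supplement and Extended blocks are ignored in this sketch.
    public static bool IsCyrillic(char c)
    {
        return c >= '\u0400' && c <= '\u04FF';
    }

    // True if at least one character of the string is in the Cyrillic block.
    public static bool ContainsCyrillic(string s)
    {
        foreach (char c in s)
        {
            if (IsCyrillic(c)) return true;
        }
        return false;
    }
}
```

A caller could then run the transliteration only when `ScriptCheck.ContainsCyrillic(text)` is true and leave other strings untouched.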

You can use the text.Replace(pair.Key, pair.Value) function.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
namespace Transliter
{
public partial class Form1 : Form
{
Dictionary<string, string> words = new Dictionary<string, string>();
public Form1()
{
InitializeComponent();
words.Add("а", "a");
words.Add("б", "b");
words.Add("в", "v");
words.Add("г", "g");
words.Add("д", "d");
words.Add("е", "e");
words.Add("ё", "yo");
words.Add("ж", "zh");
words.Add("з", "z");
words.Add("и", "i");
words.Add("й", "j");
words.Add("к", "k");
words.Add("л", "l");
words.Add("м", "m");
words.Add("н", "n");
words.Add("о", "o");
words.Add("п", "p");
words.Add("р", "r");
words.Add("с", "s");
words.Add("т", "t");
words.Add("у", "u");
words.Add("ф", "f");
words.Add("х", "h");
words.Add("ц", "c");
words.Add("ч", "ch");
words.Add("ш", "sh");
words.Add("щ", "sch");
words.Add("ъ", "j");
words.Add("ы", "i");
words.Add("ь", "j");
words.Add("э", "e");
words.Add("ю", "yu");
words.Add("я", "ya");
words.Add("А", "A");
words.Add("Б", "B");
words.Add("В", "V");
words.Add("Г", "G");
words.Add("Д", "D");
words.Add("Е", "E");
words.Add("Ё", "Yo");
words.Add("Ж", "Zh");
words.Add("З", "Z");
words.Add("И", "I");
words.Add("Й", "J");
words.Add("К", "K");
words.Add("Л", "L");
words.Add("М", "M");
words.Add("Н", "N");
words.Add("О", "O");
words.Add("П", "P");
words.Add("Р", "R");
words.Add("С", "S");
words.Add("Т", "T");
words.Add("У", "U");
words.Add("Ф", "F");
words.Add("Х", "H");
words.Add("Ц", "C");
words.Add("Ч", "Ch");
words.Add("Ш", "Sh");
words.Add("Щ", "Sch");
words.Add("Ъ", "J");
words.Add("Ы", "I");
words.Add("Ь", "J");
words.Add("Э", "E");
words.Add("Ю", "Yu");
words.Add("Я", "Ya");
}
private void button1_Click(object sender, EventArgs e)
{
string source = textBox1.Text;
foreach (KeyValuePair<string, string> pair in words)
{
source = source.Replace(pair.Key, pair.Value);
}
textBox2.Text = source;
}
}
}
To switch direction, swap the arguments to Replace.
Cyrillic to Latin:
source = source.Replace(pair.Key, pair.Value);
Latin to Cyrillic:
source = source.Replace(pair.Value, pair.Key);

What if some Latin characters also come in as input? I have something like "унесите град (city)" and it's been converted to "unesite grad (situ)". The "SITU" part is wrong... Or when I have company names like "AUIМРЕХ", it's been converted to "AUIMREH". How do I handle those cases? – sosNiLa Feb 28 '22 at 17:32
You can of course map the letters to their Latin transcription, but in most cases you won't get an English word out of it. E.g. Российская Федерация transcribes to Rossiyskaya Federatsiya. Wikipedia offers an overview of the mapping. You are probably looking for a translation service; Google probably offers an API for that.

If you're using Windows 7, you can take advantage of the new ELS (Extended Linguistic Services) API, which provides transliteration functionality for you.
Have a look at the Windows 7 API Code Pack - it's a set of managed wrappers on top of many new APIs in Windows 7 (such as the new Taskbar). Look in the Samples folder for the Transliterator example; you'll find it's exactly what you're looking for.

http://code.google.com/apis/ajaxlanguage/documentation/#Transliteration
Google offers this AJAX-based transliteration service. This way you can avoid computing transliterations yourself and let Google do them on the fly. It means letting the client side make the request to Google, so your app would need some kind of web-based output for this solution to work.

Use a Dictionary with Russian and English words as a lookup table. It'll be a lot of typing to build it, but it's foolproof.

Not really. If Google can't produce a foolproof dictionary, he can't either. – Femaref Jun 20 '10 at 14:32
Why do you want to do this? Changing characters one-for-one generally doesn't even produce a reasonable transliteration, much less a translation. You may find this post to be of interest.

Are you searching for a way of transliterating Russian words written in Cyrillic (in some encoding, possibly even one from the "Latin" family, since ISO 8859-5 is the Latin/Cyrillic part of ISO 8859) into the Latin alphabet (with accents)?
I don't know if .NET has something to transliterate, but I dare say it (like many other good frameworks) hasn't. This Wikipedia link could give you some ideas for implementing transliteration, but it is not the only way. Remember that the Cyrillic writing system is not used by Russian only, and the way you apply transliteration may vary with the language that uses the writing system. E.g. see the same for Bulgarian. This link (also from Wikipedia) may be interesting too if you want to program the transliterator yourself.

This is a solution for Serbian Cyrillic-Latin transliteration, for a Windows Forms form with two text boxes (textBox1, textBox2) and two buttons (button1, button2):
namespace WindowsFormsApplication1
{
public partial class Form1 : Form
{
Dictionary<string, string> slova = new Dictionary<string, string>();
public Form1()
{
InitializeComponent();
slova.Add("Љ", "Lj");
slova.Add("Њ", "Nj");
slova.Add("Џ", "Dž");
slova.Add("љ", "lj");
slova.Add("њ", "nj");
slova.Add("џ", "dž");
slova.Add("а", "a");
slova.Add("б", "b");
slova.Add("в", "v");
slova.Add("г", "g");
slova.Add("д", "d");
slova.Add("ђ", "đ");
slova.Add("е", "e");
slova.Add("ж", "ž");
slova.Add("з", "z");
slova.Add("и", "i");
slova.Add("ј", "j");
slova.Add("к", "k");
slova.Add("л", "l");
slova.Add("м", "m");
slova.Add("н", "n");
slova.Add("о", "o");
slova.Add("п", "p");
slova.Add("р", "r");
slova.Add("с", "s");
slova.Add("т", "t");
slova.Add("ћ", "ć");
slova.Add("у", "u");
slova.Add("ф", "f");
slova.Add("х", "h");
slova.Add("ц", "c");
slova.Add("ч", "č");
slova.Add("ш", "š");
}
// Method for cyrillic to latin
private void button1_Click(object sender, EventArgs e)
{
string source = textBox1.Text;
foreach (KeyValuePair<string, string> pair in slova)
{
source = source.Replace(pair.Key, pair.Value);
// For upper case
source = source.Replace(pair.Key.ToUpper(),
pair.Value.ToUpper());
}
textBox2.Text = source;
}
// Method for latin to cyrillic
private void button2_Click(object sender, EventArgs e)
{
string source = textBox2.Text;
foreach (KeyValuePair<string, string> pair in slova)
{
source = source.Replace(pair.Value, pair.Key);
// For upper case
source = source.Replace(pair.Value.ToUpper(),
pair.Key.ToUpper());
}
textBox1.Text = source;
}
}
}

If "lj", "nj" and "dž" are not at the beginning of the dictionary, they would be translated as "лј", "нј" and "дж" instead of "љ", "њ" and "џ". Also, the dictionary should have upper case "Љ", "Њ" and "Џ", because without them they would be translated as "LJ", "NJ" and "DŽ" instead of "Lj", "Nj" and "Dž". Other upper case chars can be handled with the ToUpper() method. – Bojan Jovanovic Feb 22 '17 at 21:30
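One way to make the digraph ordering explicit, instead of relying on the order in which entries were added to the dictionary, is to scan the input once and always try the longest key first. The following is only a sketch under that assumption: the class name `SerbianLatinToCyrillic` is hypothetical and the table is partial, covering just enough letters to show the idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class SerbianLatinToCyrillic
{
    // Partial, illustrative table. Digraphs and single letters are mixed on
    // purpose; the ordering is handled by KeysLongestFirst below.
    private static readonly Dictionary<string, string> Map = new Dictionary<string, string>
    {
        { "l", "л" }, { "j", "ј" }, { "n", "н" }, { "d", "д" }, { "ž", "ж" },
        { "u", "у" }, { "b", "б" }, { "a", "а" }, { "v", "в" },
        { "lj", "љ" }, { "nj", "њ" }, { "dž", "џ" },
        { "Lj", "Љ" }, { "Nj", "Њ" }, { "Dž", "Џ" },
    };

    // Keys sorted so that two-letter keys are tried before one-letter keys.
    private static readonly string[] KeysLongestFirst =
        Map.Keys.OrderByDescending(k => k.Length).ToArray();

    public static string Convert(string input)
    {
        var sb = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            string match = null;
            foreach (string key in KeysLongestFirst)
            {
                // Ordinal comparison of the key against the input at position i.
                if (i + key.Length <= input.Length &&
                    string.CompareOrdinal(input, i, key, 0, key.Length) == 0)
                {
                    match = key;
                    break;
                }
            }
            if (match != null)
            {
                sb.Append(Map[match]);
                i += match.Length;
            }
            else
            {
                sb.Append(input[i]); // unknown character: copy unchanged
                i++;
            }
        }
        return sb.ToString();
    }
}
```

Because every position is consumed exactly once, output that has already been converted is never fed back into a later Replace call, which is the other pitfall of chaining Replace over the whole string.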