I am new to programming. This is my code:

  public string ThanglishToTamilList(char[] characters, int length) {
        var dict1 = new Dictionary<string, string>();

        dict1.Add("a", "\u0B85"); // அ
        dict1.Add("aa", "\u0B86"); // ஆ
        dict1.Add("A", "\u0B86"); // ஆ
        dict1.Add("i", "\u0B87"); // இ
        dict1.Add("ee", "\u0B88"); // ஈ
        dict1.Add("I", "\u0B88"); // ஈ
        dict1.Add("u", "\u0B89"); // உ
        ...



        List<String> list = new List<String>();
        string[] array;
        var valueOfDictOne = "";

        for (int i = 0; i < length; i++)
        {                
            try
            {
                valueOfDictOne = dict1[characters[i].ToString()];
                list.Add(valueOfDictOne);

            }
            catch
            {
                list.Add(characters[i].ToString());
            }
        }

        array = list.ToArray();
        string result = string.Join("", array);
        return result;
    }

Function parameter details:

char[] characters : array of characters (textbox.text.ToCharArray())

int length : length of the array (the number of characters typed in the text box)

My expected output should be:

If the user types a -> Output should be அ.

Likewise:

a -> அ

aa -> ஆ

A -> ஆ ...

Note that aa and A both represent ஆ.

My problem: this code only replaces one character at a time. A single a -> அ works fine.

But if we type aa the output is அஅ

aa -> அஅ

But I need the correct output as

aa -> ஆ

I added some lines of code to handle this, but it did not work:

        ...
        for (int i = 0; i < length; i++)
        {                
            try
            {

                if (String.Equals(characters[i], "a") && !(String.Equals(characters[i], "aa")))
                {

                    //MessageBox.Show("a");

                    valueOfDictOne = dict1[characters[i].ToString()];
                    list.Add(valueOfDictOne);
                }
                else if (String.Equals(characters[i], "aa"))
                {
                    //MessageBox.Show("aa");

                    valueOfDictOne = dict1[characters[i].ToString()];
                    list.Add(valueOfDictOne);
                }

            }
            catch
            {
                list.Add(characters[i].ToString());
            }
        }

...

Please help me to correct this code or please provide any easy alternative ways to transliterate.

Thank you.

  • Do you have this whole thing being invoked by a keydown/keypress event? If so - it's probably just calling your function for 'a' twice... – Dave Bish May 13 '13 at 07:57
  • I think `String.Equals(character[i], "aa")` will always be false, since one character will never be equal to two characters – Jarek May 13 '13 at 08:18
  • 1. Though you have added some lines of code, but the second code snippet is exactly the same as that of the first one. 2. The code you have specified here has no problems. Please check the parameters that you are passing to this method. – neo May 13 '13 at 08:10
  • Thank you, Pako. You are correct. – Sutharshan Suthan May 13 '13 at 12:53

2 Answers

You can use a simple parser/lexer to tokenize the input string. Your ThanglishToTamilList function would then look something like this:

...
// StringReader needs a string, so build one from the char[] first.
TextReader r = new StringReader(new string(characters, 0, length));
Lexer l = new Lexer(r, defs);
while (l.Next())
{
  list.Add(dict1[l.TokenContents]);
}
...

You can find an example of a simple parser/lexer here: Poor man's "lexer" for C#

It is probably overkill for your problem, but it should get the job done.
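
If pulling in a lexer class feels too heavy, the same tokenizing idea can be approximated with a regular expression. The following is only a minimal sketch, not the linked lexer: `RegexTransliterator`, `Map`, and `Tokens` are hypothetical names, and the table holds just the entries shown in the question (the rest of the mapping is elided there). It assumes that listing longer keys first in the alternation is the tie-break you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexTransliterator
{
    // Hypothetical sample table: only the entries visible in the question.
    private static readonly Dictionary<string, string> Map = new Dictionary<string, string>
    {
        { "a", "\u0B85" },  // அ
        { "aa", "\u0B86" }, // ஆ
        { "A", "\u0B86" },  // ஆ
        { "i", "\u0B87" },  // இ
        { "ee", "\u0B88" }, // ஈ
        { "I", "\u0B88" },  // ஈ
        { "u", "\u0B89" },  // உ
    };

    // Longer keys are listed first so that "aa" is tried before "a"
    // at the same position. Matching is case-sensitive by default.
    private static readonly Regex Tokens = new Regex(
        string.Join("|", Map.Keys
            .OrderByDescending(k => k.Length)
            .Select(k => Regex.Escape(k))));

    public static string Transliterate(string input)
    {
        // Text that matches no key is left unchanged by Regex.Replace.
        return Tokens.Replace(input, m => Map[m.Value]);
    }
}
```

With this table, `RegexTransliterator.Transliterate("aa")` should give ஆ rather than அஅ, and characters without an entry pass through as typed.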

Dietz

I think you should change your approach completely to solve this problem efficiently. Looking at one character at a time gives you invalid results, because some sequences (such as aa) start with a shorter sequence (a) that is itself a valid dictionary key.

What I think you should do is keep appending characters to a temporary string for as long as the result is still a valid dictionary key. When appending the next character would produce a string that is not in the dictionary, substitute the temporary string and start a new one.

Of course, this solution is not perfect. If we have the string aaa, how should it be processed? The solution assumes taking the longest matching expression first (the first option below), but that is not necessarily the right choice.

  • aa + a?
  • a + aa?
  • a + a + a?

You will need to decide this at the business level.

Example pseudo-code below:

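string tempString = String.Empty;
foreach (char character in characters)
{
    // Flush the pending sequence once adding this character would no longer match a key.
    if (tempString.Length > 0 && !substitutionDict.ContainsKey(tempString + character))
    {
        // Assumes tempString itself is a key; pass unknown characters through unchanged otherwise.
        makeSubstitution(tempString, substitutionDict[tempString]);
        tempString = String.Empty;
    }
    tempString += character;
}
// Flush whatever is still pending after the last character.
if (tempString.Length > 0)
{
    makeSubstitution(tempString, substitutionDict[tempString]);
}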
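
For reference, here is one way the longest-match idea could be written as a complete function. This is a sketch under assumptions rather than a definitive implementation: `GreedyTransliterator` and its parameter names are made up for illustration, and it resolves the aaa question by always preferring the longest key at the current position (aa + a). It tries candidate substrings from longest to shortest instead of growing a temporary string, so it does not rely on every prefix of a key also being a key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class GreedyTransliterator
{
    // map: the substitution dictionary, e.g. "a" -> அ, "aa" -> ஆ (hypothetical parameter name).
    public static string Transliterate(string input, IDictionary<string, string> map)
    {
        int maxKeyLength = map.Count == 0 ? 0 : map.Keys.Max(k => k.Length);
        var output = new StringBuilder();
        int i = 0;

        while (i < input.Length)
        {
            bool matched = false;

            // Try the longest possible candidate first, then shrink.
            for (int len = Math.Min(maxKeyLength, input.Length - i); len > 0; len--)
            {
                string value;
                if (map.TryGetValue(input.Substring(i, len), out value))
                {
                    output.Append(value);
                    i += len;
                    matched = true;
                    break;
                }
            }

            if (!matched)
            {
                // No key starts at this position: copy the character as typed.
                output.Append(input[i]);
                i++;
            }
        }

        return output.ToString();
    }
}
```

Called with a dictionary containing a and aa, the input aaa comes out as ஆஅ. If a different split is wanted, the inner loop is the place to change the rule.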

Edit:
The approach presented above is mostly suitable for processing while the user is typing. I'm not sure how such a solution performs on longer files. When processing text that already exists, it may be better to look at it the other way around: look for the patterns, longest first, and substitute them.

foreach (string pattern in substitutionDict.Keys.OrderByDescending(x => x.Length))
{
    makeSubstitution(pattern, substitutionDict[pattern]);
}
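
A runnable version of that loop might look like the sketch below. `PatternTransliterator` is a hypothetical name, and the sketch assumes that no replacement value contains text that a later key could match. That holds when the keys are Latin letters and the values are Tamil characters, but it would not hold for arbitrary tables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class PatternTransliterator
{
    public static string Transliterate(string input, IDictionary<string, string> substitutionDict)
    {
        var result = new StringBuilder(input);

        // Replace the longest patterns first so "aa" is consumed before "a" can match.
        foreach (string pattern in substitutionDict.Keys.OrderByDescending(x => x.Length))
        {
            result.Replace(pattern, substitutionDict[pattern]);
        }

        return result.ToString();
    }
}
```

Each pattern triggers a full pass over the text, so the cost grows with the number of keys.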
Jarek
  • Is there any .dll available for transliteration? – Sutharshan Suthan May 13 '13 at 13:36
  • Never had to do this so not sure. http://stackoverflow.com/questions/10027001/does-net-transliteration-library-exists this may be what you are looking for. If not - google it, something may be there. And if your requirements are not too complex - creating something yourself won't be that hard anyway – Jarek May 13 '13 at 19:12