I have a text file being read into my Visual Studio 2012 program that contains letters such as ÀÁÄàáäÈÉËèéëÌÍÏìíïÒÓÖòóöÙÚÜùúü, which I'd like to convert to regular English letters such as A, E, I, O, U (case is irrelevant). When my code encounters one of these, I see it in the debugger as a black diamond with a question mark in it. The various methods I've tried for decoding the letter into something I can then pass to Regex.Replace() haven't been successful, and I haven't found the right technique while searching the net.

Thanks for any help.

  • You'll need to know how the file is encoded. You'll also need to know how the file is read into your program. You've only given a vague description and left us to try to guess what your code might be. I think you need to take a little more time over your question and try to imagine how it looks from our perspective. That is, imagine that you cannot see your code and all that can be seen are the words above. Also, one wonders why you would wish to settle for such a poor solution. Why not do it right? Why not read the characters properly? What's wrong with non-English text? – David Heffernan Apr 20 '14 at 18:59
  • Possible duplicate of http://stackoverflow.com/questions/249087/how-do-i-remove-diacritics-accents-from-a-string-in-net?lq=1 and several others. – ClickRick Apr 20 '14 at 19:04
  • David wrote: "You'll need to know how the file is encoded." Let me make that: "You'll NEED to know how the file is encoded." – TaW Apr 20 '14 at 19:07
  • Sorry, this seemed like a straightforward "trap and convert" problem. How does one determine how a file was encoded when it is just handed over with no information? The file is being read into my program with: string fileData = File.ReadAllText(fileName); – user3554581 Apr 21 '14 at 13:51
  • Here's the rest of my post (please forgive me, this is my first time posting to this site): This started out as a homework assignment to count the alphabetic letters in a large text file, but having encountered the foreign characters, I wanted to investigate further. You are right, I should count the characters and display them properly... I just need to figure out how to do that. – user3554581 Apr 21 '14 at 14:00

1 Answer

I solved the problem by simply opening the .txt file in Notepad and using Save As with the encoding set to UTF-8. Now my program recognizes the characters, counts them correctly, and displays them properly on my form.
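For anyone who cannot re-save the file, here is a minimal sketch of doing the same thing in code. It rests on an assumption that is not confirmed anywhere above: that the original file was saved in the Windows-1252 ("ANSI") code page, which is what Notepad typically used by default on Western-language Windows. File.ReadAllText(fileName) with no encoding argument looks for a byte order mark and otherwise decodes as UTF-8, so Windows-1252 bytes for accented letters come out as the replacement character (the black diamond). The class and method names (AccentText, ReadLegacyText, RemoveDiacritics) are hypothetical, made up for illustration. The second method shows the approach from the linked duplicate question: decompose the string and drop the combining marks.

```csharp
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical helper class; the names are illustrative only.
public static class AccentText
{
    // ASSUMPTION: the file is Windows-1252 ("ANSI"). If the real encoding
    // differs, pass the matching Encoding instead.
    // On .NET Framework (the VS 2012 era) code page 1252 is available out
    // of the box. On .NET Core / .NET 5+ it is only available after calling
    // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
    public static string ReadLegacyText(string path)
    {
        return File.ReadAllText(path, Encoding.GetEncoding(1252));
    }

    // Strips accents: "é" is decomposed into "e" plus a combining acute
    // accent (Unicode form D), then the combining marks are discarded.
    public static string RemoveDiacritics(string text)
    {
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var result = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Keep everything except non-spacing (combining) marks.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                result.Append(c);
            }
        }

        // Recompose whatever is left into the usual composed form.
        return result.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

With this in place, something like AccentText.RemoveDiacritics(AccentText.ReadLegacyText(fileName)) would give text where À, Á and Ä all count as A. Note that this only works once the file has been decoded correctly: a character that has already been read as the black diamond has lost its original letter and cannot be recovered by Regex.Replace() or anything else.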