I built a site that is driven by text files. The site is in Hebrew and uses Razor Pages on ASP.NET Core 2.
Environment: Visual Studio 2017 with all updates.

In the _Layout file I have:

<meta charset="utf-8" />
<meta lang="he" dir="rtl" />

Also, in site.css:

body {
    background-color:black;

    padding-top: 50px;
    padding-bottom: 20px;
    direction:rtl; /*right to left*/
    font-family: 'opensanshebrew'; /*defined above it*/
    font-size:16px;
}

In a Razor page named Poems, I simply want to show the first line of every txt file in the "Poems" folder under wwwroot. It goes like this:

<div class="row">
    <div id="fileListArea" class="col-lg-8">
        <h2>רשימת השירים שכתבתי:</h2>

        @foreach (var p in Model.PoemsList)
        {
            <span>@p.Title</span><br />
        }
    </div>
</div>

[I'll put it on a grid later]

In the code-behind:

public void OnGet()
{
    string tpath = _env.WebRootPath + "\\Poems";
    Filelist = fileTools.GetFileList(tpath);
    PoemsList = new List<PoemCover>();
    foreach(string fn in Filelist)
    {
        PoemsList.Add(new PoemCover(fileTools.GetTitle(tpath + "\\" + fn, Encoding.ASCII), fn));
    }
}

In fileTools:

public static string GetTitle(string pathWfilename,Encoding encd)
{
    string rslt;

    try
    {
        using (StreamReader strm = new StreamReader(pathWfilename, encd))
        {
            string nextLine;
            rslt = strm.ReadLine();
            nextLine = strm.ReadLine();

            if (nextLine != null)
                if (nextLine.Length >= 2)
                {
                    int didx = NthOccurence(rslt, ' ', 3);
                    if (didx < 2)
                    { rslt = (rslt.Substring(0, rslt.Length - 1)) + "..."; }
                    else { rslt = (rslt.Substring(0, didx)) + "..."; }
                }
        }
    }
    catch(IOException ex)
    {
        rslt = "Error reading Title from - " + pathWfilename + " - " + ex.Message;
        Console.WriteLine("{0}", rslt);
    }

    return rslt;
}

It works, but the lines come out as gibberish.
I've tried:

fileTools.GetTitle(tpath + "\\" + fn, Encoding.ASCII)
fileTools.GetTitle(tpath + "\\" + fn, Encoding.Unicode) 
fileTools.GetTitle(tpath + "\\" + fn, Encoding.UTF8)
fileTools.GetTitle(tpath + "\\" + fn, Encoding.UTF7) 
fileTools.GetTitle(tpath + "\\" + fn, Encoding.UTF32) 
fileTools.GetTitle(tpath + "\\" + fn, Encoding.GetEncoding("Windows-1255")) 
//which gives error of no such encoding

Some show gibberish, some show different kinds of question marks, and one shows some weird fonts.

How can I read Hebrew text files?

DJ5000

1 Answer

I don't know Hebrew, but I found a Hebrew string on Google to test with, so here it goes:

TL;DR:

 GetTitle(@"C:\dataUpload\test.txt",Encoding.GetEncoding("windows-1255")) ; 

Screenshot of my test:

[Image: output of the test run]

I used your GetTitle method, only simplified to suit my tests.

My test.txt file looks like this:

גליון_1

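Note that "windows-1255" in GetEncoding is written in lowercase here. Encoding names should be matched case-insensitively, though, so the casing alone may not be what makes the difference.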
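If GetEncoding still throws "not a supported encoding name" on .NET Core, one likely cause is that code-page encodings such as windows-1255 are not registered by default there, which is what the mention of Encoding.RegisterProvider in that error message hints at. A minimal sketch, assuming the System.Text.Encoding.CodePages package is referenced by the project; HebrewFileReader and ReadFirstLine are hypothetical names used only for illustration:

```csharp
using System;
using System.IO;
using System.Text;

public static class HebrewFileReader
{
    // Runs once before first use: registers the code-page provider so that
    // legacy encodings such as windows-1255 can be resolved by name.
    // Assumption: the System.Text.Encoding.CodePages package is available.
    static HebrewFileReader()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Reads the first line of a text file saved in the Windows-1255 code page.
    // Returns null if the file is empty.
    public static string ReadFirstLine(string path)
    {
        Encoding hebrew = Encoding.GetEncoding("windows-1255");
        using (var reader = new StreamReader(path, hebrew))
        {
            return reader.ReadLine();
        }
    }
}
```

In an ASP.NET Core app the RegisterProvider call could instead go once in Startup, after which the existing GetTitle(path, Encoding.GetEncoding("windows-1255")) call should be able to resolve the name. This only helps if the files really are saved as Windows-1255.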

Good luck with your progress and feel free to contact me for any information.

PS. I don't understand Hebrew, so in case my answer is off, give me some Hebrew strings and the expected output to work with. Also check which encoding your txt files were saved with. I saved my .txt file as UTF-8, and now Encoding.UTF8 works too.
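One rough way to check how a file was saved is to look at its first bytes for a byte order mark (BOM). The sketch below is only a heuristic: EncodingSniffer, DescribeBom and DescribeFile are hypothetical names, and a result of "no BOM" does not identify the encoding, since both UTF-8 without a BOM and legacy code pages such as Windows-1255 start that way.

```csharp
using System;
using System.IO;

public static class EncodingSniffer
{
    // Returns a label for the byte order mark at the start of the data,
    // or "no BOM" when none of the known marks is present.
    public static string DescribeBom(byte[] head)
    {
        // UTF-32 LE is tested before UTF-16 LE because both start with FF FE.
        if (head.Length >= 4 && head[0] == 0xFF && head[1] == 0xFE && head[2] == 0x00 && head[3] == 0x00)
            return "UTF-32 LE";
        if (head.Length >= 4 && head[0] == 0x00 && head[1] == 0x00 && head[2] == 0xFE && head[3] == 0xFF)
            return "UTF-32 BE";
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8";
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE";
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE";
        return "no BOM";
    }

    // Reads up to the first four bytes of the file and describes its BOM.
    public static string DescribeFile(string path)
    {
        byte[] head = new byte[4];
        int read;
        using (FileStream fs = File.OpenRead(path))
        {
            read = fs.Read(head, 0, head.Length);
        }
        Array.Resize(ref head, read); // keep only the bytes actually read
        return DescribeBom(head);
    }
}
```

Calling EncodingSniffer.DescribeFile on one of the poem files would then give a first hint about which Encoding to pass to GetTitle.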

S.Fragkos
  • System.ArgumentException occurred HResult=0x80070057 Message='windows-1255' is not a supported encoding name. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method. Source= StackTrace: at System.Globalization.EncodingTable.internalGetCodePageFromName(String name) ... – DJ5000 Jan 12 '18 at 10:28
  • I can send you one of the txt files – DJ5000 Jan 12 '18 at 10:30
  • Yup, send me a file to check it out. – S.Fragkos Jan 12 '18 at 10:32
  • upload somewhere and maybe send link? no other idea. – S.Fragkos Jan 12 '18 at 10:39