4

I have some C# code that is working with a text file, and i can not seem to get it to work properly with blank or null (whitespace) lines.

My code:

        while (!file.EndOfStream)
        {
            line = file.ReadLine();

            bool isComment = (line[0] == '/') && (line[1] == '/');
            bool isPoint = (line[0] == '(') && (line[line.Length - 1] == ')');
            bool isWhiteSpace = string.IsNullOrEmpty(line);

            Debug.Log("Comment: " + isComment + ", Point: " + isPoint + ", WhiteSpace: " + isWhiteSpace + "Value: '" + line + "'");

            if (!isComment && !isPoint && !isWhiteSpace) { Application.Quit(); }
            else if (isPoint)
            {
                //Strip parenthesis
                line = line.Remove(line.Length - 1, 1).Remove(0, 1);

                //break into float array
                string[] arr = line.Split(',');

                float xVal = float.Parse(arr[0]);
                float yVal = float.Parse(arr[1]);
                float zVal = float.Parse(arr[2]);

                Vector3 currentVector = new Vector3(xVal, yVal, zVal);
                results.Add(currentVector);
            }
        }

You can see that i happen to be doing things with Vector3. If the line is a comment line or a whitespace line, i want it to do nothing. If it notices parentheses, i want it to assume it is a Vector3 and parse it. Finally, if it is a line that is none of these, i want it to stop entirely. Here is a sample text file that i have created just with Notepad:

//This is a comment
// ... and so is this!
(0, -1.5, 3)
(1, 4, 1.23)

(3, 5, 2)

Notice that there is a gap between the second and third Vector3. In this particular case, the line is completely empty, it does not contain spaces or anything, i simply pressed [Enter][Enter] in Notepad. When my script reaches this line, it seems to trigger the file.EndOfStream boolean.... but its NOT the end of the file! How can i fix this? Is there a more appropriate condition for my while loop? I have also tried reading the line in and checking if it is null as the while condition, which is a more popular way to approach this, but this way of doing things also does not work for my case.

** Note: "file" is a variable of type StreamReader **

Drifter64
  • 1,103
  • 3
  • 11
  • 30
  • 7
    An empty string and a string composed entirely of white space aren't the same thing. A new line is a whitespace character, use `string.IsNullOrWhitespace` if you want to check for strings that are empty, or null, *or* purely whitespace. – Preston Guillot Nov 29 '15 at 22:57
  • I think it doesnt trigger file.EndOfStream but it raise exception and you just eat that exception with empty catch block. That happens because you first check `line[0]` but if line is empty you are out of range. If performance isn't important, you can try regex. – Posix Nov 29 '15 at 23:02
  • Okay, i did not know this.... so an "Empty" line is something like this: string line = ""; ? I think, if your definition holds, i should never get an empty line in a file, only whitespace or null. Is my thinking correct? – Drifter64 Nov 29 '15 at 23:02
  • I doubt it's failing on the `EndOfStream` check, it's almost definitely quitting because of `if (!isComment && !isPoint && !isWhiteSpace) { Application.Quit(); }` – Rob Nov 29 '15 at 23:02
  • Steve: i want the program to quit alltogether if a line is not a comment, whitespace/blank, or Vector3. – Drifter64 Nov 29 '15 at 23:05
  • Great thinking all of you. I think somewhere the answer is in this. I will do some tests, i think it has to do with David's answer with line[0], so i will use prestons thoughts about IsNullOrWhitespace first to check the line before the other booleans. – Drifter64 Nov 29 '15 at 23:06
  • It appears @DavidKosorin is correct, the boolean for isNullOrEmpty returns true now, and then in the next line of code i check 'line[0]' which throws an index out of bounds exception. David, you may submit your solution as an answer if you like. **NOTE: there is no such thing as "string.isNullOrWhitespace" that i can see. At least, visual studio says it's not a proper method. ** – Drifter64 Nov 29 '15 at 23:13
  • https://msdn.microsoft.com/en-us/library/system.string.isnullorwhitespace(v=vs.110).aspx – Preston Guillot Nov 29 '15 at 23:15
  • Yeah... i saw the documentation. I'm not sure how this works but i have Microsoft Visual Studio free version, as part of my installation of Unity Game Engine. Perhaps i have an old .NET framework. When i type 'string.is' i only get the options 'string.isInterned' and 'string.isNullOrEmpty'. – Drifter64 Nov 29 '15 at 23:24

2 Answers2

8

This is more of a style note than an answer, although this also will prevent the issues you were seeing.

First, with a StreamReader when you call ReadLine you will only receive a null result when you reach the end of the file. You also don't care about whitespace at the start and end of your lines, and presumably don't care about lines that are entirely whitespace either. So you can use that to test for end of file and empty lines this way:

string line;
while ((line = file.ReadLine()) != null)
{
    line = line.Trim();
    if (line == "")
        continue;
}

Next you have some tests for start/end characters that is still going to cause problems in some situations. Specifically, reading the second character in a line that has only one character is going to cause an exception.

Instead of using indexing on a string of untested length you can use the StartsWith and EndsWith methods to do your tests:

bool isComment = line.StartsWith("//");
bool isPoint = line.StartsWith("(") && line.EndsWith(")");

Finally, in your code that parses the point value you assume that any line that starts with ( and ends with ) will have at least 2 commas in it, and that the text will correctly parse. This is a bad assumption.

The better way to handle all of this is to detect and deal with each case as you go, with the parse functionality broken out to a method you can reuse

Here's my version:

public class Program
{
    public static void Main()
    {
        List<Vector3> results = new List<Vector3>();
        using (var file = System.IO.File.OpenText(@"C:\temp\test.txt"))
        {
            string line;
            while ((line = file.ReadLine()?.Trim()) != null)
            {
                // skip empty lines and comments
                if (line == string.Empty || line.StartsWith("//"))
                    continue;
                // parse all other lines as vectors, exit program on error
                try
                {
                    Vector3 vector = ParseVector(line);
                    results.Add(vector);
                }
                catch (FormatException e)
                {
                    Console.WriteLine("Parse error on line: {0}", line);
                    throw;
                }
            }
        }

        foreach (var v in results)
            Console.WriteLine("({0},{1},{2})", v.X, v.Y, v.Z);
    }

    // parse string in format '(x,y,z)', all as floats
    // throws FormatException on any error
    public static Vector3 ParseVector(string text)
    {
        if (!text.StartsWith("(") || !text.EndsWith(")"))
            throw new FormatException();
        string[] parts = text.Substring(1, text.Length - 1).Split(',');
        if (parts.Length != 3)
            throw new FormatException();
        float x = float.Parse(parts[0]);
        float y = float.Parse(parts[1]);
        float z = float.Parse(parts[2]);
        return new Vector3(x, y, z);
    }
}

If you prefer not to use exceptions you could return null or use the pattern used by the TryParse methods, returning a boolean success/failure indicator and using an out parameter to write the results to. I prefer exceptions in this case.

Corey
  • 15,524
  • 2
  • 35
  • 68
0

David was correct. I was reaching an index out of bounds exception. Below is my corrected and working code:

        while (!file.EndOfStream)
        {
            line = file.ReadLine();

            bool isWhiteSpace = false;
            bool isComment = false;
            bool isPoint = false;

            isWhiteSpace = string.IsNullOrEmpty(line);

            if (!isWhiteSpace)
            {
                isComment = (line[0] == '/') && (line[1] == '/');
                isPoint = (line[0] == '(') && (line[line.Length - 1] == ')');
            }
            Debug.Log("Comment: " + isComment + ", Point: " + isPoint + ", WhiteSpace: " + isWhiteSpace + "Value: '" + line + "'");

            if (!isComment && !isPoint && !isWhiteSpace) { Application.Quit(); }
            else if (isPoint)
            {
                //Strip parenthesis
                line = line.Remove(line.Length - 1, 1).Remove(0, 1);

                //break into float array
                string[] arr = line.Split(',');

                float xVal = float.Parse(arr[0]);
                float yVal = float.Parse(arr[1]);
                float zVal = float.Parse(arr[2]);

                Vector3 currentVector = new Vector3(xVal, yVal, zVal);
                results.Add(currentVector);
            }
        }
Drifter64
  • 1,103
  • 3
  • 11
  • 30
  • You might post this to code review to have some improvements suggested. – usr Nov 29 '15 at 23:22
  • I am aware there are some inefficiencies. I am just happy to find the more fatal of the problems. This is not production code, and is made for my personal use. – Drifter64 Nov 29 '15 at 23:25
  • 1
    I was suggesting making use of CR for learning and personal development. Much more important than improving a single piece of code. – usr Nov 29 '15 at 23:26
  • fair enough. Where can i submit this? I presume you are talking about a specific location here on SO. – Drifter64 Nov 29 '15 at 23:28
  • There is a Code Review Stack Exchange site. Google for it. Also observe the particular rules of that site. You might leave a link to your submission here for the interested. – usr Nov 29 '15 at 23:29
  • Please read a [Guide to Code Review for Stack Overflow users](http://meta.codereview.stackexchange.com/questions/5777/a-guide-to-code-review-for-stack-overflow-users) – holroy Nov 29 '15 at 23:31