I came across this earlier today and was not sure why it happens.
I have the following code that sets the internal position of the file stream to a given location so I can read a number of lines from that position. It is similar to this other post, but when I use stream.Seek I see strange results:
    StringBuilder b = new StringBuilder();
    using (var stream = _streamFactory.CreateStream())
    using (var streamReader = new System.IO.StreamReader(stream, _streamFactory.Encoding))
    {
        stream.Seek(startPosition, System.IO.SeekOrigin.Begin);
        string value;
        for (int i = 0; i < lines; i++)
        {
            if ((value = streamReader.ReadLine()) != null)
            {
                b.AppendLine(value);
            }
        }
    }
I am reading the file using the UTF-8 encoding, so I know there are extra bytes at the start of the file (the byte order mark) that denote this but are not part of the text I want to extract.
Say, for example, I have the following text in the file:
Hello my name is bob
If I set startPosition to 0, my result is Hello my name is bob. However, when I set startPosition to 1, I don't get ello my name is bob but rather @@Hello my name is bob, where @@ are 2 bytes from the encoding marker.
So my question is: why do I get the correct line when I call Seek(0) and then do a ReadLine, while Seek(1) returns the 2nd and 3rd bytes of the encoding marker in front of the text? Seek(3) also yields the same result as Seek(0). If this were consistent, I would have expected Seek(0) to return @@@Hello my name is bob.
Also how do I know how many extra bytes are at the start of the file without reading it (but knowing the encoding)?
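The closest thing I have found so far is Encoding.GetPreamble(), which returns the byte order mark an encoding would write. Below is a minimal sketch of how I think it could be used; the PreambleInfo class and PreambleLength method are names I made up for illustration. As far as I can tell it only reports what the encoding would emit, not whether a particular file actually starts with those bytes (a UTF-8 file can be saved without a BOM), so I am not sure it fully answers the question.

```csharp
using System;
using System.Text;

public static class PreambleInfo
{
    // Hypothetical helper: number of bytes this encoding would emit as a
    // byte order mark. It does not inspect any file.
    public static int PreambleLength(Encoding encoding)
    {
        return encoding.GetPreamble().Length;
    }

    public static void Main()
    {
        Console.WriteLine(PreambleLength(Encoding.UTF8));           // UTF-8 with BOM
        Console.WriteLine(PreambleLength(new UTF8Encoding(false))); // UTF-8 without BOM
        Console.WriteLine(PreambleLength(Encoding.Unicode));        // UTF-16 little endian
    }
}
```

If that is right, startPosition would need to be offset by this length whenever the file really does begin with the preamble.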
I tried looking at the disassembled code and had to stop before my brain went on strike.
Note: the stream factory (_streamFactory) in this case just creates a FileStream. I do this so I can unit test this code using a MemoryStream.
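For anyone who wants to try this without my factory, here is a stripped-down reproduction sketch that uses a MemoryStream in place of the file. The SeekRepro class and its BuildFile and ReadLineFrom methods are made up for this example and are not my real code; the in-memory "file" is assumed to be the UTF-8 byte order mark followed by the sample text.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class SeekRepro
{
    // Builds an in-memory "file": the UTF-8 preamble followed by the encoded text.
    public static byte[] BuildFile(string text)
    {
        Encoding encoding = Encoding.UTF8;
        return encoding.GetPreamble().Concat(encoding.GetBytes(text)).ToArray();
    }

    // Same shape as the code in the question: seek the underlying stream
    // before the first read, then read a single line through the reader.
    public static string ReadLineFrom(byte[] file, long startPosition)
    {
        using (var stream = new MemoryStream(file))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            stream.Seek(startPosition, SeekOrigin.Begin);
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        byte[] file = BuildFile("Hello my name is bob\r\n");
        foreach (long position in new long[] { 0, 1, 3 })
        {
            string line = ReadLineFrom(file, position);
            // Print the length as well, since the extra characters may not be visible.
            Console.WriteLine("Seek({0}) -> {1} chars: {2}", position, line.Length, line);
        }
    }
}
```

With this I would expect Seek(0) and Seek(3) to both give the plain 20-character line and Seek(1) to give a longer one, which matches what I see against the real file.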