Resetting a StreamReader leads to strange behaviour: the first assert succeeds whereas the second fails. One (bad) workaround is to reset the stream to position 3 instead of 0: sr.BaseStream.Position = 3;

using (var sr = new StreamReader(@"c:\temp\test.txt", Encoding.UTF8)) // test.txt is encoded in UTF8
{
     var read = sr.ReadLine();
     Assert.AreEqual("fromfile", read); // ok
     sr.BaseStream.Position = 0;
     sr.DiscardBufferedData();
     read = sr.ReadLine();
     Assert.AreEqual("fromfile", read); //fails
}
Soner Gönül
sthiers
  • What *does* it contain? You are resetting the *stream*, not the *reader*. Could it be that the first 3 bytes are the BOM and the *reader* automatically skips it? – Panagiotis Kanavos Mar 04 '15 at 15:04
  • Yes, I assume something like that is happening (but I wish I understood it better). It was Jon Skeet's suggestion in this thread: http://stackoverflow.com/questions/831417/how-do-you-reset-a-c-sharp-net-textreader-cursor-back-to-the-start-point/831436?noredirect=1#comment45980418_831436 – sthiers Mar 04 '15 at 15:15
  • Have you tried commenting out either the `sr.BaseStream.Position` *or* the `sr.DiscardBufferedData();` line to see which is causing your problem? Perhaps it's one or the other, or maybe the combination is causing an issue. – krillgar Mar 04 '15 at 15:32
  • As an aside, this problem is yet another excellent demonstration of why misusing the BOM as a marker for UTF-8 is a bad idea and not recommended by the Unicode standard. If the file is under your control, consider not using a BOM and assume all files you read are UTF-8 unless proven otherwise. – Jeroen Mostert Mar 04 '15 at 15:53
  • @krillgar: commenting out `sr.DiscardBufferedData();` does not help – sthiers Mar 05 '15 at 08:01

1 Answer

You just didn't manage to truly reset the object. There's a private field in the class named _checkPreamble. It will be set to false since it was already checked. You can hack it:

using System.Reflection;
...
     var fi = typeof(StreamReader).GetField("_checkPreamble", BindingFlags.NonPublic | BindingFlags.Instance);
     fi.SetValue(sr, true);
     read = sr.ReadLine();
     Assert.AreEqual("fromfile", read); // okay now

Of course you don't really want to write code like this. The solution is very trivial, just create a new StreamReader object.
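
A minimal sketch of the "create a new StreamReader" approach, assuming you want to keep the underlying stream open and rewind it between reads. The class name RereadDemo, the helper ReadFirstLine and the temporary file are hypothetical and only there for illustration. The fresh reader checks for the preamble again, so the BOM is skipped on every pass:

```csharp
using System;
using System.IO;
using System.Text;

public static class RereadDemo
{
    // Hypothetical helper: rewinds the stream and reads the first line
    // through a brand-new StreamReader, so the BOM check runs again.
    public static string ReadFirstLine(Stream stream)
    {
        stream.Position = 0;
        // leaveOpen: true keeps the underlying stream usable after the reader is disposed.
        using (var sr = new StreamReader(stream, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            return sr.ReadLine();
        }
    }

    public static void Main()
    {
        // Assumption: a temporary file stands in for c:\temp\test.txt; it is written with a UTF-8 BOM.
        var path = Path.GetTempFileName();
        File.WriteAllText(path, "fromfile\r\nsecond line", new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));

        using (var fs = File.OpenRead(path))
        {
            Console.WriteLine(ReadFirstLine(fs) == "fromfile"); // first pass
            Console.WriteLine(ReadFirstLine(fs) == "fromfile"); // second pass, new reader
        }

        File.Delete(path);
    }
}
```

The five-argument StreamReader constructor with leaveOpen requires .NET 4.5 or later. On older frameworks, reopening the file for each pass has the same effect.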

Hans Passant
  • Or, if the OP is certain the file contains a BOM, just call `sr.Read();` – Panagiotis Kanavos Mar 04 '15 at 15:43
  • @Hans: Yes, this works. My conclusion is that you cannot reset the object properly; it's simply not meant to be used like that. – sthiers Mar 05 '15 at 08:07
  • @Hans: another way could be (but still not clean): `sr.BaseStream.Position = sr.CurrentEncoding.GetPreamble().Length; ` – sthiers Mar 05 '15 at 08:52
  • No, major fail whale if the file does not in fact contain a BOM. Which is quite common, *nix programmers insist that it isn't necessary. Mostly because their shell tends to fall over pretty badly when it does, the shebang problem. – Hans Passant Mar 05 '15 at 09:59
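
To illustrate the objection above, here is a hedged sketch of a variant that skips the preamble only when the stream actually starts with it. PreambleHelper and PreambleLength are hypothetical names, not part of the framework, and this remains a workaround rather than a clean reset:

```csharp
using System.IO;
using System.Linq;
using System.Text;

public static class PreambleHelper
{
    // Hypothetical helper: returns the preamble (BOM) length if the stream
    // begins with the encoding's preamble, otherwise 0.
    public static int PreambleLength(Stream stream, Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();
        if (preamble.Length == 0)
            return 0; // this encoding defines no BOM

        var head = new byte[preamble.Length];
        stream.Position = 0;

        // Stream.Read may return fewer bytes than requested, so loop until done or EOF.
        int read = 0;
        while (read < head.Length)
        {
            int n = stream.Read(head, read, head.Length - read);
            if (n == 0)
                break; // stream is shorter than the preamble
            read += n;
        }

        return read == preamble.Length && head.SequenceEqual(preamble)
            ? preamble.Length
            : 0;
    }
}
```

With this in place, the reset in the question would become `sr.BaseStream.Position = PreambleHelper.PreambleLength(sr.BaseStream, Encoding.UTF8);` followed by `sr.DiscardBufferedData();`. A file without a BOM is then rewound to position 0 rather than losing its first three bytes.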