4

I'm trying to use the .NET API to seek in a large data file, but I can't get it to work. Here is my code:

function check_logs{
  $pos = 8192
  $count = 1
  $path = 'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Log\ERRORLOG.2'
  $br = 0
  $reader = [System.IO.File]::OpenText($path)
  $reader.DiscardBufferedData()
  $reader.BaseStream.Seek(0, [System.IO.SeekOrigin]::Begin)
  for(;;){
    $line = $reader.ReadLine()
    if($line -ne $null){$br = $br + [System.Text.Encoding]::UTF8.GetByteCount($line)}
    if($line -eq $null -and $count -eq 0){break}
    if($line -eq $null){$count = 0}
    elseif($line.Contains('  Error:')){
      Write-Host "$line  $br"
    }
  }
}

If I use 0 as a parameter for the seek method it seeks from the beginning as expected but it also writes 0 out to the console before it writes the lines read. Example:

 0
 2011-08-31 09:26:36.31 Logon       Error: 17187, Severity: 16, State: 1.  4101
 2011-08-31 09:26:36.32 Logon       Error: 17187, Severity: 16, State: 1.  4489
 2011-08-31 09:26:38.25 Logon       Error: 17187, Severity: 16, State: 1.  4929
 2011-08-31 09:26:38.25 Logon       Error: 17187, Severity: 16, State: 1.  5304
 2011-08-31 09:26:43.75 Logon       Error: 17187, Severity: 16, State: 1.  6120

If I try to seek using 4096 instead of 0 it only writes out:

4096

I would have thought it would write out the same lines as the first one did apart from the first two.

Can someone see the problem? I had another question that got me to this. For further background see this

EDIT: Still trying to figure this out. Does anyone know where else I could try to find information regarding this problem? Is it possible to send questions to the Microsoft scripting guy?

Best regards

Gísli


3 Answers

5

The Seek method returns the new position within the stream, which is why you are having a number printed out.

As to why you are not getting an output:

  1. Confirm the file is greater than 4K in size.
  2. Try printing out all lines, rather than just lines with the word "Error" in them. That might give you a clue.
  3. StreamReader is a buffered wrapper around the base stream, so Seek and Position may not work quite like you expect. Consider http://geekninja.blogspot.com/2007/07/streamreader-annoying-design-decisions.html. Try adding in a call to $reader.DiscardBufferedData() before the seek.
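
The buffering described in point 3 can be seen with a small experiment. This is only a sketch: the file name `buffer-demo.txt` and the variable names are made up for illustration, and the sample file is written as ASCII so that one character is one byte.

```powershell
# Hypothetical sample file; ASCII so that one character is one byte.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'buffer-demo.txt'
[System.IO.File]::WriteAllText($path, "one`r`ntwo`r`nthree`r`n", [System.Text.Encoding]::ASCII)

$reader = New-Object System.IO.StreamReader($path, [System.Text.Encoding]::ASCII)
try {
    $first = $reader.ReadLine()
    # The reader has pulled the whole small file into its buffer,
    # so the base stream is already well past the first line.
    $afterRead = $reader.BaseStream.Position

    # Seeking the base stream alone does not touch the reader's buffer,
    # so this read still continues from the buffered data.
    $null = $reader.BaseStream.Seek(0, [System.IO.SeekOrigin]::Begin)
    $stale = $reader.ReadLine()

    # Dropping the buffer makes the next read start at the stream position.
    $null = $reader.BaseStream.Seek(0, [System.IO.SeekOrigin]::Begin)
    $reader.DiscardBufferedData()
    $fresh = $reader.ReadLine()
}
finally {
    $reader.Close()
}
"first=$first afterRead=$afterRead stale=$stale fresh=$fresh"
```

The point of the sketch is that `BaseStream.Position` tells you how far the reader has buffered, not which line `ReadLine()` last returned, which is why a seek without `DiscardBufferedData()` appears to do nothing.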
RB.
  • @RB - Thanks for the reply. It doesn't matter what I use instead of 0, if I put 1 the output is 1. Also if you look at the output when I use 0, the last number is supposed to be the number of bytes read. Or at least I think it is... – Gisli Sep 01 '11 at 12:08
  • Exactly. Seek is returning the offset you have seeked to. So if you seek to 1234, it will return 1234. See "Return Value" in http://msdn.microsoft.com/en-us/library/system.io.stream.seek.aspx – RB. Sep 01 '11 at 12:14
  • @RB - OK that explains a lot. I thought that it would move me to a particular position in the stream. How do I make the read start at that position then? – Gisli Sep 01 '11 at 12:21
  • @Gisli. Not too sure, but I think calling DiscardBufferedData after the seek will force the StreamReader to re-populate the buffer from the seek'ed position, thus giving you what you want. – RB. Sep 01 '11 at 12:25
  • Ah - this article suggests calling it **before** the seek, which makes more sense actually! Try both and let me know which works :) http://msdn.microsoft.com/en-us/library/system.io.streamreader.discardbuffereddata.aspx – RB. Sep 01 '11 at 12:28
  • @RB - Thanks but this doesn't work. I have tried to use the DiscardBufferedData() function both before and after the seek but it makes no difference. – Gisli Sep 01 '11 at 12:29
0

I had a similar problem: the position returned by Seek was being printed to the console. Assigning the return value to a variable solved it.

So instead of:

$reader.BaseStream.Seek(0, [System.IO.SeekOrigin]::Begin)

I had to write something like:

$pos = $reader.BaseStream.Seek(0, [System.IO.SeekOrigin]::Begin)
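
As a rough illustration of why this works: PowerShell writes any value that a statement returns and that is not captured to the output. Besides assigning to a variable, `$null =`, a `[void]` cast, and `| Out-Null` are commonly used to discard it. The sketch below uses a `MemoryStream` as a stand-in for the file so it runs anywhere; `$printed` and `$silent` are names invented for the example.

```powershell
# A 10-byte in-memory stream stands in for the log file.
$stream = New-Object System.IO.MemoryStream
$stream.SetLength(10)

# Uncaptured: the position returned by Seek becomes output of the script block.
$printed = & { $stream.Seek(0, [System.IO.SeekOrigin]::Begin) }

# Three ways to discard the return value; this script block emits nothing.
$silent = & {
    $null = $stream.Seek(2, [System.IO.SeekOrigin]::Begin)
    [void]$stream.Seek(4, [System.IO.SeekOrigin]::Begin)
    $stream.Seek(6, [System.IO.SeekOrigin]::Begin) | Out-Null
}

"printed=$printed silentIsNull=$($null -eq $silent) position=$($stream.Position)"
```

In every case the seek itself still happens; only the echoed number goes away.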

Regards, Thejasvi V

0

So I finally found the answer. For some reason unknown to me, I have to use a binary reader. Here is my complete function:

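 function check_logs{
   Write-Host "New test `n`n"
   $pos = 19192
   $path = 'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Log\ERRORLOG.2'
   $br = 0
   # Read raw bytes and decode them explicitly further down
   $b = New-Object System.IO.BinaryReader([System.IO.File]::Open($path,[System.IO.FileMode]::Open))
   # Number of bytes from $pos to the end of the file
   $required = $b.BaseStream.Length - $pos
   # Seek returns the new position; assign it to $null so it is not written to the console
   $null = $b.BaseStream.Seek($pos, [System.IO.SeekOrigin]::Begin)
   $bytes = $b.ReadBytes($required)
   # Decode the bytes as UTF-16 LE text
   $log = [System.Text.Encoding]::Unicode.GetString($bytes)
   $split = $log.Split("`n")
   foreach($s in $split)
   {
       if($s.Contains("  Error:"))
       {
           Write-Host $s  "`n"
       }
   }
   # Close() needs the parentheses; a bare $b.close does not call the method
   $b.Close()
 }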

Thanks for the help

Gísli

  • The problem comes from how text is interpreted. You seem to have gotten lucky because the underlying file data is a single-byte encoding. Your solution won't work on Unicode, and would only work sometimes with UTF-8 files that have non "English" characters in it. – Granger Apr 09 '20 at 20:26
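
One possible reading of this comment, not confirmed anywhere in the thread: a `StreamReader` that starts in the middle of a file never sees a byte order mark, so it falls back to whatever encoding it was created with, and the search string then fails to match if that encoding is wrong. The sketch below shows seeking with a `StreamReader` whose encoding is named explicitly. It assumes a UTF-16 LE file and an offset that falls on a character boundary; the file `seek-demo.log`, its contents, and the offset arithmetic are invented for the example.

```powershell
# Hypothetical UTF-16 LE sample file (written with a BOM) so the sketch is self-contained.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'seek-demo.log'
$text = "first line`r`nsecond   Error: 1`r`nthird   Error: 2`r`n"
[System.IO.File]::WriteAllText($path, $text, [System.Text.Encoding]::Unicode)

$fs = [System.IO.File]::Open($path, [System.IO.FileMode]::Open,
                             [System.IO.FileAccess]::Read, [System.IO.FileShare]::ReadWrite)
try {
    # Name the encoding explicitly; after a seek the reader cannot detect it from a BOM.
    $reader = New-Object System.IO.StreamReader($fs, [System.Text.Encoding]::Unicode)

    # Skip the 2-byte BOM plus 'first line' and its CRLF, at 2 bytes per character.
    $offset = 2 + 2 * ('first line'.Length + 2)
    $null = $fs.Seek($offset, [System.IO.SeekOrigin]::Begin)
    $reader.DiscardBufferedData()

    # Collect every remaining line that contains the search string.
    $found = @()
    while ($null -ne ($line = $reader.ReadLine())) {
        if ($line.Contains('  Error:')) { $found += $line }
    }
}
finally {
    $fs.Close()
}
$found
```

With a variable-width encoding such as UTF-8, an arbitrary byte offset can land in the middle of a character, so the offset would have to come from a position known to be a character boundary, for example one recorded at the end of an earlier read.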