2

I'm having a weird issue on using Delphi's TMemoryStream (or TFileStream for that matter). While reading a part of the stream into a byte array. Here's some code as an example.

procedure readfromstream();
var
     ms : TMemoryStream;
     buffer : array of byte;
     recordSize : Integer;
begin
  try
  begin
     ms := TMemeoryStream.Create();
     ms.LoadFromFile(<some_path_to_a_binary_file>);

     while ms.Position < ms.Size do
     begin
         buffer := nil;
         SetLength(buffer, 4);
         ms.ReadBuffer(buffer, 4);
         move(buffer[0], recordSize, 4);

         SetLength(buffer, recordSize);
         ms.Position := ms.Position - 4;           // Because I was having issues trying to read the rest of the record into a specific point in the buffer
         FillChar(buffer, recordSize, ' ');
         ms.ReadBuffer(buffer, recordSize);        // Issue line ???

         // Create the record from the buffer
     end;
  finally
  begin
     ms.Free();
  end;
end;

procedure is called as,

// Some stuff happens before it

readfromstream();

// Some stuff happens after it

on debugging, I can see that it reads the stream into the buffer and the record is stored in memory appropriately. The procedure then exits normally and the debugger steps out of the procedure, but I end up straight back into the procedure and it repeats.

By forcing the procedure to exit prematurely I believe the issue involves the ms.ReadBuffer(buffer, recordSize); but I don't see why it would cause the issue.

This procedure is called only once. My test data has only one entry/data. Any help would be greatly appreciated.

David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490
Clayton Johnson
  • 279
  • 4
  • 16
  • Welcome to SO. Did you analyze the call stack? The procedure has to have been called by someone or something. Does your program react to events? – Uwe Allner Aug 18 '15 at 06:32
  • Thanks. I forgot to mention that, yes I did and it only ever appears once in the call stack and that is from the function above it. I could have missed something though, any tips on how to double check it. – Clayton Johnson Aug 18 '15 at 06:42
  • FWIW, this is not your real code (there is no `TMemeoryStream` type, AFAICT). So there may be problems we can not see, because they might have got lost in the manual copying process, or we see problems here that are not in the orginal code, for the same reason. Please use copy and paste to copy your code here. Do not retype it. – Rudy Velthuis Aug 18 '15 at 11:11
  • As @David says, you are copying over the pointer and following bytes. Since this is on the stack, and the stack grows downward, that means you are overwriting a considerable part (413 bytes?) of the stack. This, in turn, means that several local variables, the return address on the stack as well any any parameters and probably a great deal of the calling stack frame(s), return addresses and parameters are overwritten. That kind of heavy stack corruption can cause all kinds of terrible problems, and an infinite loop is only one of them. – Rudy Velthuis Aug 18 '15 at 11:18
  • @RudyVelthuis thanks for the explanation, it's exactly what I thought was happening after David said I was causing a memory corruption. Unfortunately I'm still having an issue with it reading in the recordSize the way David explained, so I'll have to have another look at it later – Clayton Johnson Aug 19 '15 at 02:23
  • @Clayton I think I answered your question though. The fact that you have other bugs is a side issue. – David Heffernan Aug 19 '15 at 05:26

2 Answers2

5
FillChar(buffer, recordSize, ' ');

Here you are overwriting the dynamic array variable, a pointer, rather than writing to the content of the array. That causes a memory corruption. Pretty much anything goes at that point.

The call to FillChar is needless anyway. You are going to read into the entire array anyway. Remove the call to FillChar.

For future reference, to do that call correctly, you write it like this:

FillChar(Pointer(buffer)^, ...);

or

FillChar(buffer[0], ...);

I prefer the former since the latter is subject to range errors when the array length is zero.

And then

ms.ReadBuffer(buffer, recordSize);

makes the exact same mistake, writing to the array variable rather than the array, and thus corrupting memory.

That should be

ms.ReadBuffer(Pointer(buffer)^, recordSize);

or

ms.ReadBuffer(buffer[0], recordSize);

The first 4 lines inside the loop are clumsy. Read directly into the variable:

ms.ReadBuffer(recordSize, SizeOf(recordSize));

I recommend that you perform some sanity checks on the value of recordSize that you read. For instance, any value less than 4 is clearly an error.

There's not a lot of point in moving the stream pointer back and reading again. You can copy recordSize into the first 4 bytes and the array and then read the rest.

Move(recordSize, buffer[0], SizeOf(recordSize));
ms.ReadBuffer(buffer[SizeOf(recordSize)], recordSize - SizeOf(recordSize));

A memory stream also seems wasteful. Why read the entire file into memory? That's going to place stress on your address space for large files. Use a buffered file stream.

Letting the caller allocate the stream would give more flexibility to the caller. They could then read from any type of stream and not be constrained to use a disk file.

Your try/finally block is wrong. You must acquire the resource immediately before the try. As you have it, an exception in the constructor leads to you calling Free on an uninitialized variable.

A better version might be:

procedure ReadFromStream(Stream: TStream);
var
  buffer: TArray<byte>;
  recordSize: Integer;
begin
  while Stream.Position < Stream.Size do
  begin
    Stream.ReadBuffer(recordSize, SizeOf(recordSize));     
    if recordSize < SizeOf(recordSize) then
      raise ...;

    SetLength(buffer, recordSize);
    Move(recordSize, buffer[0], SizeOf(recordSize));
    if recordSize > SizeOf(recordSize) then
      Stream.ReadBuffer(buffer[SizeOf(recordSize)],
        recordSize - SizeOf(recordSize));

    // process record
  end;
end;       
Community
  • 1
  • 1
David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490
  • Thanks for the information I'll keep that in mind. However sadly that did not solve my issue, the procedure still gets called, still does everything correctly and then for some reason ends up right back in at the top of the procedure – Clayton Johnson Aug 18 '15 at 06:45
  • 1
    You have two memory corruptions. Refresh the page and read the answer again. – David Heffernan Aug 18 '15 at 06:50
  • Thank you @David for your help. I'm sure what you have provided me is correct, I just have one issue. When the recordSize is written to the file it is, in this case 413, when I was reading it in before, the long way, it was reading it as 413. When I read it directly into variable using `Stream.ReadBuffer(recordSize, SizeOf(recordSize));` it ends up being a number over 19 million. Any ideas why this would be the case? – Clayton Johnson Aug 18 '15 at 07:36
  • 1
    There's presumably some other mistake in your new code. – David Heffernan Aug 18 '15 at 07:43
  • So I finally got some time to go back and take a look at what I was doing wrong. When I was writing the records to file I was using `TFileStream.Write` so by changing that to use `TFileStream.WriteBuffer` it all worked perfectly fine. Thank you @David for your help – Clayton Johnson Aug 27 '15 at 03:27
0

Sorry I can't add a comment, being a newb and all :) This reply is based on my understanding of Clayton's code in light of his comment with the recordSize values.

The reason David's code is looping is probably that you are interpreting each four byte "block" is a number. I'll assume your first Stream.Readbuffer is correct and that the first four bytes in the file is a length.

Now, unless I'm mistaken, I expect the recordSize will usually be greater than SizeOf(recordSize), which I think should be 4 (the size of an int). Nevertheless, this line is meaningless here.

The SetLength is correct, given my previous assumption.

Now your Move is where the story hits a snag: you haven't read anything since you read the length! So before the move, you should have: bytesRead := Stream.Readbuffer(Pointer(buffer)^, recordSize);

Now you can check for EOF: if bytesRead <> recordSize then raise...;

...and move the buffer somewhere (if you wish): Move(buffer[0], dest[0], recordSize);

And you are positioned to read the next recordSize value.

Repeat until EOF.

Pat Heuvel
  • 96
  • 4
  • Just realised: that line that I said is meaningless, isn't! It's just not correct: the Read should return the bytes read into bytesRead and _this_ should be compared to SizeOf(recordSize). – Pat Heuvel Aug 18 '15 at 12:04
  • The reason it is looping is that he is reading a few hundred bytes (he said: 413 bytes) onto the stack, overwriting the local variable buffer, and probably the local stack frame, return address, etc. He does not write to the memory to which buffer points, he writes over the pointer called buffer itself, and over a lot more of the stack. – Rudy Velthuis Aug 18 '15 at 16:44
  • @Pat this is all wrong. Your analysis is wrong. The code you show does not compile. ReadBuffer is a procedure. It raises if it does not read the number of bytes requested. The problem is exactly as per my answer. – David Heffernan Aug 18 '15 at 20:47
  • Strewth! Why did I read it as "Stream.Read"?. Bad week to give up chewing gum. Thanks for your response! – Pat Heuvel Aug 18 '15 at 23:43
  • @David I think the problem Clayton is seeing with your code is that you assume that the length field is part of the record. Your code works as I expect when the second ReadBuffer reads recordSize bytes instead of (recordSize - SizeOf(recordSize)) bytes. – Pat Heuvel Aug 19 '15 at 01:14
  • @Pat I've not assumed anything. I've merely taken the code in the Q as a spec. – David Heffernan Aug 19 '15 at 05:25
  • Sorry - no offense intended. – Pat Heuvel Aug 19 '15 at 05:44