3

How do I read only certain lines of a file using TFileStream? The files I read have millions of lines, so I want to load into memory only the lines I will actually use.

Example:

Line 1: 00 00 00 00 00 00 00 00
Line 2: 00 00 00 00 00 00 00 00
Line 3: 00 00 00 00 00 00 00 00
Line 4: 00 00 00 00 00 00 00 00
Line 5: 00 00 00 00 00 00 00 00

I want to read only lines 2 to 4.

I tried the TextFile routines, but they seem slow. So far I have only found a function that reads the last line with TFileStream.

tenorsax
André
  • 2
    possible duplicate of [read streams line by line](http://stackoverflow.com/questions/6942704/read-streams-line-by-line) – ain Jul 22 '12 at 08:46

3 Answers

12

You can open a file for reading with the TFileStream class like so ...

FileStream := TFileStream.Create( 'MyBigTextFile.txt', fmOpenRead);

TFileStream is not a reference-counted object, so be sure to release it when you are done, like so ...

FileStream.Free;

From here on, I will assume that your file's character encoding is UTF-8 and that lines end Windows-style (CR LF). If not, please adjust accordingly, or update your question.

You can read a single code unit of a UTF-8 character (not the same thing as reading a single character) like so:

var ch: ansichar;
FileStream.ReadBuffer( ch, 1);

You can read a line of text like so ...

function ReadLine( Stream: TStream; var Line: string): boolean;
var
  RawLine: UTF8String;
  ch: AnsiChar;
begin
  result := False;
  RawLine := '';
  ch := #0;
  // Collect bytes up to (not including) the next CR or the end of the stream.
  while (Stream.Read( ch, 1) = 1) and (ch <> #13) do
  begin
    result := True;
    RawLine := RawLine + ch;
  end;
  Line := RawLine;
  if ch = #13 then
  begin
    result := True; // an empty line still counts as a line
    // Swallow the LF that follows the CR, if there is one.
    if (Stream.Read( ch, 1) = 1) and (ch <> #10) then
      Stream.Seek(-1, soCurrent); // unread it if not LF character.
  end;
end;

To read lines 2, 3 and 4, assuming the stream position is at 0, read line 1 first and ignore it ...

ReadLine( Stream, Line1); // line 1 is read only to move past it
ReadLine( Stream, Line2);
ReadLine( Stream, Line3);
ReadLine( Stream, Line4);
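
For an arbitrary range of lines, the same idea can be wrapped in a loop. The following is only a sketch: ReadLineRange and DemoReadLines2To4 are hypothetical names, not part of the RTL, and the ReadLine shown is a compact variant of the routine above, repeated so that the fragment compiles on its own. It still reads one byte at a time, so it shows the control flow, not a fast implementation.

```pascal
uses
  Classes, SysUtils;

// Compact variant of the ReadLine routine above, repeated here so that the
// sketch compiles on its own. Lines are assumed to end in CR or CR LF.
function ReadLine(Stream: TStream; var Line: string): Boolean;
var
  RawLine: UTF8String;
  ch: AnsiChar;
begin
  Result := False;
  RawLine := '';
  ch := #0;
  while (Stream.Read(ch, 1) = 1) and (ch <> #13) do
  begin
    Result := True;
    SetLength(RawLine, Length(RawLine) + 1);
    RawLine[Length(RawLine)] := ch;   // append the raw byte
  end;
  Line := string(RawLine);
  if ch = #13 then
  begin
    Result := True;
    if (Stream.Read(ch, 1) = 1) and (ch <> #10) then
      Stream.Seek(-1, soCurrent);     // unread it if it is not LF
  end;
end;

// Hypothetical helper: appends lines FirstLine..LastLine (1-based, inclusive)
// to Dest. Earlier lines are read and thrown away; later lines are never read.
procedure ReadLineRange(Stream: TStream; FirstLine, LastLine: Integer;
  Dest: TStrings);
var
  LineNo: Integer;
  Line: string;
begin
  LineNo := 0;
  while (LineNo < LastLine) and ReadLine(Stream, Line) do
  begin
    Inc(LineNo);
    if LineNo >= FirstLine then
      Dest.Add(Line);
  end;
end;

// Hypothetical demo on an in-memory stream holding five short lines.
function DemoReadLines2To4: string;
var
  Stream: TMemoryStream;
  Lines: TStringList;
  Data: AnsiString;
begin
  Data := 'one'#13#10'two'#13#10'three'#13#10'four'#13#10'five'#13#10;
  Stream := TMemoryStream.Create;
  Lines := TStringList.Create;
  try
    Stream.WriteBuffer(Data[1], Length(Data));
    Stream.Position := 0;
    ReadLineRange(Stream, 2, 4, Lines);
    Result := Lines.CommaText;
  finally
    Lines.Free;
    Stream.Free;
  end;
end;
```

With a real file, the TMemoryStream in the demo would be replaced by the TFileStream opened earlier.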
David Heffernan
Sean B. Durkin
  • 8
    That's going to be painfully slow for a large file. Calling ReadFile 1 byte at a time hurts. – David Heffernan Jul 22 '12 at 08:42
  • 1
    Actually yes. The O/S buffers, but there is a large overhead in calling ReadFile. One of my answers here covers that in detail. – David Heffernan Jul 22 '12 at 08:49
  • Whether it is efficient or not depends a lot on context and your expectations. It would be easy for the OP to adjust and read in chunks. But the above solution is all the OP needs to get the basic idea. There are so many countless variations to improve efficiency. But these are all context sensitive and this kind of optimization is best left to the OP. – Sean B. Durkin Jul 22 '12 at 08:49
  • Well, optimization will be needed if a lot of lines are to be read. The answer I refer to is here: http://stackoverflow.com/questions/5639531/buffered-files-for-faster-disk-access/5639712#5639712 – David Heffernan Jul 22 '12 at 08:55
  • Thanks for the reply. In Delphi 7, the code Stream.Seek(soCurrent, -1) gives the error "There is no overloaded version of 'Seek' that can be called with these arguments". The format is UTF-8. – André Jul 22 '12 at 15:19
  • just reverse the parameters and you will be fine: `Stream.Seek(-1, soCurrent)`. You really ought to be able to work that stuff out for yourself. Don't be afraid of reading the documentation. That's how I worked it out. – David Heffernan Jul 22 '12 at 15:49
  • Andre, your example suggests that your lines have a fixed size. If that is the case you could calculate line offsets in the stream. That would be really fast. And if the number of lines you need to read is small in relation to the filesize, don't read the entire file, but use BlockRead() etc to only read those parts from disk. – Jan Doggen Jul 23 '12 at 07:45
  • Thank you! I am studying how to implement the suggested approaches. All the lines to be read have a fixed size. – André Jul 23 '12 at 12:27
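
The chunked reading discussed in these comments could look roughly like the sketch below. It is not taken from the linked answer. FindLineStart, DemoFindLineStart and the 64 KB ChunkSize are assumptions made for illustration: the function scans the stream in large blocks, counts LF bytes, and returns the byte offset at which a given line starts, so that only a few large Read calls are needed to skip the unwanted lines.

```pascal
uses
  Classes, SysUtils;

const
  ChunkSize = 64 * 1024; // assumed buffer size; tune it for your workload

// Hypothetical helper: returns the byte offset at which the 1-based line
// LineNo starts, or -1 if the stream has no such line. Lines are assumed
// to end in LF, which also covers CR LF.
function FindLineStart(Stream: TStream; LineNo: Integer): Int64;
var
  Buffer: array[0..ChunkSize - 1] of AnsiChar;
  BytesRead, i, Current: Integer;
  ChunkStart: Int64;
begin
  Result := -1;
  if LineNo < 1 then
    Exit;
  Stream.Position := 0;
  ChunkStart := 0;          // stream offset of Buffer[0]
  Current := 1;             // number of the line being scanned
  if LineNo = 1 then
    Result := 0;
  while Result < 0 do
  begin
    BytesRead := Stream.Read(Buffer, ChunkSize);
    if BytesRead = 0 then
      Break;                // end of stream reached
    for i := 0 to BytesRead - 1 do
      if Buffer[i] = #10 then
      begin
        Inc(Current);
        if Current = LineNo then
        begin
          Result := ChunkStart + i + 1; // first byte after the LF
          Break;
        end;
      end;
    Inc(ChunkStart, BytesRead);
  end;
  // An offset at or past the end means the requested line does not exist.
  if Result >= Stream.Size then
    Result := -1;
end;

// Hypothetical demo: three CR LF terminated lines in a memory stream.
function DemoFindLineStart(LineNo: Integer): Int64;
var
  Stream: TMemoryStream;
  Data: AnsiString;
begin
  Data := 'one'#13#10'two'#13#10'three'#13#10;
  Stream := TMemoryStream.Create;
  try
    Stream.WriteBuffer(Data[1], Length(Data));
    Result := FindLineStart(Stream, LineNo);
  finally
    Stream.Free;
  end;
end;
```

Once the offset is known, setting Stream.Position to it lets a line-reading routine take over from there.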
2

You can use traditional file operations. For this to be really fast, every line must have the same number of bytes.

BlockRead, BlockWrite and Seek are the routines to look at.

Sample page for BlockRead

Sample page for Seek
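
As a rough illustration of the fixed-size case, the sketch below computes the offset of a line from its number and reads just that line with Seek and BlockRead. ReadFixedLine, DemoReadFixedLine and the scratch file name are hypothetical, and LineSize is assumed to be the full size of a line in bytes, including its CR LF terminator.

```pascal
uses
  SysUtils;

// Hypothetical helper: returns the 1-based line LineNo of a file in which
// every line occupies exactly LineSize bytes (terminator included). The
// trailing CR/LF is stripped. Returns '' if the line lies past the end.
// Offsets use Integer arithmetic, so this assumes files below 2 GB.
function ReadFixedLine(const FileName: string;
  LineNo, LineSize: Integer): AnsiString;
var
  F: file;
  BytesRead: Integer;
  OldFileMode: Byte;
begin
  Result := '';
  OldFileMode := FileMode;
  FileMode := fmOpenRead;             // Reset should open read-only
  AssignFile(F, FileName);
  Reset(F, 1);                        // record size 1: positions are bytes
  try
    if (LineNo >= 1) and ((LineNo - 1) * LineSize < FileSize(F)) then
    begin
      Seek(F, (LineNo - 1) * LineSize);
      SetLength(Result, LineSize);
      BlockRead(F, Result[1], LineSize, BytesRead);
      SetLength(Result, BytesRead);   // the last line may be shorter
      while (Length(Result) > 0) and
            (Result[Length(Result)] in [#10, #13]) do
        SetLength(Result, Length(Result) - 1);
    end;
  finally
    CloseFile(F);
    FileMode := OldFileMode;
  end;
end;

// Hypothetical demo: writes three 6-byte lines to a scratch file in the
// current directory, reads line 2 back, then deletes the file.
function DemoReadFixedLine: AnsiString;
const
  DemoFile = 'fixed_lines_demo.txt';
var
  F: file;
  Data: AnsiString;
begin
  Data := 'AAAA'#13#10'BBBB'#13#10'CCCC'#13#10;
  AssignFile(F, DemoFile);
  Rewrite(F, 1);
  BlockWrite(F, Data[1], Length(Data));
  CloseFile(F);
  Result := ReadFixedLine(DemoFile, 2, 6);
  DeleteFile(DemoFile);
end;
```

Because the offset is computed and not searched for, the cost of reading a line does not depend on how far into the file it is.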

Ali Avcı
  • +1 to counteract the downvote. Given Andre's responses to my comment "Are your lines maybe fixed size?" the BlockRead()/Seek() option is viable. – Jan Doggen Jul 23 '12 at 14:12
0

The code Sean proposes is slow because of TFileStream.Read, as David explained. But if you use a TMemoryStream instead of a TFileStream, the slow Stream.Read matters much less. In that case the string operations take most of the time.

With a slight change to the code, it runs roughly twice as fast:

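function ReadLine(Stream: TStream; var Line: string): boolean;
var
  ch: AnsiChar;
  RawLine: UTF8String;
  StartPos: Int64;
  LineLen: integer;
begin
  StartPos := Stream.Position;
  ch := #0;
  // Scan ahead to the next CR (or to the end of the stream).
  while (Stream.Read( ch, 1) = 1) and (ch <> #13) do;
  LineLen := integer(Stream.Position - StartPos);
  result := LineLen > 0; // something was read, even if only a CR
  if ch = #13 then
    Dec(LineLen); // do not copy the CR into the line
  // Go back and fetch the whole line with a single read.
  Stream.Position := StartPos;
  SetLength(RawLine, LineLen);
  if LineLen > 0 then
    Stream.ReadBuffer(RawLine[1], LineLen);
  Line := RawLine;
  if ch = #13 then
    begin
    Stream.Seek(1, soCurrent); // step over the CR
    if (Stream.Read( ch, 1) = 1) and (ch <> #10) then
      Stream.Seek(-1, soCurrent); // unread it if not LF character.
    end;
end;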
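
If the data is already in a TMemoryStream, the per-byte Stream.Read calls can be avoided altogether by scanning the stream's buffer through its Memory pointer. This is an alternative sketch, not the code above: ReadLineFromMemory and DemoSecondLine are hypothetical names, the line is returned as undecoded UTF-8 bytes, and, unlike the routines above, a bare LF is also accepted as a line end. The Integer casts assume a stream smaller than 2 GB.

```pascal
uses
  Classes, SysUtils;

// Hypothetical helper: reads the next line starting at Stream.Position by
// scanning the in-memory buffer directly. Accepts CR LF, CR or LF as the
// line terminator. Returns False when the stream is already at its end.
function ReadLineFromMemory(Stream: TCustomMemoryStream;
  var Line: UTF8String): Boolean;
var
  Buf: PAnsiChar;
  Start, Stop, Size: Integer;
begin
  Buf := PAnsiChar(Stream.Memory);
  Start := Integer(Stream.Position);
  Size := Integer(Stream.Size);
  Result := Start < Size;
  Stop := Start;
  // Find the end of the line without touching the stream position.
  while (Stop < Size) and (Buf[Stop] <> #13) and (Buf[Stop] <> #10) do
    Inc(Stop);
  // Copy the line bytes in one operation.
  SetString(Line, Buf + Start, Stop - Start);
  // Step over the terminator, whichever form it takes.
  if (Stop < Size) and (Buf[Stop] = #13) then
    Inc(Stop);
  if (Stop < Size) and (Buf[Stop] = #10) then
    Inc(Stop);
  Stream.Position := Stop;
end;

// Hypothetical demo: returns the second of three lines.
function DemoSecondLine: UTF8String;
var
  Stream: TMemoryStream;
  Data: AnsiString;
begin
  Data := 'first'#13#10'second'#13#10'third';
  Stream := TMemoryStream.Create;
  try
    Stream.WriteBuffer(Data[1], Length(Data));
    Stream.Position := 0;
    ReadLineFromMemory(Stream, Result); // line 1, overwritten below
    ReadLineFromMemory(Stream, Result); // line 2
  finally
    Stream.Free;
  end;
end;
```

In real use the stream would be filled with Stream.LoadFromFile, which loads the whole file into memory, so this only suits files that comfortably fit there.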
nfc1