I read files into memory and then decode them. At first I created my own class based on TStream and implemented methods to read a Byte, a Cardinal, a Word, etc. from a stream:

function getWord: Word;
function getCardinal: Cardinal;
function getFloat: Real;
function getNibble: Byte;

It makes it easier to write code that decodes data in files.

But it's painfully slow compared to reading files into an array of Byte and then operating on the bytes inline.

I use Delphi 7. There are no inline functions, and calling functions is quite slow. So I am wondering how I can make my code easy to understand and write without functions and methods (so that it stays fast).

My only idea is something like this:

var
  Bytes: array[0..40960-1] of Byte;
  Words: array[0..20480-1] of Word absolute Bytes;
  Cardinals: array[0..10240-1] of Cardinal absolute Bytes;

Is there a way to quickly (and elegantly) read various data types from files?

Tom
  • Have you seen [this thread](https://stackoverflow.com/a/5639712/960757)? – TLama Jan 02 '20 at 21:27
  • @TLama That's a good read, thank you, but it's not really an answer to my problem. – Tom Jan 02 '20 at 21:40
  • Yes it is. A buffered stream is the way to improve the performance here. That linked topic is how I solve this problem in production code. – David Heffernan Jan 02 '20 at 21:51
  • @Tom: Why not? It allows you to read arbitrary amounts of data (e.g. a byte, an integer, a word, a byte buffer, etc.) from a file, with buffered access to make it both quick and elegant. – Ken White Jan 02 '20 at 21:51
  • If you are thinking about random access and not continuous reading, consider using a memory-mapped file. But from your question it is not clear what you're going to do with that file (read it continuously or access its data at any position?). – TLama Jan 02 '20 at 21:56
  • @TLama Because it's about 5 times slower than working with an array: https://pastebin.com/vAbGyp0n – Tom Jan 02 '20 at 22:09
  • You should not read such small chunks. Read bigger ones to make it efficient. That's what the thread is about. Yet you didn't say whether you're going to read the file randomly or continuously. Which one is it? Continuous? – TLama Jan 02 '20 at 22:13
  • @TLama No, he's using a buffered stream which reads in chunks. – David Heffernan Jan 02 '20 at 22:15
  • @TLama Continuous only. – Tom Jan 02 '20 at 22:15
  • @Tom If your real problem is to read a 500MB file and interpret it as a series of 3-byte integers that you need to sum, then write bespoke code to do that. But I think that when you work with the actual problem, and the file isn't in cache as yours will be, then you will find that your timings aren't right. – David Heffernan Jan 02 '20 at 22:17
  • @David, true, sorry. I've been confused by the overloaded constructor... – TLama Jan 02 '20 at 22:20
  • @DavidHeffernan You might be right. Thank you for your class. I think I'll switch to it. – Tom Jan 02 '20 at 23:01
  • Note that your proposed solution does not work for arbitrary reading; you need to watch alignment. E.g. you can't read 3 bytes and then a dword. – Sertac Akyuz Jan 02 '20 at 23:13

2 Answers

If I have understood correctly you want to read the whole file into memory and access it to make things faster rather than seeking in the file.

Use a TMemoryStream to handle the data in memory. Use TMemoryStream.LoadFromFile to load the whole file into memory. Then call the TMemoryStream.Read method to read the data. Alternatively, you can access the data directly through the pointer in the TMemoryStream.Memory property.

Each call to TMemoryStream.Read advances the stream's position by the number of bytes read, until you reach the end of the stream.
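As a rough illustration of the above, here is a minimal sketch, assuming Delphi 7 on x86 (little-endian). The file name demo.bin and the six sample bytes are hypothetical and exist only to make the example self-contained; in real code you would call LoadFromFile on your own file. It also assumes PByte is available without extra units, as in the other answer.

```pascal
program MemStreamSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  DemoFile = 'demo.bin'; // hypothetical file name, for illustration only
  // Hypothetical sample data: a Word ($1234) followed by a Cardinal ($12345678)
  Sample: array[0..5] of Byte = ($34, $12, $78, $56, $34, $12);

var
  Stream: TMemoryStream;
  W: Word;
  C: Cardinal;
  P: PByte;
begin
  Stream := TMemoryStream.Create;
  try
    // Create a small sample file so the sketch can run on its own
    Stream.WriteBuffer(Sample, SizeOf(Sample));
    Stream.SaveToFile(DemoFile);
    Stream.Clear;

    // Load the whole file into memory and start at the beginning
    Stream.LoadFromFile(DemoFile);
    Stream.Position := 0;

    // Read returns the number of bytes copied and advances Position
    if Stream.Read(W, SizeOf(W)) <> SizeOf(W) then
      raise Exception.Create('Unexpected end of data');
    if Stream.Read(C, SizeOf(C)) <> SizeOf(C) then
      raise Exception.Create('Unexpected end of data');
    Writeln('Word = $', IntToHex(W, 4), ', Cardinal = $', IntToHex(C, 8));

    // Direct access: Memory points at the first byte of the loaded data
    P := Stream.Memory;
    Writeln('First byte = $', IntToHex(P^, 2));
  finally
    Stream.Free;
  end;
end.
```

Checking the return value of Read is one way to detect a truncated file; ReadBuffer, which raises an exception on a short read, would be an alternative.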

JerryTheGreek

Here comes the best part of a native language: you can address memory directly with pointers, like this:

  0. Get a pointer to the data you read from the file

    var
      pData: PByte;
      byteData: Byte;
      pIntData: PInteger;
      intData: Integer;
    ...
    pData := @myArray[0];

  1. Read the data you want

    pIntData := PInteger(pData); // typed pointers need an explicit cast
    intData := pIntData^;
    // do what you want with intData
    byteData := pData^;
    // do what you want with byteData

  2. Move the pointer by the size of the data you read

    Inc(pData, SizeOf(Integer)); // if you read a byte, use SizeOf(Byte)

  3. Keep reading and moving the pointer until you reach the end of the stream/array/size
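The steps above could be put together roughly as follows. This is only a sketch, assuming Delphi 7 on x86, where unaligned reads through a pointer are permitted. The sample buffer, the record layout (one Byte, one Word, one Cardinal) and the variable names are hypothetical; PWord and PCardinal are assumed to be available the same way PByte and PInteger are.

```pascal
program PointerWalkSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Hypothetical sample: Byte $07, Word $1234, Cardinal $12345678 (little-endian)
  Data: array[0..6] of Byte = ($07, $34, $12, $78, $56, $34, $12);

var
  P: PByte;
  B: Byte;
  W: Word;
  C: Cardinal;
begin
  // Step 0: point at the first byte of the buffer
  P := @Data[0];

  // Refuse to read past the end of the buffer
  if SizeOf(Data) < SizeOf(B) + SizeOf(W) + SizeOf(C) then
    raise Exception.Create('Buffer too short');

  // Steps 1 and 2, repeated: read a value, then advance by its size
  B := P^;
  Inc(P, SizeOf(B));

  W := PWord(P)^;      // reads 2 bytes from an odd address
  Inc(P, SizeOf(W));

  C := PCardinal(P)^;  // reads 4 bytes
  Inc(P, SizeOf(C));

  Writeln('Byte = $', IntToHex(B, 2),
          ', Word = $', IntToHex(W, 4),
          ', Cardinal = $', IntToHex(C, 8));
  Writeln('Bytes consumed: ', Integer(P) - Integer(@Data[0]));
end.
```

Because P is a PByte, Inc(P, N) moves it by exactly N bytes, so the same pointer can walk over fields of different sizes.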

Check this out: The magic of pointers part of the book

And the official Delphi help: Pointer operators in Delphi

A personal suggestion from my experience: I would use my own class based on the WinAPI: Windows ReadFile API. You can't access many things behind the Delphi FileStream, which I hated. There was also a bug in FileStream when you set the position of the stream.

Cheers

IAdam
  • Just a small explanation: when you use the '^' operator, the number of bytes read from the base address depends on the type the pointer points to. So PByte^ will read 1 byte, PWord^ will read 2 bytes, PInteger^ will read 4 bytes, etc. – IAdam Jan 02 '20 at 21:40
  • It is A solution. Maybe not the best, but not the worst. Thank you! I'll accept your answer if nothing better comes. – Tom Jan 02 '20 at 21:43