I'm writing a program that generates a file, and I'm wondering what the best practices are for streams, especially with regard to size. I can imagine that a stream that grows too large could cause slowdowns or other performance issues.
I have the following code, which may be called many times; the collection can also be huge. I presume one should handle different sizes differently, e.g. < 1 MB, 10 MB, 100 MB, 1-10 GB, > 10 GB:
    writeIntoStream: anInputStringCollection
        aWriteStream := WriteStream on: '' asUnicode16String.
        anInputStringCollection do: [ :string |
            aWriteStream nextPutAllUnicode: string asUnicode16String.
        ].
        ^ aWriteStream
What are the best practices? For example, should one care whether the data fits on the heap or the stack?
For now I've concluded that if I use a maximum of 5 kB for a stream (or collection), it is fast enough and it works (for Smalltalk/X).
I would like to know the limits and the internals for different Smalltalk flavours. (I have not run any tests and could not find any articles about this.)
Edit: First, thank you everyone (@LeandroCaniglia, @JayK, @aka.nice). In the very first version, the slowdowns were caused by far too many operations (open, write, close), because the file was written line by line:
    write: newString to: aFile
        "Writes keyName, keyValue to a file"
        "/ aFile is UTF16-LE (Little Endian) Without Signature (BOM)
        aFile appendingFileDo: [ :stream |
            stream nextPutAllUtf16Bytes: newString MSB: false
        ]
The second version was much faster but still not right. It used an intermediary stream whose contents were written to the file in chunks:
    write: aWriteStream to: aFile
        "Writes everything written to the stream"
        "/ aFile is UTF16-LE Without Signature
        aFile appendingFileDo: [ :stream | "/ withoutTrailingSeparators must be there as Stream puts spaces at the end
            stream nextPutAllUtf16Bytes: (aWriteStream contents withoutTrailingSeparators) MSB: false
        ]
The third version came after Leandro's answer and your advice. (I looked at the buffer: its size is defined as __stringSize(aCollection), and when the available buffer/memory is exhausted, its contents are written to the file.) I have removed #write:to: altogether, and the stream is now defined as:

    anAppendFileStream := aFile appendingWriteStream.

Every method that takes part in writing to the stream now uses:

    anAppendFileStream nextPutUtf16Bytes: aCharacter MSB: false.

or

    anAppendFileStream nextPutAllUtf16Bytes: string MSB: false
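
As a minimal sketch of how the third version might be applied to the collection from the original question, the method below opens the append stream once, writes every string, and closes the stream at the end. The selector #writeAll:to: is hypothetical, and it is an assumption that the stream answered by #appendingWriteStream should be closed explicitly with #close; only #appendingWriteStream and #nextPutAllUtf16Bytes:MSB: are taken from the code above.

```smalltalk
writeAll: anInputStringCollection to: aFile
    "Hypothetical helper, not part of the original code: appends every string
     to aFile as UTF-16LE without a BOM, opening the file only once."

    | anAppendFileStream |
    anAppendFileStream := aFile appendingWriteStream.
    [
        anInputStringCollection do: [ :string |
            anAppendFileStream nextPutAllUtf16Bytes: string MSB: false
        ]
    ] ensure: [
        "Assumption: the stream must be closed explicitly; ensure: runs this even on error"
        anAppendFileStream close
    ]
```

With this shape no intermediary in-memory stream is needed, so how large the collection is matters less than how the file stream's own buffer is flushed.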
As for the buffer size itself:
There is buffer-size logic in which the buffer length is estimated, e.g. in #nextPutAll: the length is computed as

    bufLen = (sepLen == 1) ? len : (len + ((len/4) + 1) * sepLen);

where sepLen is determined by the size of the separator (EOF, cr, crlf).
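
To make the estimate concrete, here is the same expression transcribed into Smalltalk and evaluated for one case. The values of len and sepLen are illustrative only, and it is an assumption that len/4 in the original is an integer division (hence // below).

```smalltalk
"Worked example of the buffer-length estimate; len and sepLen are made-up inputs."
| len sepLen bufLen |
len := 100.     "number of characters to write"
sepLen := 2.    "e.g. a two-character crlf separator"
bufLen := sepLen = 1
    ifTrue: [ len ]
    ifFalse: [ len + (((len // 4) + 1) * sepLen) ].
Transcript showCR: bufLen printString.    "100 + (25 + 1) * 2 = 152"
```

So for a separator longer than one character, the estimate adds roughly half of len again in this case, while for sepLen = 1 the buffer is exactly len.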
There can be different buffer sizes for different methods, e.g. in #copyToEndFrom: the size is bufferSize := 1 * 1024 on Windows, or bufferSize := 8 * 1024 on *nix [kB].