What is the difference between `read` and `sysread`?

Question

read and sysread have very similar documentation. What are the differences between the two?

grrr :-) that's a good question – G. Cito Mar 30 '16 at 17:01
Credit to @G. Cito for prompting this question. – ikegami Mar 30 '16 at 17:02

ikegami · Accepted Answer · 2019-02-23T14:25:53.733

27

About read:

read supports PerlIO layers.
read works with any Perl file handle^[1].
read buffers.
read obtains data from the system in fixed-size blocks of 8 KiB^[2].
read may block if less data than requested is available^[3].
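
To illustrate the first two points, here is a minimal sketch, assuming an in-memory handle with a UTF-8 decoding layer (the sample string and variable names are invented for illustration). With such a layer in place, the length passed to read counts decoded characters rather than bytes.

```perl
use strict;
use warnings;

# UTF-8 encoded bytes for "café au lait"; the e-acute occupies two bytes.
my $bytes = "caf\xC3\xA9 au lait";

# An in-memory handle: there is no system file descriptor behind it,
# yet read works, and the encoding layer decodes on the fly.
open(my $fh, '<:encoding(UTF-8)', \$bytes)
    or die "Can't open in-memory handle: $!";

# Ask for 4 characters (not 4 bytes).
my $got = read($fh, my $buf, 4);
die "read failed: $!" if !defined($got);

printf "read returned %d character(s)\n", $got;
printf "last code point: U+%04X\n", ord(substr($buf, -1));

close($fh);
```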

About sysread:

sysread doesn't support PerlIO layers (meaning it requires a raw, a.k.a. binary, handle).
sysread only works with Perl file handles that map to a system file handle/descriptor^[4].
sysread doesn't buffer.
sysread performs a single system call.
sysread returns immediately if data is available to be returned, even if the amount of data is less than the amount requested.
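
The short-read behaviour can be sketched with a pipe inside a single process. This is only an illustration: `sysread_exactly` is a hypothetical helper, not a core function, and it assumes a blocking handle and does not retry on interrupted system calls.

```perl
use strict;
use warnings;

# Hypothetical helper: call sysread repeatedly until $want bytes
# have arrived or EOF is reached.
sub sysread_exactly {
    my ($fh, $want) = @_;
    my $buf = '';
    while (length($buf) < $want) {
        # The 4th argument (offset) appends to what has been read so far.
        my $rv = sysread($fh, $buf, $want - length($buf), length($buf));
        die "sysread failed: $!" if !defined($rv);
        last if $rv == 0;    # EOF before $want bytes were available
    }
    return $buf;
}

pipe(my $reader, my $writer) or die "pipe failed: $!";
defined(syswrite($writer, "hello")) or die "syswrite failed: $!";

# Only 5 bytes are in the pipe. sysread hands them over instead of
# waiting for the full 1024 that were requested.
my $n = sysread($reader, my $partial, 1024);
die "sysread failed: $!" if !defined($n);
print "asked for 1024 bytes, got $n\n";

defined(syswrite($writer, "abcdefgh")) or die "syswrite failed: $!";
close($writer);

my $exact = sysread_exactly($reader, 8);
print "helper returned '$exact'\n";
```

Code that needs a fixed number of bytes from sysread therefore has to loop, as the helper does, whereas read does that looping internally.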

Summary and conclusions:

read works with any Perl file handle, while sysread is limited to Perl file handles mapped to a system file handle/descriptor.
read isn't compatible with select^[5], while sysread is compatible with select.
read can perform decoding for you, while sysread requires that you do your own decoding.
read should be faster for very small reads, while sysread should be faster for very large reads.
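
The select point deserves a sketch. Because read buffers, it can pull more data from the system than it returns, so select may report nothing to read while data is still sitting in Perl's buffer. Pairing select with sysread avoids that. The example below assumes a Unix-like system (select on pipes is not portable to every platform) and uses IO::Select; the 5-second timeout and 64 KiB chunk size are arbitrary choices.

```perl
use strict;
use warnings;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe failed: $!";
defined(syswrite($writer, "line one\nline two\n")) or die "syswrite failed: $!";
close($writer);

my $sel  = IO::Select->new($reader);
my $data = '';

while ($sel->count) {
    # Wait up to 5 seconds for at least one handle to become readable.
    my @ready = $sel->can_read(5) or last;

    for my $fh (@ready) {
        # One system call per readiness notification; append to $data.
        my $rv = sysread($fh, $data, 64 * 1024, length($data));
        die "sysread failed: $!" if !defined($rv);

        if ($rv == 0) {    # EOF: stop watching this handle
            $sel->remove($fh);
            close($fh);
        }
    }
}

print $data;
```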

Notes:

[1] These include, for example, tied file handles and those created using open(my $fh, '<', \$var).
[2] Before 5.14, Perl read in 4 KiB blocks. Since 5.14, the size of the blocks is configurable when you build perl, with a default of 8 KiB.
[3] In my experience, read will return exactly the amount requested (if possible) when reading from a plain file, but may return less when reading from a pipe. These results are by no means guaranteed.
[4] fileno returns a non-negative number for these. These include, for example, handles that read from plain files, from pipes and from sockets, but not those mentioned in [1].
[5] I'm referring to the 4-argument one called by IO::Select.
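
The fileno test from the note on system file handles can be sketched as follows. `has_system_descriptor` is a hypothetical helper name, and the sketch only covers ordinary (untied) handles.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the handle maps to a system file
# descriptor, i.e. if sysread has something to issue a system call on.
sub has_system_descriptor {
    my ($fh) = @_;
    my $fd = fileno($fh);    # undef for a closed handle, -1 for in-memory
    return (defined($fd) && $fd >= 0) ? 1 : 0;
}

my $text = "in-memory data";
open(my $mem, '<', \$text) or die "Can't open in-memory handle: $!";
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $mem_ok  = has_system_descriptor($mem);
my $pipe_ok = has_system_descriptor($reader);

printf "in-memory handle: sysread usable? %s\n", $mem_ok  ? 'yes' : 'no';
printf "pipe handle:      sysread usable? %s\n", $pipe_ok ? 'yes' : 'no';
```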

edited Feb 23 '19 at 14:25

answered Mar 30 '16 at 16:55

ikegami


Great summary; it should be in perlfunc. This: "`read` should be faster for small reads, while `sysread` should be faster for large reads." is exactly what is needed. Of course, given the infinite possibilities of the real world, it may not **always** be true, but a mostly truthy perlish guideline is what I want. – G. Cito Mar 30 '16 at 17:10
In a response to [another question](http://stackoverflow.com/a/36208336/2019415) I used [`Stream::Reader`](https://metacpan.org/pod/Stream::Reader). As an experiment I replaced `read` with `sysread` in `Reader.pm` and gained 9-10% throughput, which seemed too easy. Besides the obvious bits (buffering, encoding), is it just a question of benchmarking and testing? Can you speak to any data integrity or failover/robustness aspects of this? – G. Cito Mar 30 '16 at 17:19
@G.Cito in reusable code such as Stream::Reader, you have to assume filehandles may have layers, so sysread is not an option. – ysth Mar 30 '16 at 20:10
@G. Cito, Talk of "UTF-8 mode" implies it's not just a possibility that they have layers, but that it's a supported mode of operation. That prevents `sysread` from being a valid option. – ikegami Mar 30 '16 at 20:34
Also, you can `read` from things that aren't actually files (perhaps you opened a filehandle to a scalarref or `TIEHANDLE`d something), but you can only `sysread` something with a non-negative `fileno()`. – hobbs Mar 30 '16 at 20:51
@ikegami: How about `write` and `syswrite`? – cuonglm Apr 01 '16 at 07:25
@cuonglm, They're not even similar. I think you mean `print` and `syswrite`. Of the two, I've only ever used `print` because it's easier to use. I don't know if there are any other differences. – ikegami Apr 04 '16 at 15:52
