`read` and `sysread` have very similar documentation. What are the differences between the two?
ikegami
-
grrr :-) that's a good question – G. Cito Mar 30 '16 at 17:01
-
Credit to @G. Cito for prompting this question. – ikegami Mar 30 '16 at 17:02
1 Answer
About `read`:
- `read` supports PerlIO layers.
- `read` works with any Perl file handle[1].
- `read` buffers.
- `read` obtains data from the system in fixed-size blocks of 8 KiB[2].
- `read` may block if less data than requested is available[3].
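
To illustrate the first two points, here is a minimal sketch (not part of the original answer) that uses `read` on an in-memory handle with a decoding layer. The sample string and variable names are made up for the illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# "\x{E9}" (e with acute accent) takes 2 bytes in UTF-8,
# so this buffer holds 5 bytes that represent 4 characters.
my $bytes = encode('UTF-8', "caf\x{E9}");

# An in-memory handle (no system file descriptor) with a decoding layer.
open(my $fh, '<:encoding(UTF-8)', \$bytes) or die "open failed: $!";

# With a decoding layer in place, LENGTH counts characters, not bytes.
my $n = read($fh, my $chars, 4);
die "read failed: $!" unless defined $n;

printf "read returned %d characters; fileno is %d\n", $n, fileno($fh);
```

On a typical perl this should report 4 characters, with `fileno` returning -1 because no system file descriptor is behind the handle.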
About `sysread`:
- `sysread` doesn't support PerlIO layers (meaning it requires a raw a.k.a. binary handle).
- `sysread` only works with Perl file handles that map to a system file handle/descriptor[4].
- `sysread` doesn't buffer.
- `sysread` performs a single system call.
- `sysread` returns immediately if data is available to be returned, even if the amount of data is less than the amount requested.
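
A rough sketch of two of these points, under the assumption of a Unix-like system where `pipe` is available: a short `sysread` from a pipe, and a `sysread` attempt on an in-memory handle. The variable names are illustrative only.

```perl
use strict;
use warnings;

# A pipe is backed by real file descriptors, so sysread can use it.
pipe(my $reader, my $writer) or die "pipe failed: $!";
defined syswrite($writer, "hello") or die "syswrite failed: $!";

# Request 100 bytes while only 5 are available:
# sysread hands back what is there instead of waiting for more.
my $n = sysread($reader, my $buf, 100);
die "sysread failed: $!" unless defined $n;
print "pipe: got $n bytes\n";

# An in-memory handle has no file descriptor for the system call to use.
my $data = "hello";
open(my $mem, '<', \$data) or die "open failed: $!";
my $m = sysread($mem, my $membuf, 100);
print "in-memory: sysread ",
    (defined $m ? "returned $m" : "failed: $!"), "\n";
```

I would expect the first call to return 5 and the second to fail with `undef`, though the exact error text depends on the platform.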
Summary and conclusions:
- `read` works with any Perl file handle, while `sysread` is limited to Perl file handles mapped to a system file handle/descriptor.
- `read` isn't compatible with `select`[5], while `sysread` is compatible with `select`.
- `read` can perform decoding for you, while `sysread` requires that you do your own decoding.
- `read` should be faster for very small reads, while `sysread` should be faster for very large reads.
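
The `select` point can be sketched as follows. This is an illustration rather than the answer's own code; it assumes a Unix-like system and uses a local pipe in place of a socket. `select` only knows about data waiting at the system level, not data already sitting in Perl's buffer, which is presumably why the unbuffered `sysread` is the natural partner for it.

```perl
use strict;
use warnings;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe failed: $!";
defined syswrite($writer, "first chunk") or die "syswrite failed: $!";
close($writer) or die "close failed: $!";   # lets the reader see EOF

my $sel = IO::Select->new($reader);
my $all = '';

while ($sel->count) {
    # Wait up to 1 second for a handle to become readable.
    for my $fh ($sel->can_read(1)) {
        my $n = sysread($fh, my $chunk, 4096);
        die "sysread failed: $!" unless defined $n;
        if ($n == 0) {              # 0 means end of file
            $sel->remove($fh);
            next;
        }
        $all .= $chunk;
    }
}

print "received: $all\n";
```

Each `sysread` here is issued only after `can_read` reports the handle as ready, so the call should not block.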
Notes:
[1] These include, for example, tied file handles and those created using `open(my $fh, '<', \$var)`.
[2] Before 5.14, Perl read in 4 KiB blocks. Since 5.14, the size of the blocks is configurable when you build `perl`, with a default of 8 KiB.
[3] In my experience, `read` will return exactly the amount requested (if possible) when reading from a plain file, but may return less when reading from a pipe. These results are by no means guaranteed.
[4] `fileno` returns a non-negative number for these. These include, for example, handles that read from plain files, from pipes and from sockets, but not those mentioned in [1].
[5] I'm referring to the 4-argument one called by IO::Select.

ikegami
-
Great summary; it should be in perlfunc. This: "`read` should be faster for small reads, while `sysread` should be faster for large reads." is exactly what is needed. Of course, given the infinite possibilities of the real world, it may not **always** be true, but a mostly truthy perlish guideline is what I want. – G. Cito Mar 30 '16 at 17:10
-
In a response to [another question](http://stackoverflow.com/a/36208336/2019415) I used [`Stream::Reader`](https://metacpan.org/pod/Stream::Reader). As an experiment I replaced `read` with `sysread` in `Reader.pm` and gained 9-10% throughput. It seemed too easy. Besides the obvious bits (buffering, encoding), is it just a question of benchmarking and testing? Can you speak to any data integrity or failover/robustness elements of this? – G. Cito Mar 30 '16 at 17:19
-
@G.Cito In reusable code such as Stream::Reader, you have to assume filehandles may have layers, so `sysread` is not an option. – ysth Mar 30 '16 at 20:10
-
@G. Cito, Talk of "UTF-8 mode" implies it's not just a possibility that they have layers, but that it's a supported mode of operation. That prevents `sysread` from being a valid option. – ikegami Mar 30 '16 at 20:34
-
Also, you can `read` from things that aren't actually files (perhaps you opened a filehandle to a scalarref or `TIEHANDLE`d something), but you can only `sysread` something with a non-negative `fileno()`. – hobbs Mar 30 '16 at 20:51
-
@cuonglm, They're not even similar. I think you mean `print` and `syswrite`. Of the two, I've only ever used `print` because it's easier to use. I don't know if there are any other differences. – ikegami Apr 04 '16 at 15:52