What is the best way to slurp a file into a string in Perl?

Question

Yes, There's More Than One Way To Do It, but there must be a canonical, most efficient, or most concise way. I'll add answers I know of and see what percolates to the top.

To be clear, the question is how best to read the contents of a file into a string. One solution per answer.

Leon Timmermans · Accepted Answer · 2016-06-13T12:16:48.687

76

How about this:

use File::Slurp;
my $text = read_file($filename);

ETA: note Bug #83126 for File-Slurp: Security hole with encoding(UTF-8). I now recommend using File::Slurper (disclaimer: I wrote it), also because it has better defaults around encodings:

use File::Slurper 'read_text';
my $text = read_text($filename);

or Path::Tiny:

use Path::Tiny;
path($filename)->slurp_utf8;

edited Jun 13 '16 at 12:16

answered Oct 15 '08 at 21:59

Leon Timmermans


What if you don't want this to die if the file doesn't exist? – dreeves Jul 01 '09 at 22:09
The easiest way to make that unlikely is simply to check first whether the file exists... – Leon Timmermans Jul 13 '09 at 22:08
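For the missing-file case raised above, one option that avoids the gap between checking and opening is to let open fail quietly and test the result. A minimal sketch using only core Perl; try_slurp is a hypothetical helper name, not part of File::Slurp or any other module:

```perl
use strict;
use warnings;

# try_slurp is a hypothetical helper: it returns the file contents,
# or undef if the file cannot be opened (the reason is left in $!).
sub try_slurp {
    my ($filename) = @_;
    open my $fh, '<', $filename or return;   # no die: caller checks for undef
    local $/;                                 # slurp mode for this scope only
    return scalar <$fh>;                      # $fh is closed when it goes out of scope
}
```

A caller would then write something like `my $text = try_slurp($filename) // '';` (Perl 5.10 or later) to fall back to an empty string.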
This does have the disadvantage that it is not included in out-of-the-box Perl, at least not in my ActiveState Perl for Windows (v5.10.0). – Kip Apr 14 '10 at 15:12
Note that File::Slurp has recently been discovered to be a huge security problem: https://rt.cpan.org/Ticket/Display.html?id=83126 – brian d foy Feb 19 '14 at 06:41
Hi, I got `Undefined subroutine &main::read_text`. It should be `use File::Slurper 'read_text';`. https://metacpan.org/pod/File::Slurper – stenlytw Jun 12 '16 at 12:13
File::Slurp isn't a portable solution. On Windows, it will barf on all non-ANSI filenames. I haven't tested Path::Tiny, but given Perl's general level of contempt for Windows, I'll bet that it's the same. It may be possible to open a file with Win32::LongPath and pass the filehandle instead of the filepath. (That's something that Perl really should support in all file I/O.) – Freon Sandoz Feb 02 '22 at 02:28

score 48 · Answer 2 · edited Feb 26 '17 at 17:05

I like doing this with a do block in which I localize @ARGV so I can use the diamond operator to do the file magic for me.

 my $contents = do { local(@ARGV, $/) = $file; <> };

If you need this to be a bit more robust, you can easily turn this into a subroutine.
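As a rough sketch of what such a subroutine might look like: slurp_file is a hypothetical name, and the readability check is only an assumption about what "more robust" could mean here.

```perl
use strict;
use warnings;

# slurp_file is a hypothetical helper name, not part of any module.
sub slurp_file {
    my ($file) = @_;

    # Fail early with a clear message instead of the warning <> would print.
    # "_" reuses the stat information gathered by the -f test.
    die "Not a readable file: $file\n" unless -f $file && -r _;

    # Localizing @ARGV and $/ confines the diamond-operator magic to this block;
    # $/ becomes undef, so <> returns the whole file in one read.
    my $contents = do { local (@ARGV, $/) = $file; <> };
    return $contents;
}
```

Because the sub localizes @ARGV, the caller's own command-line arguments are restored when it returns.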

For something really robust that handles all sorts of special cases, I used to point people to File::Slurp; even if you aren't going to use it, take a look at the source to see all the wacky situations it has to handle. That said, File::Slurp has a big security problem that doesn't look to have a solution. Part of this is its failure to properly handle encodings. Even my quick answer has that problem. If you need to handle the encoding (maybe because you don't make everything UTF-8 by default), this expands to:

my $contents = do {
    open my $fh, '<:encoding(UTF-8)', $file or die '...';
    local $/;
    <$fh>;
    };

If you don't need to change the file, you might be able to use File::Map.

edited Feb 26 '17 at 17:05

answered Oct 15 '08 at 22:30

brian d foy


I'm lazy and write `my $contents = do {local (@ARGV,$/) = $file; <>};`, which is the exact same thing in less characters :) – ephemient Oct 16 '08 at 19:27
I'm wondering why local @ARGV = $file; <> would be any different than <$file>. – Powerlord Nov 21 '08 at 14:08
@Bemrose: because $file is not a filehandle. – brian d foy Nov 22 '08 at 11:12
I got shot in the foot adding this method to a file that further down was already using `<>`, expecting it to read from `STDIN`. The behaviour of `<>` differs from the first call to subsequent calls, and since I changed the first call, I altered the behaviour of the existing call too (which was expecting the `<STDIN>` behaviour of `<>`). – Adam Millerchip Nov 06 '15 at 03:44

score 35 · Answer 3 · answered Oct 18 '08 at 08:35

In writing File::Slurp (which is the best way), Uri Guttman did a lot of research into the many ways of slurping and which is most efficient. He wrote down his findings here and incorporated them into File::Slurp.

answered Oct 18 '08 at 08:35

Schwern


Note that File::Slurp has recently been discovered to be a huge security problem: https://rt.cpan.org/Ticket/Display.html?id=83126 – brian d foy Feb 19 '14 at 06:41

score 24 · Answer 4 · edited Mar 30 '10 at 16:35

open(my $f, '<', $filename) or die "OPENING $filename: $!\n";
$string = do { local($/); <$f> };
close($f);

edited Mar 30 '10 at 16:35

Robert P


answered Oct 15 '08 at 21:59

dreeves


score 11 · Answer 5 · answered Oct 15 '08 at 22:38

Things to think about (especially when compared with other solutions):

Lexical filehandles
Reduce scope
Reduce magic

So I get:

my $contents = do {
  local $/;
  open my $fh, '<', $filename or die "Can't open $filename: $!";
  <$fh>
};

I'm not a big fan of magic <> except when actually using magic <>. Instead of faking it out, why not just use the open call directly? It's not much more work, and is explicit. (True magic <>, especially when handling "-", is far more work to perfectly emulate, but we aren't using it here anyway.)

And in case it's not obvious to those following along at home, at the end of the curly block, $fh goes out of scope and the file handle is closed automatically. — dland, Oct 16 '08 at 15:43

dwarring · Answer 6 · 2018-06-18T03:37:10.483

mmap (Memory mapping) of strings may be useful when you:

Have very large strings that you don't want to load into memory
Want a blindingly fast initialisation (you get gradual I/O on access)
Have random or lazy access to the string.
May want to update the string, but are only extending it or replacing characters:

#!/usr/bin/perl
use warnings; use strict;

use IO::File;
use Sys::Mmap;

sub sip {

    my $file_name = shift;
    my $fh;

    open ($fh, '+<', $file_name)
        or die "Unable to open $file_name: $!";

    my $str;

    mmap($str, 0, PROT_READ|PROT_WRITE, MAP_SHARED, $fh)
      or die "mmap failed: $!";

    return $str;
}

my $str = sip('/tmp/words');

print substr($str, 100,20);

Update: May 2012

The following should be pretty much equivalent, after replacing Sys::Mmap with File::Map:

#!/usr/bin/perl
use warnings; use strict;

use File::Map qw{map_file};

map_file(my $str => '/tmp/words', '+<');

print substr($str, 100, 20);

Actually, File::Map (disclaimer: written by me) would be a better choice nowadays. It's far more portable (it works on both Unix and Windows), but also easier to use («map_file my $str, $file_name;»). — Leon Timmermans, May 03 '12 at 09:53

score 8 · Answer 7 · answered Oct 16 '08 at 06:44

This is neither fast, nor platform independent, and really evil, but it's short (and I've seen this in Larry Wall's code ;-):

 my $contents = `cat $file`;

Kids, don't do that at home ;-).

answered Oct 16 '08 at 06:44

moritz


score 8 · Answer 8 · answered Dec 08 '08 at 03:30

use Path::Class;
file('/some/path')->slurp;

answered Dec 08 '08 at 03:30

score 7 · Answer 9 · answered Oct 15 '08 at 21:59

{
  open F, $filename or die "Can't read $filename: $!";
  local $/;  # enable slurp mode, locally.
  $file = <F>;
  close F;
}

answered Oct 15 '08 at 21:59

zigdon


score 6 · Answer 10 · answered Oct 28 '14 at 12:36

For one-liners you can usually use the -0 switch (with -n) to make perl read the whole file at once (if the file doesn't contain any null bytes):

perl -n0e 'print "content is in $_\n"' filename

If it's a binary file, you could use -0777:

perl -n0777e 'print length' filename
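For reference, -0 with no digits sets the input record separator $/ to the null byte, while -0777 sets it to undef (slurp mode). A small sketch of the same difference inside a script, using an in-memory filehandle; the sample data and the read_records helper are made up for illustration:

```perl
use strict;
use warnings;

my $data = "one\0two\n";   # sample data containing a null byte

# read_records is a hypothetical helper: it reads $data with a given
# record separator and returns the list of records.
sub read_records {
    my ($sep) = @_;
    open my $fh, '<', \$data or die "Can't open in-memory file: $!";
    local $/ = $sep;
    my @records = <$fh>;
    return @records;
}

my @with_null = read_records("\0");    # like -0: input is split at each null byte
my @slurped   = read_records(undef);   # like -0777: the whole input is one record

printf "null separator: %d records, slurp mode: %d record\n",
    scalar @with_null, scalar @slurped;
```

This is why plain -0 only slurps the whole file when it contains no null bytes.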

answered Oct 28 '14 at 12:36

Qtax


Makes for a nice way to check that an attempted line substitution in a file actually happens: perl -p -i -0 -e 's/^old_line/new_line/m or (print and die)' some_file, or probably could use /mg to do all matching lines if many expected. – Britton Kerin Apr 30 '19 at 14:08

score 6 · Answer 11 · answered Aug 10 '11 at 20:09

use IO::All;

# read into a string (scalar context)
$contents = io($filename)->slurp;

# read all lines into an array (list context)
@lines = io($filename)->slurp;

answered Aug 10 '11 at 20:09

Prakash K


score 4 · Answer 12 · edited Apr 12 '12 at 06:52

See the summary of Perl6::Slurp which is incredibly flexible and generally does the right thing with very little effort.

edited Apr 12 '12 at 06:52

brian d foy


answered Oct 15 '08 at 22:19

mopoke


score 3 · Answer 13 · edited Oct 29 '12 at 17:25

Nobody said anything about read or sysread, so here is a simple and fast way:

my $string;
{
    open my $fh, '<', $file or die "Can't open $file: $!";
    read $fh, $string, -s $file;   # or sysread
    close $fh;
}
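
Note that -s only gives a useful size for regular files; for pipes and some special files it can report zero. A hedged variant that reads in chunks until EOF instead of trusting the size (slurp_read and the 64 KiB chunk size are illustrative assumptions, not part of the answer above):

```perl
use strict;
use warnings;

# slurp_read is a hypothetical helper that reads raw bytes in chunks.
sub slurp_read {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Can't open $file: $!";

    my $buf = '';
    while (1) {
        # Append up to 64 KiB at the current end of the buffer.
        my $n = read($fh, $buf, 65_536, length($buf));
        die "Read error on $file: $!" unless defined $n;   # undef means error
        last if $n == 0;                                   # 0 means EOF
    }
    close $fh;
    return $buf;
}
```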

edited Oct 29 '12 at 17:25

Jens


answered Apr 11 '12 at 13:05

Trizen


score 3 · Answer 14 · answered Nov 18 '09 at 05:33

Here is a nice comparison of the most popular ways to do it:

http://poundcomment.wordpress.com/2009/08/02/perl-read-entire-file/

answered Nov 18 '09 at 05:33

dreeves


Link-only answers are discouraged. Copy the code into your answer. – Gilles Quénot Jan 22 '23 at 12:38

dreeves · Answer 15 · 2008-10-16T16:32:22.410

1

Candidate for the worst way to do it! (See comment.)

open(F, $filename) or die "OPENING $filename: $!\n";
@lines = <F>;
close(F);
$string = join('', @lines);

edited Oct 16 '08 at 16:32

answered Oct 15 '08 at 22:01

dreeves


This is probably the most inefficient way I can think of, especially for large files. Now you have two copies of the same data and you have processed it twice just to load it into a scalar. – Robert Gamble Oct 15 '08 at 22:37
It's all situational. For a small file or a run-only-once quickie script, where "$string=`cat $filename`" is not available, this is perfectly reasonable. Inefficient yes! But that's not necessarily the only consideration. – Mr.Ree Nov 19 '08 at 03:56
This answer doesn't deserve a negative rating. Bunch of script kiddies that don't understand or care about what perl means by `<F>`. It's an array, silly. No worse performance than some of the other answers on this page. Very informative on how to think about Perl filehandles and slurping, as an array. – unixman83 Mar 29 '12 at 07:35

score 1 · Answer 16 · answered May 28 '15 at 22:35

Adjust the special record separator variable $/

undef $/;
open FH, '<', $filename or die "$!\n";
my $contents = <FH>;
close FH;

answered May 28 '15 at 22:35

user4951120


Curtis Yallop · Answer 17 · 2023-08-23T21:56:52.833

0

open(IN, "<$filename") or die "Can't open $filename: $!";
$contents = join('', <IN>);
close(IN);

Details:

<IN> reads from the filehandle IN; in list context (for example, when assigned to an array) it returns a list of all the lines.

join takes a delimiter and a list of lines and returns a string with all the lines joined together (source: https://perldoc.perl.org/functions/join).

open with a filename prefix of "<" opens the file in read-mode.
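
To illustrate the context rules described above, here is a small sketch that uses an in-memory filehandle with made-up sample text in place of a real file:

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";   # sample data standing in for a file
open my $in, '<', \$text or die "Can't open in-memory file: $!";

my $first  = <$in>;            # scalar context: reads a single line
my @rest   = <$in>;            # list context: reads all remaining lines
my $joined = join '', @rest;   # glue the lines back into one string
close $in;

print "first line: $first";
print "remaining lines joined: $joined";
```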

I often use the join construct for slurping in one-liners, e.g. perl -e '$_=join("",<>);s/multiline_regex/replacement_string/gms;print'. The m and s options support multiline regexes; see https://perldoc.perl.org/perlre.

edited Aug 23 '23 at 21:56

answered Jul 18 '23 at 22:02

Curtis Yallop


Answer needs supporting information Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](https://stackoverflow.com/help/how-to-answer). – moken Jul 20 '23 at 08:15
