10

Perl has lots of handy idioms for doing common things easily, including:

  1. The file-reading operator <HANDLE>, if used without a filehandle, will seamlessly open and read all files in @ARGV, one line at a time:

    while (<>) { print $., $_;  }
    
  2. By (locally) resetting the input record separator $/, I can "slurp" a whole file at once:

    local $/; $content = <HANDLE>;
    

But these two idioms don't quite work together as I expected: after I reset $/, slurping with <> stops at the first end of file. The following gives me the contents of the first file only.

local $/; $content = <>;

Now, I trust that this behavior must be by design, but I don't really care. What's the concise Perl idiom for fetching all content of all files into a string? (This related question mentions a way to slurp a list of files into separate strings, which is a start... but not terribly elegant if you still need to join the strings afterwards.)

alexis
  • `my @content = do { local $/; <> };` if you don't mind fetching files into an array (no join or string concat needed). – mpapec Jul 04 '16 at 12:28
  • Possible duplicate of *[Fancy file slurping in Perl](http://stackoverflow.com/questions/30062413/fancy-file-slurping-in-perl)*. – Peter Mortensen Jul 04 '16 at 19:22
  • What did you do to ensure this question is not a duplicate? – Peter Mortensen Jul 04 '16 at 19:23
  • @PeterMortensen, it is not a duplicate. [That question](http://stackoverflow.com/questions/30062413/fancy-file-slurping-in-perl) slurps each file into a separate string, which would then need to be joined together. That'll have to do if there's no better solution (it's effectively what I do in my self-answer), but I can't know that without asking _this_ question. – alexis Jul 04 '16 at 20:06
  • @Сухой27, I'd rather not fetch each file into a different string if I can avoid it. They'll need to be joined or otherwise post-processed. – alexis Jul 04 '16 at 20:08

3 Answers

10

You realise that you are only hiding multiple open calls, right? There's no magic way to read multiple files without opening them one by one.

As you have found, Perl will stop reading at the end of each file if $/ is set to undef.

The internal mechanism that handles <> removes each file name from @ARGV as it is opened, so I suggest that you simply use a while loop that reads using <> until @ARGV is empty.

It would look like this:

use strict;
use warnings 'all';

my $data;
{
    local $/;
    $data .= <> while @ARGV;
}
Borodin
  • I am just wondering: how much data can a local variable hold in Perl? Is there any memory limit? – Arijit Panda Jul 04 '16 at 11:39
  • @Arijit: A single scalar may hold as much memory as the process can allocate, with some overhead for the Perl scalar data structure and, obviously, the perl interpreter, compiled program, and any other data. A 2GB string is perfectly feasible. Clearly a 64-bit build of perl will be able to address much more memory than a 32-bit build. – Borodin Jul 04 '16 at 11:50
  • Thanks @Borodin. What will happen if the file size is more than the process can handle? Does it stop, or skip the extra memory? – Arijit Panda Jul 04 '16 at 12:01
  • What is this really about? I think you need to ask a proper question if you're interested. If Perl cannot allocate the memory that it needs then it will die with a non-zero exit code and the message ***"Out of memory!"***. But you really should be reading files in smaller units, usually a single line. If you think you need an entire file in memory then your design is probably wrong – Borodin Jul 04 '16 at 12:08
  • @Borodin, yes I realize that multiple `open`s are involved.. That's what I meant by "seamlessly": It happens, but I don't have to hold its hand or even look at it. – alexis Jul 04 '16 at 12:41
8

Since there doesn't seem to be a way to slurp everything at once, one compact solution would be to place <> in list context. The implicit repeated calls will fetch everything.

local $/;
$data = join "", <>;
alexis
  • This will hold all of the files' contents in a list until they have all been read before copying them to `$data` and discarding the individual strings, so there is little point in changing `$/` in the first place. This is fine if there are only a few files, but if there are many (for instance, by providing a wildcard pattern as the command-line parameter) then it is wasteful. – Borodin Jul 04 '16 at 11:26
  • Perl doesn't have iterator behavior in a list context? I would have expected `<>` to be pumped internally by `join`, which would make this at worst equivalent to the explicit loop in your answer (and possibly better since the strings are joined internally.) – alexis Jul 04 '16 at 11:54
  • There is no hidden iterator behaviour in Perl, and just because Python (sometimes) does it, it doesn't make it the *right way*. It would be quite possible to write an iterative `readline` and `join` that perform the way you describe, but the result would be more arcane than, and slower than, repeated uses of `.=`. The Perl philosophy is very much about transparency, so only things like the sort algorithm are black-boxed. I learned everything I know about object-oriented programming from Perl, precisely because it is all self-assembly, and it is my first choice for a teaching language. – Borodin Jul 04 '16 at 12:03
  • Interesting. I guess it depends on who your students are. My students (humanities) need to know how to drive the car, not how to take it apart and put it back together. – alexis Jul 04 '16 at 12:07
  • I think an "iterative join" on its own would be too narrow. If the meaning of "in a list context" is to construct the whole list before passing it as an argument, then that's just how it is. I was indeed extrapolating python concepts (which are "transparent" in a very different way). – alexis Jul 04 '16 at 12:10
  • Clearly there are many levels to that, and my secretary, who is very clearly a "programmer" because she deals with the complexities of MS Word on a daily basis, has no need to know about object-oriented programming. However I am certain that my driving is improved by knowing what happens in the engine when I press pedals, and my programming is improved by assembler experience, and knowing what I am implying when I call `strcpy`. Of course we cannot all keep tabs on every consequence of anything we do, but the foundation is essential. – Borodin Jul 04 '16 at 12:28
  • The behaviour of any Perl function in list context is entirely down to that function. A function may make only one return call, and there is no built-in iterator support. However there is no reason why I shouldn't write a class with an `__iter__` method and a `next` method that behave just as Python's hidden ones do. – Borodin Jul 04 '16 at 12:28
  • No question there: The expert benefits from knowing every bit of the system's foundation. I like to know what's under the hood, too. But students... that's a different matter. I wouldn't teach high-level concepts from the bolts up, it defeats the whole point of abstraction. – alexis Jul 04 '16 at 12:39
-3

use File::Slurp qw/read_file/;
my $text = read_file('filename');
  • How does that help read multiple files specified as command-line arguments? `File::Slurp` is good to know about, but it doesn't answer the question. – alexis Jul 04 '16 at 18:38