How Do I Combine Two Files in Perl?

Question

I want to read in two input files and output a new file that contains one line that is a concatenation of each corresponding line from the two input files.

For instance:

line 1 of the new output file would have:

info from input file 1, line 1 some number of tabs info from input file 2, line 1
.
.
.

If either input file has more lines than the other, the rest of the lines should be inserted into the output file in their correct position.

Thanks.

Are you on a *NIX system? `perl -e 'exec @ARGV' paste /tmp/file1 /tmp/file2` :) — pilcrow, Dec 29 '10 at 22:32
@pilcrow: Yes I'm on *NIX and your solution worked just fine. I can't believe I didn't come up with this myself. I did do the research before posting but forgot to consider the paste command. Thanks. — Horace Debussy Jones, Dec 30 '10 at 20:35

score 2 · Answer 1 · answered Dec 29 '10 at 18:54

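open FP1,"filename1";
open FP2,"filename2";
my ($l1,$l2);
while(1)
{
  # read one line from each file; a read past end-of-file yields undef
  $l1=<FP1>;
  $l2=<FP2>;
  last unless(defined $l1 or defined $l2);
  # the shorter file contributes an empty field once it is exhausted
  $l1='' unless defined $l1;
  $l2='' unless defined $l2;
  chomp($l1,$l2);
  # separate the two fields with a tab, as the question asks
  print $l1,"\t",$l2,"\n";
}
close FP2;
close FP1;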

answered Dec 29 '10 at 18:54

Curd

Please don't provide example code that uses bareword file handles, or the 2 arg form of open, or neglect to check the return of open. All are very discouraged practices. – Ven'Tatsu Dec 29 '10 at 19:58
@Ven'Tatsu: ...and I didn't say "use strict; use warnings;" and I didn't handle the case that the files can't be opened, etc., blah, blah, blah. There are a thousand things you can always nag about in any Perl script because it doesn't satisfy somebody's taste. Please don't be so picky! – Curd Dec 29 '10 at 21:31
@Curd, did I complain about `use strict`? No. I did complain about 3 specific issues that are done in a lot of example code that should not be, they are items that take almost no effort to do correctly, but can create massive hassles when they are done wrong. I specifically noticed but did not comment on a few "nit pick" personal taste issues, because I didn't want to distract from what I consider serious issues. – Ven'Tatsu Dec 29 '10 at 22:18
@Curd, see http://stackoverflow.com/questions/1479741/why-is-three-argument-open-calls-with-lexical-filehandles-a-perl-best-practice for good reasons about two of the points Ven'Tatsu made. – CanSpice Dec 29 '10 at 22:25
@CanSpice: the point is not whether there are good reasons to do things differently. The point is whether it is necessary to nag about it. If it doesn't affect the original question, I think it is not. Ever heard of TIMTOWTDI? If someone insists on doing something exactly one way in Perl, he hasn't understood one of the maxims of the language. – Curd Dec 29 '10 at 23:18
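
For reference, the three points raised in these comments (lexical filehandles, three-argument open, and a checked return from open) could be applied to the loop from this answer roughly as follows. This is only a minimal sketch: the sub name `paste_files` and the single tab separator are assumptions made for illustration, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption): returns the merged lines
# of two files, joined by a tab, padding the shorter file with ''.
sub paste_files {
    my ($name1, $name2) = @_;

    # three-argument open, lexical handles, and a checked return value
    open my $fh1, '<', $name1 or die "Cannot open '$name1': $!\n";
    open my $fh2, '<', $name2 or die "Cannot open '$name2': $!\n";

    my @out;
    while (1) {
        my $l1 = <$fh1>;
        my $l2 = <$fh2>;
        last unless defined $l1 or defined $l2;    # both files exhausted
        $l1 = '' unless defined $l1;               # pad the shorter file
        $l2 = '' unless defined $l2;
        chomp($l1, $l2);
        push @out, "$l1\t$l2";
    }
    close $fh1;
    close $fh2;
    return @out;
}

# Usage: perl paste.pl file1 file2
if (@ARGV == 2) {
    print "$_\n" for paste_files(@ARGV);
}
```

The loop logic is unchanged from the answer; only the file handling differs.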

Nathan · Answer 2 · 2010-12-29T22:06:19.957

2

I like hashes for aggregating things. This is quick, if not particularly elegant.

#!perl
use strict; 
use warnings;

my ($file1, $file2) = @ARGV;
die "usage: $0 file1 file2\n" 
    unless $file1 && $file2;

use File::Slurp;
# read_file croaks on failure by default; an "or die" after the list
# assignment would only misfire on an empty (but readable) file
my @a = read_file($file1);
my @b = read_file($file2);

my $combined = {}; # hashref

my $i=0;
foreach (@a) {
    chomp;
    $combined->{$i}{b} = '' unless defined $combined->{$i}{b};
    $combined->{$i++}{a} = $_;
}

$i=0;
foreach (@b) {
    chomp;
    $combined->{$i}{a} = '' unless defined $combined->{$i}{a};
    $combined->{$i++}{b} = $_;
}

foreach my $i (sort {$a<=>$b} keys %$combined) {
    print $combined->{$i}{a}, ("\t" x 2), $combined->{$i}{b}, "\n";
}

edited Dec 29 '10 at 22:06

answered Dec 29 '10 at 21:26

Nathan

Yes this is very elegant and a good example of applying hashes to implement a solution. Thanks. – Horace Debussy Jones Dec 29 '10 at 21:56
Now that I think about it, an array of hashes might be better than this hash of hashes. You wouldn't have to mess with sorting the keys. Change `{$i}` to `[$i]` throughout. I was originally thinking hashes to deal with the files being different lengths. – Nathan Dec 29 '10 at 22:16
As well as change each {$i++} to [$i++] and modify the last foreach loop. – Horace Debussy Jones Dec 29 '10 at 22:58
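
The array-of-hashes variant described in these comments might look something like the sketch below. It is an illustration under assumptions, not the answerer's code: the sub name `merge_lines` and its optional separator argument are made up, and it takes array references of lines instead of reading files so that it runs without File::Slurp.

```perl
use strict;
use warnings;

# Hypothetical helper: merges two lists of lines position by position.
# $combined[$i] is a hash { a => line from first list, b => line from second }.
sub merge_lines {
    my ($lines_a, $lines_b, $sep) = @_;
    $sep = "\t" x 2 unless defined $sep;    # default: two tabs, as in the answer

    my @combined;
    my $i = 0;
    for my $line (@$lines_a) {
        chomp(my $text = $line);            # chomp a copy, leave the input alone
        $combined[$i++]{a} = $text;
    }
    $i = 0;
    for my $line (@$lines_b) {
        chomp(my $text = $line);
        $combined[$i++]{b} = $text;
    }

    # Array order is already line order, so no sorting of keys is needed.
    return map {
        (defined $_->{a} ? $_->{a} : '') . $sep . (defined $_->{b} ? $_->{b} : '')
    } @combined;
}

# Usage with made-up sample data:
print "$_\n" for merge_lines(["one\n", "two\n", "three\n"], ["1\n", "2\n"]);
```

A missing side is filled in only when the output is built, so the two loops no longer need the `unless defined` bookkeeping.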

Ven'Tatsu · Answer 3 · 2010-12-29T22:27:31.407

1

This is really no different than looping through one file, as long as you pay attention to a few of Perl's tricks. For one file it is common to use

use strict;
use warnings;
use English qw(-no_match_vars);

my $filename = 'foo';
open my $file, '<', $filename or die "Failed to open '$filename' $OS_ERROR\n";

while (my $line = <$file>) {
    # work with $line
}

close $file;

This can be expanded to two files by opening both and changing the loop conditional to only end when both files are done reading. But there is a catch: when Perl sees a simple read from a file handle as the conditional for a while loop, it wraps it in defined() for you. Since the conditional is now more than a simple read, this needs to be done manually.

use strict;
use warnings;
use English qw(-no_match_vars);

my $filename1 = 'foo';
my $filename2 = 'bar';
open my $file1, '<', $filename1 or die "Failed to open '$filename1' $OS_ERROR\n";
open my $file2, '<', $filename2 or die "Failed to open '$filename2' $OS_ERROR\n";

my ($line1, $line2);
while ( do { $line1 = <$file1>; $line2 = <$file2>; defined($line1) || defined($line2) } ) {
    # do what you need to with $line1 and $line2
}

close $file1;
close $file2;
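
As one possible loop body, the sketch below joins each pair of lines with a tab and treats a missing line as an empty string. It reads from in-memory strings so that it runs on its own; the sample data and the `join_handles` name are made up for illustration. The last line of the first sample is a bare `0` with no trailing newline, which is the case where testing the lines with `defined` rather than for truth matters.

```perl
use strict;
use warnings;

# Hypothetical helper: takes two open read handles, returns the joined lines.
sub join_handles {
    my ($file1, $file2) = @_;
    my ($line1, $line2, @out);

    # Read from both handles first, then test, so || cannot skip a read.
    while ( do { $line1 = <$file1>; $line2 = <$file2>; defined($line1) || defined($line2) } ) {
        # an exhausted file yields undef; substitute an empty field
        my @fields = map { defined $_ ? $_ : '' } $line1, $line2;
        chomp @fields;
        push @out, join("\t", @fields);
    }
    return @out;
}

# Made-up sample data; a scalar reference opens an in-memory file.
my ($text1, $text2) = ("x\ny\n0", "1\n");
open my $in1, '<', \$text1 or die "Failed to open in-memory file: $!\n";
open my $in2, '<', \$text2 or die "Failed to open in-memory file: $!\n";

print "$_\n" for join_handles($in1, $in2);
```

With real files, the two in-memory opens would be replaced by the `open` calls from the example above.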

edited Dec 29 '10 at 22:27

answered Dec 29 '10 at 19:55

Ven'Tatsu

Very interesting. But won't this loop stop before getting through all the lines of both files if one file has fewer lines than the other? – Horace Debussy Jones Dec 29 '10 at 20:01
Nope. For example, if file 1 has 100 lines and file 2 has 90 lines, then $line2 will only be defined for the first 90 iterations and $line1 will be defined for all 100 iterations. However, the "or" (ie ||) ensures that we loop through all 100 iterations. – Dec 29 '10 at 20:20
Note that on the two open statements '>' should be '<'. The two files are input files not output files. A nice way to accidentally blow away the data in your files. – Horace Debussy Jones Dec 29 '10 at 20:32
When I run the example above I get uninitialized value errors for $line2 for every input line from the files. Is it possible that since 'defined(my $line1 = <$file1>)' is true, the '||' immediately returns true and never executes the code to its right? – Horace Debussy Jones Dec 29 '10 at 20:50
**Facepalm** `||` is short-circuiting when it finds the first true value, skipping the second read. I've edited to fix the file modes and use a `do` block to read both files before testing them, but it might be cleaner to use an infinite loop and test inside the loop body like Curd's answer. – Ven'Tatsu Dec 29 '10 at 21:14
In this 2nd example the number of opening and closing parentheses doesn't match! BTW: the output is supposed to be a concatenation of each line. Where is this done? – Curd Dec 29 '10 at 21:44

score 0 · Answer 4 · edited Jun 21 '13 at 12:13

#!/usr/bin/env perl
use strict;
use warnings;

# Interleave the two files in blocks of three lines:
# three lines of file2, then three lines of file1, and so on.
open(my $in1, '<', 'file1') or die "\ncould not open file1: $!\n";
my @lines1 = <$in1>;
close($in1);

open(my $in2, '<', 'file2') or die "\ncould not open file2: $!\n";
my @lines2 = <$in2>;
close($in2);

print "\nplease write your output file name::::\n";
chomp(my $file = <STDIN>);

open(my $out, '>', $file) or die "\ncould not write into $file: $!\n";

my $value = 0;
foreach my $line (@lines1) {
    if ($value % 3 == 0) {
        # emit the next block of up to three lines from file2
        for my $i ($value .. $value + 2) {
            print {$out} $lines2[$i] if defined $lines2[$i];
        }
    }
    print {$out} $line;
    $value++;
}
close($out);

score 0 · Answer 5 · 2010-12-29T20:22:33.013

You could first query (with a wc -l) which file has more lines. Assuming (for the sake of pseudocode) that file 1 has more lines, then do the following:

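use strict;
use warnings;

# placeholder names without spaces; file1 is assumed to be the longer file
my ($file1, $file2) = ('file1', 'file2');
my $tabs = "\t" x 2;    # however many tabs you want

open(my $fh, '<', $file1) or die("Couldn't open $file1: $!");
open(my $write, '>', 'output.csv') or die("Couldn't open output.csv: $!");

my $count = 1;

while (my $line = <$fh>)
{
   chomp $line;
   # sed prints only line $count of file2, and nothing once file2 runs out
   # (head | tail would keep repeating the last line of file2)
   my $other = `sed -n '${count}p' $file2`;
   chomp $other;
   print $write $line . $tabs . $other . "\n";
   $count++;
}

close($fh);
close($write);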

edited Dec 29 '10 at 20:22

answered Dec 29 '10 at 18:56

You're welcome. I did goof up the first part of $str, though (the use of ``echo -n $count file 2`` won't get you what you want). I fixed it. – Dec 29 '10 at 20:23
