0

I have input file as following...

A0FGR8 659-660 659-662
A4UGR9 728-751

And I want output as in two different files(A0FGR8.diso, A4UGR9.diso) and these files suppose to consist the numbers

659
660
661
662

after this for loop should terminate based on the upper number, but it is not so.. my perl code as follows..

#!/usr/bin/perl
open(F1,"$ARGV[0]") or die;
chomp(@list = <F1>);
close(F1);

for($i = 0; $i <= $#list; $i++) {
    @id=split(/\s+/,$list[$i]);

    open(OUT,">tmp/$id[0].diso") or die;

    for($j = $id[1]; $j = $#id; $j++) {
        ($n1, $n2) = split(/-/, $id[$j]);
        for($k = $n1; $k <= $n2; $k++) {
            print OUT "$k\n";
        }
    }
}

Kindly tell me where I have committed mistake and resolve the same.

TLP
  • 66,756
  • 10
  • 92
  • 149

2 Answers2

1

Here's a rewrite of your code in saner Perl. All for loops are guaranteed to terminate because I changed them to use the list version of for instead of C-style for, which is a Good Perl Habit, since list-for is a lot harder to get wrong.

I also changed the outermost for to iterate over the list directly instead of looping over the indexes because you didn't actually use the index for anything other than pulling items out of the list anyhow.

The bare <> operator reads from files listed in @ARGV (or from STDIN if @ARGV is empty), so I changed that. There's rarely a need to explicitly open $ARGV[0].

Finally, I changed the open for the output file to use a lexical filehandle and the three-argument form of open. Both are recommended practices. Lexical filehandles avoid globals and are automatically closed when they go out of scope (...and you forgot to close your OUT...), while three-arg open avoids security bugs when giving it filenames coming from outside the program.

I also made it run cleanly under strict and warnings, both of which you should always use.

The revised code:

#!/usr/bin/env perl    

use strict;
use warnings;
use 5.010;

my @list = <>;
chomp @list;

for my $line (@list) {
  my @id = split(/\s+/, $line);

  open my $outfile, '>', "tmp/$id[0].diso"
    or die "Failed to open output file: $!";

  for my $j ($id[1] .. $#id) {
    my ($n1, $n2) = split(/-/, $id[$j]);

    for my $k ($n1 .. $n2) { 
      say $outfile $k;
    } 
  }
}

Unfortunately, I am not able to say whether this will actually work correctly for your data set because you have not provided any sample data to test it against.

Dave Sherohman
  • 45,363
  • 14
  • 64
  • 102
1

Your program is much more clearly written like this

You don't say whether you want overlapping ranges to be combined, but your own code makes no attempt to do that so I have left them as they are, so that 659 and 660 are repeated in the output to A0FGR8.diso

use strict;
use warnings 'all';

while ( <> ) {

    my ($id, @ranges) = split;

    open my $out_fh, '>', "tmp/$id.diso" or die $!;

    for my $range ( @ranges ) {
        my ($start, $end) = split /-/, $range;
        print { $out_fh } "$_\n" for $start .. $end;
    }
}

output

tmp/A0FGR8.diso

659
660
659
660
661
662

tmp/A4UGR9.diso

728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751


Update

Here's an alternative version that uses the Number::Range module to eliminate duplicate values in overlapping ranges. It isn't a core module, and so it is likely that you will need to install it

use strict;
use warnings 'all';

use Number::Range;

while ( <DATA> ) {

    my ($id, @ranges) = split;

    my $range = Number::Range->new;

    for my $interval ( @ranges ) {
        no warnings 'Number::Range';
        $range->addrange( $interval =~ s/-/../r ); #/
    }

    open my $out_fh, '>', "$id.diso" or die $!;
    print { $out_fh } "$_\n" for $range->range;
}


__DATA__
A0FGR8 659-660 659-662

output

tmp/A0FGR8.diso

659
660
661
662
Borodin
  • 126,100
  • 9
  • 70
  • 144