
I'm trying to perform a substitution on the following group of lines:

1AA20160817BBBBBDIGITS1NUMBER1STYLE59        00002200000220
1AA20160817BBBBBDIGITS2NUMBER1STYLE60        00000000000220
1AA20160817DDDDDDIGITS3NUMBER2STYLE60        00000000000486
1AA20160817DDDDDDIGITS4NUMBER2STYLE59        00004860000486
1AA20160817FFFFFDIGITS5NUMBER3STYLE602523111100000000000000
1AA20160817FFFFFDIGITS6NUMBER3STYLE59        00000820000000

I want the final output to look like this:

1AA20160817BBBBBDIGITS1NUMBER1STYLE59        00002200000220
1AA20160817BBBBBDIGITS1NUMBER1STYLE60        00000000000220
1AA20160817DDDDDDIGITS3NUMBER2STYLE60        00000000000486
1AA20160817DDDDDDIGITS3NUMBER2STYLE59        00004860000486
1AA20160817FFFFFDIGITS5NUMBER3STYLE602523111100000000000000
1AA20160817FFFFFDIGITS5NUMBER3STYLE59        00000820000000

The only change is a single digit, the one just before "NUMBER", on every second line. The patterns in the style of BBBBB/DDDDD are times, the last character being the seconds indicator.

I want it to check for a specific number of characters and perform the change there. I've written a sed command to do that task; it looks like this:

sed -i.bak "s/^\(.\{1\}\)$scenario$datein\(.\{6\}\)$pod/1$scenario$datein$timein$pod/g" $1

The rest of the code is in Perl. Could someone help me do the same substitution in Perl, or tell me how I can run this sed command from Perl code? My problem is that the files in question are huge, and bash takes too long to read every line and perform the substitutions. Thanks in advance.

jaco0646
onlyf

2 Answers


Assuming that your input data is in data.txt:

$ perl -i -pe's/(\d)(?=NUMBER)/$1-1/e if ! ($. % 2)' data.txt
  • -i: edit the input file in place (no backup is kept unless you supply an extension, e.g. -i.bak)
  • -p: run this code for every line in the input and print $_ on each iteration
  • -e: code to run
  • s/(\d)(?=NUMBER)/$1-1/e: look for a digit followed by 'NUMBER' and replace it with that digit minus one
  • if ! ($. % 2): but only do it for even-numbered lines
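For readers who want the same logic inside an existing Perl script rather than on the command line, here is a minimal sketch of what the one-liner does. The sub name `decrement_even_lines` and the manual line counter (standing in for `$.`) are assumptions made for illustration; they are not part of the one-liner above.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the one-liner's substitution to a list of
# lines, counting lines by hand instead of relying on $.
sub decrement_even_lines {
    my @lines   = @_;     # work on a copy so the caller's data is untouched
    my $line_no = 0;
    for my $line (@lines) {
        $line_no++;
        next if $line_no % 2;                  # odd line: leave as-is
        $line =~ s/(\d)(?=NUMBER)/$1 - 1/e;    # even line: digit before NUMBER, minus 1
    }
    return @lines;
}

my @sample = (
    "1AA20160817BBBBBDIGITS1NUMBER1STYLE59        00002200000220\n",
    "1AA20160817BBBBBDIGITS2NUMBER1STYLE60        00000000000220\n",
);
print decrement_even_lines(@sample);
```

Note that subtracting one only matches the desired output when each even line's digit is exactly one more than the digit on the line before it, as in the sample data; a digit of 0 would turn into -1.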
Dave Cross

Even and odd lines can be identified by looking at $., the current line number of the last-accessed filehandle. See perlvar for details.

use warnings;
use strict;

my $set_num_to = 0;

while (<DATA>) 
{
    if ($. % 2 != 0) { # odd line number
        ($set_num_to) = /(\d)NUMBER/;
        print;
    }
    else { 
        s/\d(?=NUMBER)/$set_num_to/;
        print;
    }
}

__DATA__
1AA20160817BBBBBDIGITS1NUMBER1STYLE59        00002200000220
1AA20160817BBBBBDIGITS2NUMBER1STYLE60        00000000000220
1AA20160817DDDDDDIGITS3NUMBER2STYLE60        00000000000486
1AA20160817DDDDDDIGITS4NUMBER2STYLE59        00004860000486
1AA20160817FFFFFDIGITS5NUMBER3STYLE602523111100000000000000
1AA20160817FFFFFDIGITS6NUMBER3STYLE59        00000820000000

The regex uses the string NUMBER, as given in the example and for lack of more specifics, to identify the digit to fetch on odd lines; that digit then replaces the one at the same position on even lines. The substitution uses a positive lookahead, (?=PATTERN), so NUMBER itself is matched but not consumed. If the replacement is supposed to be one less than the current number (and not the number from the previous line), you can use

s/(\d)(?=NUMBER)/$1-1/e if $. % 2 == 0;

The /e modifier makes the replacement side be evaluated as Perl code, and the result of that evaluation is used as the replacement. See perlop and this post.

One can use substr instead, if the position is fixed:
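To see the effect of /e in isolation, here is a small sketch that runs the same substitution with and without the modifier. The variable names and the shortened sample string are made up for illustration.

```perl
use strict;
use warnings;

my $plain = "DIGITS4NUMBER2";
my $evald = "DIGITS4NUMBER2";

# Without /e the replacement is an interpolated string: "4" followed by "-1".
$plain =~ s/(\d)(?=NUMBER)/$1-1/;

# With /e the replacement is Perl code, so $1-1 is computed as 4 - 1.
$evald =~ s/(\d)(?=NUMBER)/$1-1/e;

print "$plain\n";   # DIGITS4-1NUMBER2
print "$evald\n";   # DIGITS3NUMBER2
```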

my $offset = length '1AA20160817BBBBBDIGITS';

while (<DATA>) 
{
    if ($. % 2 != 0) {
        # Retrieve substring of length 1 at given offset
        $set_num_to = substr $_, $offset, 1;
    }
    else {
        # Replace substring of same length at same offset by one captured above
        substr $_, $offset, 1, $set_num_to;
    }
    print;  # print every line, changed or not
}

The rest is the same and prints lines as specified.

Again, if you need to subtract 1 from what is there rather than replace it with the number from the previous line, use both substr calls inside the $. % 2 == 0 branch: the first to read the current digit, the second to write back that value minus 1.
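A minimal sketch of that subtracting variant, assuming the digit always sits immediately after a prefix of the same length as in the sample data. The sub name `decrement_at_offset` and the manual line counter are hypothetical, introduced only for this illustration.

```perl
use strict;
use warnings;

# Assumption: the digit to change always follows this fixed-width prefix.
my $offset = length '1AA20160817BBBBBDIGITS';

# Hypothetical helper: on even lines, read the digit at $offset and
# write back that digit minus 1; odd lines pass through unchanged.
sub decrement_at_offset {
    my @lines   = @_;     # copy, so the caller's list is not modified
    my $line_no = 0;
    for my $line (@lines) {
        next if ++$line_no % 2;                  # skip odd lines
        my $digit = substr $line, $offset, 1;    # read one character
        substr $line, $offset, 1, $digit - 1;    # overwrite it in place
    }
    return @lines;
}

print decrement_at_offset(
    "1AA20160817DDDDDDIGITS3NUMBER2STYLE60        00000000000486\n",
    "1AA20160817DDDDDDIGITS4NUMBER2STYLE59        00004860000486\n",
);
```

Because substr works on a fixed position, this avoids the regex engine entirely, which may help on very large files; it breaks, however, if any line has a prefix of a different length.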

zdim