-1

So I have a large file like the following:

RESOURCETAGMAPPINGLIST  arn:aws:ec2:us-east-1:XXXXXX:instance/i-XXXXXXXXXXXXXXXXX
TAGS    app-name    appname1
RESOURCETAGMAPPINGLIST  arn:aws:ec2:us-east-1:XXXXXX:instance/i-XXXXXXXXXXXXXXXXX
TAGS    app-name    appname2
RESOURCETAGMAPPINGLIST  arn:aws:ec2:us-east-1:XXXXXX:instance/i-XXXXXXXXXXXXXXXXX
TAGS    app-name    appname1
..

I only want to modify the lines containing RESOURCETAGMAPPINGLIST and print the other lines without modification. On the matching lines I want to print only specific fields, like below:

arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname1
arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname2
arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname1
..

I tried using awk with gsub, but could not get the -F: part to work. Any help would be greatly appreciated; it does not matter whether it's awk, sed or perl.

Cyrus
akula
  • please update the question with your `awk` attempt(s) and the (wrong) output generated by your code – markp-fuso Jan 20 '23 at 23:49
  • um ... so how do you get `arn ec2 us-east-1 XXXXXX` from the rest of the line in the file? why is `aws` not there? (the second field in colon-separated part?) – zdim Jan 21 '23 at 02:18
  • Are those (multiple?) spaces or tabs between fields ... do they need to be preserved in output? – zdim Jan 21 '23 at 02:29

5 Answers

4

With awk. The input field separator is a regex that matches either one or more spaces ( +) or (|) a single colon (:).

If a row contains the string RESOURCETAGMAPPINGLIST, print fields 2, 4, 5 and 6, then stop processing this row and continue with the next one (next). If a row does not contain RESOURCETAGMAPPINGLIST, print the complete row unchanged.

awk -F ' +|:' '/RESOURCETAGMAPPINGLIST/{print $2,$4,$5,$6; next} {print}' file

Output:

arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname1
arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname2
arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname1

See: The Stack Overflow Regular Expressions FAQ

Cyrus
2

One awk idea using the default field delimiter and split():

awk '
/RESOURCETAGMAPPINGLIST/ { split($2,a,":")              # split 2nd field on ":" delimiter, storing results in array a[]
                           print a[1],a[3],a[4],a[5]
                           next                         # skip to next line of input
                         }
1                                                       # print current line
' sample.dat

# or as a one-liner sans comments:

awk '/RESOURCETAGMAPPINGLIST/ {split($2,a,":"); print a[1],a[3],a[4],a[5]; next} 1' sample.dat

This generates:

arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname1
arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname2
arn ec2 us-east-1 XXXXXX
TAGS    app-name    appname1
markp-fuso
1

Here is one way to do it:

#!/usr/bin/perl
use v5.30; # Perl v5.10 or above is required to use 'say' instead of 'print'
use warnings;

my $file = "datafile.txt"; #declare file name
open( my $fh, "<", $file ) or die( "Can't open `$file`: $!\n" );
#open the file with file handle $fh
while (my $line = <$fh>){ #read the file line by line
    chomp $line;      #remove line breaks
    if ($line =~ m/^RESOURCETAGMAPPINGLIST/){ #if line starts with RESOURCETAGMAPPINGLIST
        my @fields = split(/\s+|:/, $line); #split on whitespace (spaces or tabs) or : characters; each field becomes an element of @fields array
        say (join ' ', (@fields[1,3,4,5])); #join the relevant fields (skipping 'aws' at index 2) with a single space and print
    }else{
        say $line; #if line does not start with RESOURCETAGMAPPINGLIST, simply print it
    }
}
#when script finishes, Perl will automatically close the input file.
Supertech
1

I assume that one keeps the fields 1,3,4,5 from the colon-separated part.

perl -wple'
    s{^RESOURCETAGMAPPINGLIST\s+(.+)}
     { join " ", ( $1 =~ /([^:]+)/g )[0,2..4] }e' file

With -p the variable $_, which has the line to process, is printed after the processing.

If the keyword isn't matched, the regex does nothing and $_ remains unchanged. If it is matched, the pattern consumes the whole line, so the line (that is, $_) gets replaced with the return value of the code on the replacement side. (With the /e modifier the replacement side is evaluated as code; here it extracts the selected colon-separated words from the rest of the line, captured in $1, and joins them with a space.)

Or: test for the word, and either split the line and join the needed parts or print it as is

perl -wnlE'say 
    /^RESOURCETAGMAPPINGLIST/ ? join " ", (split /\s+|:/)[1,3..5] : $_' file

These are broken into lines so they are easier to read. They can be copy-pasted as they are (in bash), or brought onto one line. Or, this can be written far more nicely in a file, but the question seems to be asking for a command-line program.

zdim
0

This might work for you (GNU sed):

sed -E 's/^RESOURCETAGMAPPINGLIST *([^:]+):[^:]+:([^:]+):([^:]+):([^:]+).*/\1 \2 \3 \4/' file

Pattern match and reformat as required.
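
The comments on the question ask whether the separators are spaces or tabs. If they turn out to be tabs, the ` *` after the keyword will not match. A minimal sketch of a variant that accepts either, assuming a sed that supports -E and POSIX character classes (reformat_arn_lines is a hypothetical helper name):

```shell
# Hypothetical helper: same substitution as above, but [[:space:]]+ accepts
# spaces or tabs after the keyword. Lines that do not match pass through unchanged.
reformat_arn_lines() {
    sed -E 's/^RESOURCETAGMAPPINGLIST[[:space:]]+([^:]+):[^:]+:([^:]+):([^:]+):([^:]+).*/\1 \2 \3 \4/' "$@"
}

# Tab-separated sample input (printf expands \t to a real tab).
printf 'RESOURCETAGMAPPINGLIST\tarn:aws:ec2:us-east-1:XXXXXX:instance/i-XXXXXXXXXXXXXXXXX\nTAGS\tapp-name\tappname1\n' | reformat_arn_lines
```

Usage with a file would be `reformat_arn_lines file`; the TAGS lines keep whatever separators they had in the input.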

potong