I have to extract a value (the Value attribute) from a large XML file based on another value (the Package Name attribute). I want to remove duplicates first and then run a loop over the results.

grep -i 'abc.jar' /tmp/<filename>    # the duplicates should also be removed

Output:

<Package Name="abc.jar" Evidence="OUI" Version="1.0">
<Property Name="InstallLocation" Value="/<some path>/abc.jar"/>
<Package Name="abc.jar" Evidence="OUI" Version="1.0">
<Property Name="InstallLocation" Value="/<some path>/abc.jar"/>
<Package Name="abc.jar" Evidence="OUI" Version="1.0">
<Property Name="InstallLocation" Value="/<some path>/abc.jar"/>

I can extract all the Package Name values with the command below, but I don't know how to proceed from there.

grep -P -o -e '(?<=Package Name=").*?(?=")' <filename>

abc.jar
abc.jar
xyz.ear
xyz.ear
....contd
Maroun
  • Possible duplicate of [How to parse XML in Bash?](https://stackoverflow.com/q/893585/608639) – jww Feb 05 '19 at 06:32

1 Answer

You can pipe to sort -u:

grep -P -o -e '(?<=Package Name=").*?(?=")' <filename> | sort -u

The -u option:

 -u, --unique
         Unique keys.  Suppress all lines that have a key that is equal to an already processed one.
         This option, similarly to -s, implies a stable sort.  If used with -c or -C,
         sort also checks that there are no lines with duplicate keys.
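
To go one step further and pair each package with its Value before looping, something like the following might work. It is only a sketch: it swaps grep -P for awk so the name and the value stay together, and it assumes each element sits on its own line with the attributes in the order shown in the question. The sample file and its /opt/... paths are made up for illustration.

```shell
# Hypothetical sample standing in for the real XML file (paths are invented).
xml_file=$(mktemp)
cat > "$xml_file" <<'EOF'
<Package Name="abc.jar" Evidence="OUI" Version="1.0">
<Property Name="InstallLocation" Value="/opt/app/abc.jar"/>
</Package>
<Package Name="abc.jar" Evidence="OUI" Version="1.0">
<Property Name="InstallLocation" Value="/opt/app/abc.jar"/>
</Package>
<Package Name="xyz.ear" Evidence="OUI" Version="2.0">
<Property Name="InstallLocation" Value="/opt/other/xyz.ear"/>
</Package>
EOF

# Split each line on double quotes: on a Package line field 2 is the name,
# on an InstallLocation line field 4 is the Value. Remember the most recent
# package name, print "name<TAB>value", then let sort -u drop the duplicates.
pairs=$(awk -F'"' '
  /<Package Name="/        { pkg = $2 }
  /Name="InstallLocation"/ { print pkg "\t" $4 }
' "$xml_file" | sort -u)

# Loop over the unique pairs; replace the echo with the real per-package work.
tab=$(printf '\t')
printf '%s\n' "$pairs" | while IFS="$tab" read -r pkg value; do
  echo "package=$pkg location=$value"
done

rm -f "$xml_file"
```

For anything beyond this simple line-per-element layout, a real XML parser (see the linked duplicate) would be the safer choice.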
Maroun