
I am currently trying to remove all nodes matching a specific pattern from an XML file.

I need to remove every <artifactItem>...</artifactItem> block whose <artifactId> matches libprograms-*, for example <artifactId>libprograms-client-cobol</artifactId>.

Here is the XML (only the beginning):

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <!--
        <parent>
            <groupId>lu.foyer.exzos</groupId>
            <artifactId>exzos</artifactId>
            <version>1.0-SNAPSHOT</version>
        </parent>
        -->
        <groupId>lu.foyer.exzos</groupId>
        <artifactId>foyer-pms</artifactId>
        <version>1.0-SNAPSHOT</version>
        <name>Foyer Pms</name>

        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-dependency-plugin</artifactId>
                    <version>3.1.2</version>
                    <executions>
                        <execution>
                            <id>copy</id>
                            <phase>package</phase>
                            <goals>
                                <goal>copy</goal>
                            </goals>
                        </execution>
                    </executions>
                    <configuration>
                        <!--outputDirectory>${project.basedir}/run/programs</outputDirectory>
                        <overWriteReleases>false</overWriteReleases>
                        <overWriteSnapshots>true</overWriteSnapshots>
                        <overWriteIfNewer>true</overWriteIfNewer>
                        <excludeTransitive>true</excludeTransitive-->
                        <artifactItems>
                            <!-- Programs dependencies -->
                            <artifactItem>
                                <groupId>lu.foyer.exzos</groupId>
                                <artifactId>XactAnubexSyncJavaSupLib</artifactId>
                                <version>1.0-SNAPSHOT</version>
                                <outputDirectory>${project.basedir}/run/programs</outputDirectory>
                                <destFileName>XactAnubexSyncJavaSupLib.jar</destFileName>
                            </artifactItem>
                            <artifactItem>
                                <groupId>exzos.bazel</groupId>
                                <artifactId>libprograms-client-cobol</artifactId>
                                <version>1.0-SNAPSHOT</version>
                                <outputDirectory>${project.basedir}/run/programs</outputDirectory>
                                <destFileName>libprograms-client-cobol.jar</destFileName>
                            </artifactItem>
                        ...

I tried several sed commands but without success.

Any help would be appreciated.

Thanks

  • The above aside, what Michael said in his answer. `sed` is **not** the right tool for the job. – tink May 07 '20 at 09:27
  • The referenced question and its answer perfectly illustrate the limitations of sed: the answer only works if the required element is all on one line. – Michael Kay May 07 '20 at 09:30
  • [Don't Parse XML/HTML With Regex.](https://stackoverflow.com/a/1732454/3776858) I suggest to use an XML/HTML parser (xmlstarlet, xmllint ...). – Cyrus May 07 '20 at 10:20

1 Answer

sed is the wrong tool for the job. Use XSLT, or some other XML-aware tool such as xmlstarlet or Saxon's Gizmo.

For example, in Gizmo:

    java net.sf.saxon.Gizmo -s:input.xml
    />delete //artifactItem[matches(., 'libprogram-*')]
    />save output.xml
    />quit
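
If Java and Saxon are not at hand, the same XML-aware idea can be sketched with Python's standard library. This is only a minimal sketch under a few assumptions: the helper name `remove_artifact_items` is hypothetical, "matches libprograms-*" is read as "the `<artifactId>` text starts with `libprograms-`", and the file names are borrowed from the Gizmo example above. The POM declares a default namespace, so every element name has to be qualified with it. Comments are only kept because of `insert_comments=True`, which needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the <project> element of the POM.
POM_NS = "http://maven.apache.org/POM/4.0.0"


def remove_artifact_items(root, prefix="libprograms-"):
    """Remove each <artifactItem> whose <artifactId> starts with `prefix`.

    Hypothetical helper for illustration; returns the number of removed items.
    """
    removed = 0
    # ElementTree elements have no parent pointer, so start from the
    # <artifactItems> containers and delete matching children from there.
    for parent in root.iter(f"{{{POM_NS}}}artifactItems"):
        # Copy the child list first: removing while iterating would skip items.
        for item in list(parent.findall(f"{{{POM_NS}}}artifactItem")):
            artifact_id = item.findtext(f"{{{POM_NS}}}artifactId", default="")
            if artifact_id.strip().startswith(prefix):
                parent.remove(item)
                removed += 1
    return removed


if __name__ == "__main__":
    # Write the POM namespace back as the default (unprefixed) namespace.
    ET.register_namespace("", POM_NS)
    # Keep XML comments instead of silently dropping them (Python 3.8+).
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    tree = ET.parse("input.xml", parser=parser)
    remove_artifact_items(tree.getroot())
    tree.write("output.xml", encoding="UTF-8", xml_declaration=True)
```

Because the check looks only at the `<artifactId>` child, an item is not removed just because `libprograms-` happens to appear in some other child such as `<destFileName>`.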
Michael Kay