
I have a set of SVG images that I would like to modify with a bash script. In particular, I would like to remove specific paths from each image while preserving the others. Here is an example:

<path
  id="rect5505"
  d="M 118.34375,20 C ... z"
  style="fill:#4d8ecb;stroke:none;fill-opacity:1" />
<path
  style="fill:#000000;fill-opacity:0.23529412;stroke:none;display:inline"
  d="m ... z"
  id="path17954"
  sodipodi:nodetypes="ssccccccccccccssccccccs" />

In this case, there are two path elements, but I would like to remove only the one that has fill:#000000;fill-opacity:0.23529412. I know that I can use sed to identify the line that contains these properties, but how can I remove the whole path element?

I would like to stress that this question is not a duplicate of "Add/remove xml tags using a bash script", since that case involved a single element type. By using something like

sed -i '/<path/,/<\/>/d' image.svg

I would remove every single path in the file, wouldn't I?

alecive

1 Answer


Editing XML with sed is not a terribly good idea. My suggestion is to use xmlstarlet:

xmlstarlet ed -d '//path[contains(@style, "fill:#000000") and contains(@style, "fill-opacity:0.23529412")]' filename.xml

Where

xmlstarlet ed -d xpath filename.xml

deletes from filename.xml those elements that match the given XPath expression, and

//path[contains(@style, "fill:#000000") and contains(@style, "fill-opacity:0.23529412")]

is an XPath expression that matches all path nodes that have a style attribute that contains both fill:#000000 and fill-opacity:0.23529412.

Addendum: Since contains() does simple substring matching, the XPath expression used above may, in some corner cases, yield false positives. For example, if the style attribute contains the string foo-fill:#000000, contains(@style, "fill:#000000") will be true. A simple, if somewhat unwieldy, way around this particular problem is to use

xmlstarlet ed -d '//path[contains(concat(";", @style, ";"), ";fill:#000000;") and contains(concat(";", @style, ";"), ";fill-opacity:0.23529412;")]' filename.xml

...although this still leaves the issue of whitespace: a style written as fill: #000000, with a space after the colon, would not match. I suppose a perfect solution would have to parse the CSS as well as the XML, which would take more doing (and probably Perl or so).
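To illustrate what parsing the CSS as well as the XML might look like, here is a minimal sketch in Python (swapped in for Perl; only the standard library is used). It splits each style attribute into individual declarations and compares property names and values exactly, so foo-fill and stray whitespace no longer matter. The helper names parse_style and remove_paths and the SAMPLE document are made up for this illustration, and the sketch assumes the paths live in the SVG namespace and that the colour is written exactly as #000000.

```python
import xml.etree.ElementTree as ET

# Assumption: the file declares the standard SVG namespace on its root element.
SVG_NS = "http://www.w3.org/2000/svg"


def parse_style(style):
    """Split an inline style string into a {property: value} dict."""
    declarations = {}
    for item in style.split(";"):
        name, sep, value = item.partition(":")
        if sep:  # skip empty chunks such as a trailing ';'
            declarations[name.strip()] = value.strip()
    return declarations


def remove_paths(root, wanted):
    """Remove every SVG path whose style has all the wanted declarations."""
    removed = 0
    # Snapshot the tree first so removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != f"{{{SVG_NS}}}path":
                continue
            style = parse_style(child.get("style", ""))
            if all(style.get(k) == v for k, v in wanted.items()):
                parent.remove(child)
                removed += 1
    return removed


# Hypothetical test document, not taken from the real icon set.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <path id="keep" d="M 0,0 z" style="fill:#4d8ecb;stroke:none;fill-opacity:1" />
  <path id="drop" d="M 0,0 z" style="fill: #000000; fill-opacity:0.23529412; stroke:none" />
  <path id="near" d="M 0,0 z" style="foo-fill:#000000;fill-opacity:0.23529412" />
</svg>"""

root = ET.fromstring(SAMPLE)
count = remove_paths(root, {"fill": "#000000", "fill-opacity": "0.23529412"})
remaining = [p.get("id") for p in root.iter(f"{{{SVG_NS}}}path")]
print(count, remaining)  # 1 ['keep', 'near']
```

For real files one would presumably load the tree with ET.parse(), call ET.register_namespace("", SVG_NS) before writing so the default namespace is kept, and register any other prefixes the file uses (such as sodipodi), since ElementTree otherwise renames them on output.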

Addendum 2: It appears we have a namespace SNAFU. To fix that, use

xmlstarlet ed -N svg='http://www.w3.org/2000/svg' -d '//svg:path[contains(@style, "fill:#000000") and contains(@style, "fill-opacity:0.23529412")]' accessibility.svg

That is:

xmlstarlet ed -N svg='http://www.w3.org/2000/svg' -d xpath filename

with xpath amended to use svg:path instead of path. This is necessary because there is an attribute

xmlns="http://www.w3.org/2000/svg"

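in the svg tag of the input file. That declaration places the unprefixed elements in the SVG namespace, and an XPath 1.0 name test without a prefix matches only elements in no namespace, so the prefix has to be bound explicitly.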

Wintermute
  • Thanks for the suggestion @Wintermute. I am trying to use it on this image (one of the many): https://raw.githubusercontent.com/alecive/FlatWoken/master/FlatWoken/scalable/apps/accessibility.svg . The path I would like to remove is there, but the script gives me an error: `None of the XPaths matched; to match a node in the default namespace use '_' as the prefix (see section 5.1 in the manual). For instance, use /_:node instead of /node` – alecive Mar 24 '15 at 18:00