
I'm trying to write a bash script that will change the fill color of certain elements within SVG files. I'm inexperienced with shell scripting, but I'm good with regexes (...in JS).

Here's the SVG tag I want to modify:

<!-- is the target because its ID is exactly "the.target" -->
<path id="the.target" d="..." style="fill:#000000" />

Here's the bash code I've got so far:

local newSvg="" # will hold newly-written SVG file content
while IFS="<$IFS" read tag
do
    if [[ "${tag}" =~ +id *= *"the\.target" ]]; then
        tag=$(echo "${tag}" | sed 's/fill:[^;];/fill:${color};/')
    fi
    newSvg="${newSvg}${tag}"
done < ${iconSvgPath} # is an argument to the script

Explained: I'm using read (splitting the file on < via custom IFS) to read the SVG content tag by tag. For each tag, I test to see if it includes an id property with the exact value I want. If it doesn't, I add this tag as-is to a newSvg string that I will later write to a file. If the tag does have the desired ID, I'll use sed to replace fill:STUFF; with fill:${color};. (Note that my sed is also failing, but that's not what I'm asking about here.)
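
One detail worth checking in the loop above: read consumes one line per call regardless of IFS, which only controls how that line is split among the named variables. Below is a minimal sketch of reading up to each < instead, assuming bash's read -d option. The helper name each_tag is hypothetical and not part of the original script.

```shell
#!/usr/bin/env bash
# each_tag (hypothetical helper): read stdin and print one '<...' chunk per line.
each_tag() {
    local chunk
    # -d '<' makes '<' the record delimiter, IFS= keeps whitespace intact,
    # and -r keeps backslashes literal. The || clause picks up a final
    # chunk that is not followed by another '<'.
    while IFS= read -r -d '<' chunk || [[ -n $chunk ]]; do
        # The chunk before the very first '<' is empty; skip it.
        if [[ -n $chunk ]]; then
            printf '<%s\n' "$chunk"
        fi
    done
}

printf '%s' '<svg><path id="a" /><path id="b" /></svg>' | each_tag
# prints:
# <svg>
# <path id="a" />
# <path id="b" />
# </svg>
```

Each chunk holds the tag plus any text that follows it, up to the next <, so a real script would append the chunks back together rather than print them on separate lines.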

It fails to find the right line with the test [[ "${tag}" =~ +id *= *"the\.target" ]].

It succeeds if I change the test to [[ "${tag}" =~ \"the\.target\" ]].

I'm not happy with the working version because it's too brittle. While I don't intend to support all the flexibility of XML, I would like to be tolerant of semantically irrelevant whitespace, as well as the id property being anywhere within the tag. Ideally, the regex I'd like to write would express:

  • id (preceded by at least one whitespace)
  • followed by zero or more whitespaces
  • followed by =
  • followed by zero or more whitespaces
  • followed by "the.target"

I think I'm not delimiting the regex properly inside the [[ ... =~ REGEX ]] construction, but none of the answers I've seen online use any delimiters whatsoever. In JavaScript, regex literals are bounded (e.g. / +id *= *"the\.target"/), so it's straightforward to begin a regex with a whitespace character that you care about. Also, JS doesn't have any magic re: *, whereas bash is 50% magic-handling-of-asterisks.

Any help is appreciated. My backup plan is maybe to try to use awk instead (which I'm no better at).


EDIT: My sed was really close. I forgot to add + after the [^;] set. Oof.

Tom
  • [Don't Parse XML/HTML With Regex.](https://stackoverflow.com/a/1732454/3776858) I suggest using an XML/HTML parser (xmlstarlet, xmllint ...). – Cyrus Apr 23 '20 at 07:16
  • That is an incredible answer. Yeah, regexes have limits. I believe this is the same reason why it's practically impossible to write a regex that tests for all syntactically valid email addresses. That said, this is such a simple use-case that I don't want to add a dependency. – Tom Apr 23 '20 at 07:24
  • @Tom: You don't _delimit_ the regexp. You can always represent a literal space as `\ ` (i.e. backslash, followed by space). – user1934428 Apr 23 '20 at 07:36
  • So... Would `[[ "${tag}" =~ id\ *=\ *"the\.target" ]]` be closer to what you're looking for? Options to the `test` command are separated by spaces. – ghoti Apr 23 '20 at 15:16

3 Answers


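It is much easier if you define the regular expression pattern in a variable. An unquoted variable on the right-hand side of =~ is treated as a regex, so the spaces inside it need no escaping: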

tag='      id  =   "the.target"'
pattern=' +id *= *"the\.target"'

if  [[ $tag =~ $pattern ]]; then
    echo matched.
fi
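
Building on the same idea, here is a hedged sketch that takes the ID as a parameter. It relies on the bash rule that any quoted part of the =~ pattern is matched as a literal string, so the dot in the ID needs no backslash. The function name has_id and the use of [[:space:]] (which also accepts tabs and newlines) are assumptions for illustration, not part of the answer above.

```shell
#!/usr/bin/env bash
# has_id (hypothetical helper). Usage: has_id TAG ID
# Exit status 0 if TAG contains whitespace followed by id="ID".
has_id() {
    local tag=$1 id=$2
    # Regex part, kept unquoted when used so it is interpreted as a regex.
    local prefix='[[:space:]]+id[[:space:]]*=[[:space:]]*'
    # The quoted part is matched literally, including the dot in the ID.
    [[ $tag =~ ${prefix}"\"${id}\"" ]]
}

if has_id 'path   id  =   "the.target" d="..." />' 'the.target'; then
    echo matched.
fi
```

Because the ID is matched literally, id="theXtarget" does not match, and an attribute such as data-id="the.target" is rejected because no whitespace comes directly before id.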
Philippe
  • This isn't the exact solution I used, and there's wisdom in steering me towards a proper XML parser for my larger goal, but this is probably the best response that engages directly with the question as written: "regex test in bash that starts with spaces". I also like how it encourages extraction of magic values. Thanks! – Tom Apr 25 '20 at 18:21

Thank you for giving us such a clear example that regex is not the way to solve this problem.

An SVG file is an XML file, and one tool for modifying such files is xmlstarlet.

Try this script I called modifycolor:

#!/bin/bash
# invoke as: modifycolor <svg.file> <target_id> <new_color>

xmlstarlet edit \
  --update "//path[@id = '$2']/@style" --value "fill:#$3" \
  "$1"

Assuming the svg file is test.svg, invoke it as:

./modifycolor test.svg the.target ff0000

You will be astonished by the result.

If you want to paste a piece of code inside your bash script, try this:

target="the.target"
newSvg=$(xmlstarlet edit \
  --update "//path[@id = '${target}']/@style" --value "fill:#${myColor}" \
  "${iconSvgPath}")
Pierre François

Thanks to folks for pointing out the mistakes in my bash-fu, I came up with this code, which does what I said I wanted. I will not be marking this as the accepted answer because, as folks have observed, regex is a bad way to operate on XML. Sharing this for posterity.

local newSvg="" # will hold newly-written SVG code
while IFS="<$IFS" read tag
do
  if [[ "${tag}" =~ \ +id\ *=\ *\"the\.target\" ]]; then
    tag=$(echo "${tag}" | sed -E 's/fill:[^;]+;/fill:'"${color}"';/')
  fi
  newSvg="${newSvg}${tag}"
done < ${iconSvgPath}

Fixes:

  1. escape the whitespace in the regex: =~ \ +id\ *=\ *
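  2. for sed, switch to double quotes around the variable in the pattern, because ${color} is not expanded inside single quotes
  3. also for sed, I added the -E extended regex flag so that + works as a quantifier (in basic regex a bare + is a literal plus sign; the negated set [^;] works either way)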

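
One caveat, as a hedged sketch: the sample tag in the question has style="fill:#000000" with no trailing semicolon, so the pattern fill:[^;]+; finds nothing to replace there. The variant below stops at either a semicolon or the closing quote. The function name set_fill is hypothetical, and the sketch assumes the color contains no /, & or backslash, since those are special in the replacement part of a sed s command.

```shell
#!/usr/bin/env bash
# set_fill (hypothetical helper). Usage: set_fill TAG COLOR
# Prints TAG with the first fill:... value replaced by COLOR.
set_fill() {
    local tag=$1 color=$2
    # [^;"]+ stops at a semicolon or at the closing quote of the style
    # attribute, so a trailing ';' is not required.
    # Assumption: COLOR has no '/', '&' or backslash.
    printf '%s\n' "$tag" | sed -E "s/fill:[^;\"]+/fill:${color}/"
}

set_fill 'path id="the.target" d="..." style="fill:#000000" />' '#ff0000'
# prints: path id="the.target" d="..." style="fill:#ff0000" />
```

The whole sed script is double-quoted here, so ${color} expands and the inner double quote has to be written as \".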
Re: XML, I'll be comparing the list of available CLI-friendly XML parsers to the set of tools commonly available on my users' machines.

Tom