
I tried modifying an attribute value in XML using sed, but it didn't work as expected. For example, I want to update the value of the data attribute in the doc element.

<doc_details> 
   <map>
     <doc name="doc_name" data="doc_value" />
   <map>
</doc_details>

The sed command below is not replacing the expected value. I don't have much of a clue what went wrong, as I'm new to bash scripting.

sed -i "s/^<doc name=\"doc_name\".*/<doc name=\"doc_name\" value=\"new_value\"><\/doc>/g" inputFile

Please note that I don't want to use xmlstarlet, as it might not be installed on the server.

Krupa
  • 6
    `sed` is the wrong tool for this job. Use `xmlstarlet` instead ([example](https://stackoverflow.com/questions/46390426/xmlstarlet-replace-xml-node-value)). – ceving Aug 22 '21 at 12:13
  • 1
    You say "update", but there is no `value` attribute here in the string. Do you mean "add or update if exists"? – Wiktor Stribiżew Aug 22 '21 at 12:23
  • 4
    [Don't Parse XML/HTML With Regex.](https://stackoverflow.com/a/1732454/3776858) I suggest to use an XML/HTML parser (xmlstarlet, xmllint ...). – Cyrus Aug 22 '21 at 12:32
  • Yes, understood, sed is not a recommended way. But it's a small use case, and I can't install another package (xmlstarlet) on the server. – Krupa Aug 22 '21 at 12:46
  • 2
    You have `^` in your search, so `sed` is expecting that to be the start of the line when it is not. – Jack Aug 22 '21 at 13:00
  • As Jack says above, remove the `^` from the front of the first regex. Also remove the `g` at the end of the `sed` command since you only have one string to substitute. – Pierre François Aug 22 '21 at 15:02

1 Answer


The way to start debugging something like that is just to remove bits of the regexp until it DOES match. That'll give you a big clue which specific part of your regexp is the problem and it'll probably be trivial to figure out how to fix it from there.
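As a minimal sketch of that trimming approach, assuming the sample line from the question (the variable names `line`, `anchored`, and `unanchored` are made up for illustration), you can count matching lines with `grep -c` before and after dropping one piece of the regexp:

```shell
# Sample line from the question, including its leading indentation.
line='     <doc name="doc_name" data="doc_value" />'

# grep -c prints the number of matching lines; "|| true" only hides
# grep's non-zero exit status when nothing matches.
anchored=$(printf '%s\n' "$line" | grep -c '^<doc name="doc_name".*' || true)
unanchored=$(printf '%s\n' "$line" | grep -c '<doc name="doc_name".*' || true)

echo "anchored: $anchored"      # prints "anchored: 0"
echo "unanchored: $unanchored"  # prints "unanchored: 1"
```

The count changes from 0 to 1 as soon as the `^` is removed, which points at the anchor (and the leading whitespace it doesn't allow for) as the part to fix.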

Always use single quotes around shell strings and scripts unless you NEED double quotes, see https://mywiki.wooledge.org/Quotes. In this case you don't need double quotes around the script and using them is forcing you to have to escape all the double quotes within the script. Using / as the regexp delimiter is also forcing you to escape all /s within the script - use a different char, e.g. : or #. Also, until you have a good grasp of the fundamentals, make sure to copy/paste every script you write into http://shellcheck.net and fix the issues it tells you about.
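To illustrate the quoting and delimiter advice, here is a small side-by-side sketch run against the sample line from the question. The variable names `line`, `noisy`, and `clean` are hypothetical, and both commands are meant to perform the same substitution:

```shell
line='     <doc name="doc_name" data="doc_value" />'

# Double quotes plus / as the delimiter: every " and / must be escaped.
noisy=$(printf '%s\n' "$line" |
  sed "s/<doc name=\"doc_name\".*\/>/<doc name=\"doc_name\" value=\"new_value\"\/>/")

# Single quotes plus : as the delimiter: nothing needs escaping.
clean=$(printf '%s\n' "$line" |
  sed 's:<doc name="doc_name".*/>:<doc name="doc_name" value="new_value"/>:')

# Both lines printed here should be identical.
printf '%s\n' "$noisy" "$clean"
```

The second form is shorter and much easier to check by eye, which is the main reason to prefer it.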

As for your main problem, as others mentioned, don't anchor a regexp if you don't want it anchored. Try this:

$ sed -E 's:(<doc name="doc_name").*(/>):\1 value="new_value"\2:' file
<doc_details>
   <map>
     <doc name="doc_name" value="new_value"/>
   <map>
</doc_details>

That requires a sed that has -E to enable EREs, but since you're already using GNU sed for -i, that'll work fine. With any sed you could do:

$ sed 's:\(<doc name="doc_name"\).*\(/>\):\1 value="new_value"\2:' file
<doc_details>
   <map>
     <doc name="doc_name" value="new_value"/>
   <map>
</doc_details>
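
If the goal is instead to change only the value of the existing data attribute, as the question's first paragraph suggests, a variation along these lines might work. This is a sketch under assumptions: the name attribute comes before data, both sit on the same line, and the value contains no double quotes. The variable names `line` and `updated` are made up for illustration.

```shell
line='     <doc name="doc_name" data="doc_value" />'

# Capture everything up to and including data=" and replace only the old value.
# [^>]* keeps the match inside the tag; [^"]* stops at the closing quote.
updated=$(printf '%s\n' "$line" |
  sed -E 's:(<doc name="doc_name"[^>]* data=")[^"]*:\1new_value:')

printf '%s\n' "$updated"   # prints:      <doc name="doc_name" data="new_value" />
```

Once the output looks right, the same script can be run with -i against the real file, as in the question.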
Ed Morton