Kinda new at scripting. I'm mostly a C# coder but...
I have an XML file that contains a lot of nodes with the same name, and all of their values end in ".txt":
IpScan.xml
<Parent Tags>
...
<FileNameWithPath> Some/Path/That/has/file.extension.txt</FileNameWithPath>
...
</Parent Tags>
...
<Parent Tags>
...
<FileNameWithPath> Some/NewPath/That/has/Newfile.DifferentExtension.txt</FileNameWithPath>
...
</Parent Tags>
I'm trying to write a (bash) script in Linux to remove all the ".txt" substrings within the file.
Testing things out, I have:
cat IpScan.xml | sed -ne '/<FileNameWithPath>/s#\s*<[^>]*>\s*##gp'
but this only displays the value of the tag in the terminal.
I've also tried something like this:
grep -oP "<FileNameWithPath>(.*)</FileNameWithPath>" IpScan.xml | cut -d ">" -f 2 | cut -d "<" -f 1
My thinking is to loop over each result from sed or grep and strip the end of the string, but then I don't know how to write the value back to the file. I'm also not sure whether grep or sed lets you iterate over matches at all.
My question is this: how can I open the file, remove the ".txt" string from each element's value, and save the file with the updated values?
I would prefer not to have to install another package as the Linux box I'm working on does not have network connectivity.
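For reference, here is a minimal sketch of the kind of thing I'm after, assuming GNU sed is available (it normally ships with Linux, so nothing extra to install). sed already processes the file line by line, so no explicit loop should be needed, and the -i option writes the result back to the file. The temporary directory and the sample contents below are made up for illustration, including the hypothetical <Other> element; only the sed line would be run against the real file.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for the real IpScan.xml.
workdir=$(mktemp -d)
file="$workdir/IpScan.xml"
printf '%s\n' \
  '<FileNameWithPath> Some/Path/That/has/file.extension.txt</FileNameWithPath>' \
  '<Other>keep.txt</Other>' \
  '<FileNameWithPath> Some/NewPath/That/has/Newfile.DifferentExtension.txt</FileNameWithPath>' \
  > "$file"

# On lines containing <FileNameWithPath>, drop a ".txt" that sits directly
# before the closing tag (optional whitespace allowed in between).
# -i.bak edits the file in place and keeps the original as IpScan.xml.bak.
sed -i.bak '/<FileNameWithPath>/s#\.txt\([[:space:]]*</FileNameWithPath>\)#\1#' "$file"

cat "$file"
```

Against the real file this would be just the sed command with IpScan.xml as the argument. It assumes each FileNameWithPath element sits on a single line, as in the sample above; it is a text substitution, not a real XML parse.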