I am trying to extract some specific information from an XML element and convert it to a JSON string. I have come up with a rather convoluted solution, but it almost works. I just need to remove the whitespace and line breaks between the tags. I have tried deleting everything that matches [[:space:]], but that makes even the text inside my values run together.
Sample data:
<config>
<derivedFrom>
<courseName>Family and Medical Leave</courseName>
<courseCode>FML</courseCode>
<courseAuthor>Company 1</courseAuthor>
<courseVersion>2.0.0</courseVersion>
<importLocale>en-US</importLocale>
</derivedFrom>
</config>
This is the sed code I am using:
sed -n '
/<derivedFrom>/ {
:a;
N;
/<\/derivedFrom>/!ba;
s/.*<derivedFrom>//;
s/<\/derivedFrom>//;
s/<\/[a-zA-Z]*>/",/g;
s/</"/g;
s/>/":"/g;
s/[[:space:]]//g;
s/,$//g;
p
}'
And finally, here is my current output:
"courseName":"FamilyandMedicalLeave","courseCode":"UBM2C","courseAuthor":"Alchemy","courseVersion":"2.0.021","importLocale":"en-US"
I know I need to replace [[:space:]] with something else, because I don't want the text inside my quotes to run together, but I am stuck. For example, Family and Medical Leave should keep its spaces. There is probably an easier way to do this with an XML-to-JSON script, but I need to do it without installing anything else on the servers.
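To make the goal concrete, here is a minimal sketch of one possible direction, not a tested production fix. It assumes GNU sed (where [[:space:]] and . also match the embedded newlines) and that whitespace worth deleting only ever sits between a closing > and the next <. The function name derived_from_to_json is hypothetical and only there for illustration. The idea is to swap the blanket [[:space:]] deletion for one that touches only the gaps between tags, and to run it before the tags are turned into quotes:

```shell
# Hypothetical helper: reads XML on stdin, prints the key/value pairs.
# Assumes GNU sed and one <derivedFrom> block per input.
derived_from_to_json() {
  sed -n '
  /<derivedFrom>/ {
    :a
    N
    /<\/derivedFrom>/!ba
    s/.*<derivedFrom>//
    s/<\/derivedFrom>//
    # Remove whitespace only where it sits between two tags.
    s/>[[:space:]]*</></g
    # Trim what is left at the very start and end of the block.
    s/^[[:space:]]*//
    s/[[:space:]]*$//
    s/<\/[a-zA-Z]*>/",/g
    s/</"/g
    s/>/":"/g
    s/,$//
    p
  }'
}
```

It would be called as derived_from_to_json < config.xml, where config.xml is a placeholder file name. With the sample data above, spaces inside values such as Family and Medical Leave should survive, because they are never directly between a > and a <.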