As I understand your request, tags of simple XML can be condensed with something like this:
#!/bin/bash
if [ $# -lt 1 ]; then echo "no file provided"; exit 1; fi
xml_input="$1"
if [ ! -r "${xml_input}" ]; then echo "file not readable"; exit 1; fi
xml_temp="$(mktemp "/tmp/$(basename "${xml_input}").XXXXXXXXX")" || exit 1
tr '\n' ' ' < "${xml_input}" > "${xml_temp}"
sed -i 's/\r/ /g' "${xml_temp}"
sed -i 's/ \{1,\}/ /g' "${xml_temp}"
sed -i 's/?> /?>/g' "${xml_temp}"
sed -i 's/?>/?>\n/g' "${xml_temp}"
sed -i 's/> </>\n</g' "${xml_temp}"
mv "${xml_temp}" "${xml_input}"
which will convert:
<?xml version="1.0" encoding="UTF-8"?><root>
<lineid>
Product
testing machine
</lineid>
<lineid>Product testing machine
</lineid>
</root>
to:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<lineid> Product testing machine </lineid>
<lineid>Product testing machine </lineid>
</root>
but a proper shell script to do that for all XML cases would be huge, or would just be a wrapper around an actual parser written in another language. There are a lot of good explanations of why regular expressions cannot handle XML in general:
https://stackoverflow.com/a/8577108/1919793
Can you provide some examples of why it is hard to parse XML and HTML with a regex?
Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms
and many text editors will do this a lot better for you:
How do I format XML in Notepad++?
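If you would rather not edit the file in place, the same steps from the script above can be sketched as a single pipeline that reads stdin and writes stdout. This is only a minimal sketch under the same "simple XML" assumption: the function name condense_xml is made up for illustration, and it assumes GNU sed, because \n in the replacement text is a GNU extension.

```shell
#!/bin/bash
# condense_xml (hypothetical name): read simple XML on stdin,
# write the condensed form to stdout. Assumes GNU sed.
condense_xml() {
    # 1. turn every CR and LF into a space
    # 2. squeeze runs of spaces down to a single space
    # 3. put the XML declaration on its own line,
    #    then break the line between adjacent tags
    tr '\r\n' '  ' |
    tr -s ' ' |
    sed -e 's/?> /?>/g' \
        -e 's/?>/?>\n/g' \
        -e 's/> </>\n</g'
}
```

It can then be used as `condense_xml < input.xml > output.xml` (two different file names, since the shell truncates the output file before the input is read). Like the original script, it leaves a trailing space after the last tag and does not end the output with a newline.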