I have an XML document from my Plex server that I want to process with Python. Everything works, except that the XML generated by the Plex server is not well-formed: whenever the summary contains words wrapped in " ", my script fails.

See the summary attribute of MediaContainer/Video below (summary="... "don't" ..."):

<MediaContainer>
  <Video summary="Stupid text I "don't" care about...">
      <Media id="Some id">
        <Part id="Id I want"/>
      </Media>
      <Director tag="Name"/>
      <Writer tag="Name"/>
  </Video>
</MediaContainer>

I know that the proper, well-formed version would be something like:

<Video summary="Stupid text I &quot;don't&quot; care about...">

I don't care about the text at all, so if there is a workaround, or simply a script that removes all quotes that appear inside quotes, I would be fine with that.

Another thing I realise now is that if the word is in fact don't, the single quotation mark ( ' ) in it might cause even further problems...

I would prefer a solution in Python (Python 3 if possible), but any help would be greatly appreciated.

Joe
  • This is really something that should be fixed *in* the Plex server, not after the faulty XML has already been generated. – chepner Mar 13 '19 at 14:47
  • When searching the web for solutions that is exactly what others have been told... If I could fix the issue at the source I would but I don't have the skill, knowledge or determination to do so.. – Joe Mar 13 '19 at 14:52
  • If there was an easy fix, XML wouldn't require quotes to be encoded in the first place. Consider something ``. Is that an element with two attributes, or should it be "fixed" to ``? – chepner Mar 13 '19 at 15:00

1 Answer

XML requires certain characters to be encoded as entities, including double quotes inside a double-quoted attribute value:

<Video summary="Stupid text I &quot;don't&quot; care about...">

Then the document is well-formed. The apostrophe in don't needs no escaping inside a double-quoted attribute value.
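If the server output cannot be fixed at the source, one possible workaround is to repair the text before parsing it. The following is only a heuristic sketch, not a general solution: it assumes every attribute is double-quoted, that a real closing quote is always followed by another `name="` pair or by the end of the tag, and that no attribute value contains `<`. The names `ATTR` and `escape_inner_quotes` are made up for this example. As noted in the comments, a summary that itself contains something looking like `" name="` would still be misread.

```python
import re
import xml.etree.ElementTree as ET

# Heuristic: a quote only closes an attribute value if it is followed by
# another name="..." pair or by the end of the tag. Any other quote inside
# the value is treated as stray and gets escaped.
ATTR = re.compile(
    r'''([\w:.-]+\s*=\s*)"                     # attribute name, equals sign, opening quote
        ([^<]*?)                               # attribute value, matched lazily
        "                                      # candidate closing quote
        (?=\s+[\w:.-]+\s*=\s*"|\s*/?>|\s*\?>)  # must precede next attribute or tag end
    ''',
    re.VERBOSE,
)


def escape_inner_quotes(raw_xml: str) -> str:
    """Replace stray double quotes inside attribute values with &quot;."""
    def fix(match):
        value = match.group(2).replace('"', '&quot;')
        return match.group(1) + '"' + value + '"'
    return ATTR.sub(fix, raw_xml)


# The malformed sample from the question.
broken = '''<MediaContainer>
  <Video summary="Stupid text I "don't" care about...">
      <Media id="Some id">
        <Part id="Id I want"/>
      </Media>
      <Director tag="Name"/>
      <Writer tag="Name"/>
  </Video>
</MediaContainer>'''

root = ET.fromstring(escape_inner_quotes(broken))
part_ids = [part.get("id") for part in root.iter("Part")]
print(part_ids)  # ['Id I want']
```

Because the repair happens on the raw string, the parser itself stays strict: if the heuristic misses a case, `ET.fromstring` still raises `ParseError` instead of silently returning wrong data.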

Marcin Orlowski