I am running into a problem with an XML string in my application.

I keep getting an "invalid Char value 11" error on my XML string.

When I opened the file in Notepad++, the offending character was shown as a VT block, i.e. a vertical tab (U+000B), which you can reproduce with the Alt+011 code.

I already searched here, but the only answer I found suggested running this on the string:

preg_replace ('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', ' ', $string);
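For reference, here is a minimal sketch of what that pattern does to a vertical tab. The sample string and the variable names are made up for illustration; only the pattern comes from the line above. Since the class is negated, it lists the characters that are kept, and U+000B is not among them, so it should be replaced by a space.

```php
<?php
// Pattern from the question: the negated class lists the characters that are KEPT
// (tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD); everything else is replaced.
$pattern = '/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u';

// Hypothetical sample input: chr(11) is the vertical tab (decimal 11, hex 0B).
$string = 'foo' . chr(11) . 'bar';

// preg_replace returns the new string; it does not modify $string in place.
$cleaned = preg_replace($pattern, ' ', $string);

var_dump($cleaned === 'foo bar'); // the vertical tab was replaced by a space
```

Note that the result has to be assigned back; calling preg_replace without using its return value leaves the original string untouched.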

But my code already does that, so I am at a loss as to what to do. I also added these codes, which I found while looking up the VT block in Notepad++, to the regex pattern above: \x{0B}\x{000B}\x{2B7F}\x{011}\x{0011}

After further investigation I found that the previous version of my app, which builds the XML file the same way, works perfectly fine.

Any help is appreciated.

Ferdi van der Woerd
  • Check [this answer](https://stackoverflow.com/a/49948110/2834978) about VT in Python, it might help you. – LMC May 17 '18 at 00:46
  • @LuisMuñoz I checked the answer, but in a previous version of the app this exact same code does work and changes the Unicode character into a space without being wrapped in CDATA blocks. – Ferdi van der Woerd May 17 '18 at 07:14
  • The VT has to be encoded, replaced, or removed; enclosing it in CDATA will not solve the issue. Perhaps you can file a regression bug. – LMC May 17 '18 at 13:05

1 Answer

I solved it. We used DOMDocument at first, and when that broke I added the new hex codes to the pattern. Once I removed those extra codes and switched to SimpleXML, it worked fine.
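A possible explanation for why the extra codes made things worse, sketched under the assumption that they were added inside the square brackets of the pattern: the class is negated, so every code listed there is kept rather than removed. Adding \x{000B} would therefore let the vertical tab through. The sample text, the `<root>` wrapper, and the variable names below are hypothetical.

```php
<?php
// Original pattern: U+000B is not in the list of kept characters, so it is replaced.
$original = '/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u';

// Assumption: the extra code was added inside the negated class,
// which turns the vertical tab into a character that is kept.
$extended = '/[^\x{0009}\x{000a}\x{000b}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u';

$text = 'before' . chr(11) . 'after'; // chr(11) is the vertical tab

$stripped  = preg_replace($original, ' ', $text);
$unchanged = preg_replace($extended, ' ', $text);

var_dump($stripped === 'before after'); // vertical tab replaced by a space
var_dump($unchanged === $text);         // vertical tab survived the "cleanup"

// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// The cleaned text should load; the text that still contains the vertical tab
// should be rejected, since XML 1.0 does not allow that character.
$good = simplexml_load_string('<root>' . $stripped . '</root>');
$bad  = simplexml_load_string('<root>' . $unchanged . '</root>');

var_dump($good !== false);
var_dump($bad === false);
```

Also note that char value 11 is decimal, which is hex 0B; \x{011} and \x{0011} refer to U+0011, a different control character.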

Ferdi van der Woerd