My question is about scala.xml and the conversion from the Node type to String (using either the buildString or toString methods).

When I do such a conversion, I get a list of warnings about the input on standard output, something like this:

line 2 column 2 - Warning: unknown attribute "myAttribute1"
line 3 column 9 - Warning: unknown attribute "myAttribute2"
line 6 column 9 - Warning: unknown attribute "myAttribute3"
line 7 column 13 - Warning: unknown attribute "myAttribute4"
line 7 column 13 - Warning: unknown attribute "myAttribute5"
line 13 column 25 - Warning: <th> attribute "width" has invalid value "3%"
line 15 column 36 - Warning: <th> attribute "width" has invalid value "5%"
line 17 column 36 - Warning: <th> attribute "width" has invalid value "3%"
line 19 column 36 - Warning: <th> attribute "width" has invalid value "15%"
line 21 column 36 - Warning: <th> attribute "width" has invalid value "3%"
line 23 column 36 - Warning: <th> attribute "width" has invalid value "3%"
line 25 column 36 - Warning: <th> attribute "width" has invalid value "3%"
line 27 column 36 - Warning: <th> attribute "width" has invalid value "3%"
line 29 column 36 - Warning: <th> attribute "width" has invalid value "3%"
line 35 column 22 - Warning: unknown attribute "data-col-count"
line 41 column 15 - Warning: inserting missing 'title' element
InputStream: Document content looks like HTML 4.01 Transitional
18 warnings, no errors were found!
The table summary attribute should be used to describe
the table structure. It is very helpful for people using
non-visual browsers. The scope and headers attributes for
table cells are useful for specifying which headers apply
to each table cell, enabling non-visual browsers to provide
a meaningful context for each cell.
For further advice on how to make your pages accessible
see "http://www.w3.org/WAI/GL". You may also want to try
"http://www.cast.org/bobby/" which is a free Web-based
service for checking URLs for accessibility.

So my question is whether there is a way to get rid of this output. The code generating it is just:

data.buildString(true).getBytes(StandardCharsets.UTF_8)

where data is a scala.xml.Node. Thanks.

EDIT

I tried converting from Node to String and back in the REPL, but I don't get any of those warnings there.

For parsing I use a custom object extending XMLLoader from scala.xml, with a SAXParser configured with setValidation(false). Then I just use loadString(input) to get my node.

Looking at the documentation of SAXParser, I ended up here: https://docs.oracle.com/javase/7/docs/api/org/xml/sax/helpers/DefaultHandler.html#warning(org.xml.sax.SAXParseException) where it says that no action is taken for warnings in the default case.
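For reference, a minimal sketch of what such a loader might look like. The object name `NonValidatingXml` and the sample input are hypothetical, since the original loader was not posted; note that in JAXP the validation switch lives on `SAXParserFactory` and is spelled `setValidating`. The sketch assumes the scala-xml module is on the classpath.

```scala
import java.nio.charset.StandardCharsets
import javax.xml.parsers.{SAXParser, SAXParserFactory}
import scala.xml.{Elem, Node}
import scala.xml.factory.XMLLoader

// Hypothetical name: stands in for the custom loader described above.
object NonValidatingXml extends XMLLoader[Elem] {
  // XMLLoader asks for a SAXParser through this method; we supply one
  // created by a factory with validation switched off.
  override def parser: SAXParser = {
    val factory = SAXParserFactory.newInstance()
    factory.setValidating(false)
    factory.setNamespaceAware(false)
    factory.newSAXParser()
  }
}

object LoaderDemo {
  def main(args: Array[String]): Unit = {
    // Illustrative input only, reusing attribute names from the warnings.
    val input = """<table myAttribute1="x"><th width="3%">a</th></table>"""
    val data: Node = NonValidatingXml.loadString(input)

    // The same conversion as in the question.
    val bytes = data.buildString(true).getBytes(StandardCharsets.UTF_8)
    println(new String(bytes, StandardCharsets.UTF_8))
  }
}
```

If running this prints only the serialized XML, the "line N column M - Warning" messages are likely coming from some other component than the loader or the buildString call.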

Benkio
  • Can you explain a bit more about the setup of your application and add some code (for example your custom object extending XMLLoader)? Do you load html/xml from an external source, and do you recognize the attributes that are mentioned? (like 'myAttribute1') – Simon Groenewolt Oct 14 '18 at 19:04
  • After a little investigation, I found that Simon's answer solved it. There was a tidy library in the application. – Benkio Oct 15 '18 at 11:54

1 Answer

It looks like your output comes from JTidy, as described in this question about removing warnings, or perhaps from some other 'tidy' library.

When I try your line:

data.buildString(true).getBytes(StandardCharsets.UTF_8)

it doesn't give me any of the output you get. Maybe another part of the program is causing this?
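If the output does turn out to come from JTidy, a sketch along these lines should quiet it. This assumes the `org.w3c.tidy.Tidy` class from the JTidy library is what the application uses; the wrapper object `QuietTidy` and its `clean` method are hypothetical names for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, PrintWriter, StringWriter}
import java.nio.charset.StandardCharsets
import org.w3c.tidy.Tidy

// Hypothetical wrapper: runs JTidy over an HTML string without console noise.
object QuietTidy {
  def clean(html: String): String = {
    val tidy = new Tidy()
    tidy.setQuiet(true)          // suppress the summary and accessibility advice
    tidy.setShowWarnings(false)  // suppress the "line N column M - Warning" lines
    // Send any remaining diagnostics to a throwaway writer instead of the console.
    tidy.setErrout(new PrintWriter(new StringWriter()))

    val in  = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8))
    val out = new ByteArrayOutputStream()
    tidy.parse(in, out)
    out.toString("UTF-8")
  }
}
```

The same three settings can be applied wherever the existing Tidy instance is created; a different 'tidy' library would need its own equivalent options.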

Simon Groenewolt
  • I already saw that post about JTidy, but since I don't use it I don't know where to set those properties. I edited the question with more information about the parsing, in case it helps. If you can point me to where/what to change in `scala.xml` or `SAXParser` to set those properties, I will give it a try. – Benkio Oct 14 '18 at 11:13
  • I discovered there was a tidy somewhere. – Benkio Oct 15 '18 at 11:53