Is there any way to convert a W3C DOM document into a Jsoup document without serializing it to a string first?

There is a converter for going from Jsoup to W3C DOM, but I cannot find one for the opposite direction.

My ultimate goal is to extract information from the W3C DOM document using a CSS selector. From my understanding, Jsoup is the strongest option for this, but it only seems able to ingest a string, which is somewhat expensive to generate.
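
For reference, the string round trip described above could look roughly like the sketch below. It uses only the JDK's `javax.xml.transform` API. The class name `DomToString`, the helper `serialize`, and the sample element names are hypothetical and chosen for illustration. The Jsoup hand-off is left as a comment so the block runs without Jsoup on the classpath.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomToString {
    // Hypothetical helper: serializes the whole W3C DOM document into one String.
    static String serialize(Document doc) throws TransformerException {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny sample document (illustrative data only).
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("items");
        Element item = doc.createElement("item");
        item.setAttribute("class", "a");
        item.setTextContent("first");
        root.appendChild(item);
        doc.appendChild(root);

        String xml = serialize(doc);
        System.out.println(xml);

        // With Jsoup on the classpath, the string could then be parsed and queried:
        //   org.jsoup.nodes.Document jdoc = Jsoup.parse(xml, "", Parser.xmlParser());
        //   Elements hits = jdoc.select("item.a");
    }
}
```

The cost in question is the full serialization pass plus a second full parse on the Jsoup side, both proportional to the size of the document.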

Ryan Tate
  • "_using a CSS selector_" - Just to check: Have you looked at (and perhaps already ruled out) XPath? – andrewJames Jul 31 '21 at 19:25
  • @andrewjames The context is enabling the use of arbitrary CSS selectors by an end user. XPath selectors are also supported. Of the two, CSS selectors are better known these days, but I think it would be wise to support both. If so, however, I need to allow switching back and forth (in this context), and I'd ideally like to make this efficient. – Ryan Tate Jul 31 '21 at 22:21
  • Understood. Given JSoup also accepts input streams, can [document-to-input-stream](https://stackoverflow.com/questions/865039/how-to-create-an-inputstream-from-a-document-or-node) help? (I have not tried it in this context.) – andrewJames Aug 01 '21 at 16:35
  • @andrewjames Thanks, that at least could be faster than serializing out the whole string at once, which is looking like the only other known solution... – Ryan Tate Aug 01 '21 at 17:44
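
The input-stream idea from the comments could be sketched as follows. This is an untested-in-context illustration, not a confirmed solution: the class name `DomToStream` and the helper `toInputStream` are hypothetical, and the Jsoup call is shown only as a comment so the block runs on the JDK alone. Note that this variant still buffers the full serialized document in memory as bytes; it only skips building an intermediate `String`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomToStream {
    // Hypothetical helper: serializes the DOM to UTF-8 bytes and exposes them as a stream.
    static InputStream toInputStream(Document doc) throws TransformerException {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(buffer));
        return new ByteArrayInputStream(buffer.toByteArray());
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny sample document (illustrative data only).
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("items");
        root.setTextContent("hello");
        doc.appendChild(root);

        try (InputStream in = toInputStream(doc)) {
            // With Jsoup on the classpath, the stream could be handed over directly:
            //   org.jsoup.nodes.Document jdoc =
            //       Jsoup.parse(in, "UTF-8", "", Parser.xmlParser());
            // Here the bytes are just printed to show what Jsoup would receive.
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```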

0 Answers