
I want to use a Java library to parse HTML.

I also want to be able to get the applied CSS style for a text element. Currently I only need to know the font style and size.

The style could be applied directly or from a CSS file. I realize it can also be applied using JavaScript, but I don't need that support for now.

Currently I am looking at Jsoup but I don't see such support. Are there any other libraries that I can use? It would be preferable if I don't have to use a full browser engine to get this functionality.

arahant

1 Answer


CSSParser at least lets you parse a CSS document and iterate over its style rules. The answer to the question "Looking for a CSS Parser in java" includes an example of its usage.

CSSParser is a Java implementation of W3C's SAC: The Simple API for CSS.

SAC 1.0 is a standard interface for CSS parsers and is supposed to work with CSS1, CSS2, CSS3 (currently under development) and other CSS-derived languages.

But this would force you to write the glue between Jsoup and CSSParser yourself, that is, to match the parsed rules against the HTML elements. The only project I am aware of that implements a getComputedStyle method in Java is the Lobo Java Browser. Unfortunately it has been discontinued since 2009, but I don't think that poses a problem.

At least they offer this method:

public org.lobobrowser.html.style.AbstractCSS2Properties getComputedStyle(java.lang.String pseudoElement)
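
If you do end up writing the glue between the HTML parser and the CSS parser yourself, the merging step could look roughly like the sketch below. It uses only the JDK and assumes you have already collected, in source order, the declaration blocks of the stylesheet rules that match the element (for example via the HTML parser's selector support) plus the element's `style` attribute. The class and method names (`FontStyleSketch`, `parseDeclarations`, `resolve`) are made up for illustration and are not part of Jsoup, CSSParser or Lobo. It deliberately ignores selector specificity, `!important`, inheritance from parent elements and the `font` shorthand, so it is not a replacement for a real getComputedStyle.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of any library mentioned above.
class FontStyleSketch {

    /** Parses a declaration list such as "font-size: 12px; font-style: italic". */
    static Map<String, String> parseDeclarations(String css) {
        Map<String, String> result = new LinkedHashMap<>();
        if (css == null) {
            return result;
        }
        // Simplification: assumes no ';' inside values (e.g. quoted strings or data URLs).
        for (String declaration : css.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon < 0) {
                continue; // skip empty or malformed chunks
            }
            String property = declaration.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = declaration.substring(colon + 1).trim();
            if (!property.isEmpty() && !value.isEmpty()) {
                result.put(property, value);
            }
        }
        return result;
    }

    /**
     * Merges the declarations of the matched stylesheet rules (in source order,
     * later rules overwrite earlier ones) and applies the inline style last,
     * so that inline declarations win.
     */
    static Map<String, String> resolve(List<String> matchedRuleBodies, String inlineStyle) {
        Map<String, String> applied = new LinkedHashMap<>();
        for (String body : matchedRuleBodies) {
            applied.putAll(parseDeclarations(body));
        }
        applied.putAll(parseDeclarations(inlineStyle));
        return applied;
    }

    public static void main(String[] args) {
        Map<String, String> applied = resolve(
                List.of("font-size: 12px; font-style: normal", "font-size: 14px"),
                "font-style: italic");
        System.out.println("font-size:  " + applied.get("font-size"));
        System.out.println("font-style: " + applied.get("font-style"));
    }
}
```

In this example the second stylesheet rule overrides the font size and the inline style overrides the font style. The hard part that the sketch leaves out is deciding which rules match an element and in which order of specificity, which is exactly the work a browser engine such as Lobo does for you.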

Konrad Reiche
  • Thanks. I think that should work. The example usage on that link seems to only load the CSS and parse, instead of applying it to an HTML document. Anyway, I'll have a closer look and update this page. – arahant Aug 06 '12 at 20:45
  • @arahant You are right, CSSParser cannot apply the styles to an HTML DOM. The only project that I am aware of that can is the Lobo Java Browser. They have implemented a Java equivalent of the [JavaScript method `getComputedStyle`](https://developer.mozilla.org/en-US/docs/DOM/window.getComputedStyle?redirectlocale=en-US&redirectslug=DOM%3Awindow.getComputedStyle). See my edited answer for additional links. – Konrad Reiche Aug 06 '12 at 21:08