4

There are several questions and tips about getting the DOM into SpiderMonkey, TraceMonkey, or JägerMonkey. But has anyone actually done this? Is there a working JS engine out there, embeddable in C, that includes the DOM? Or at least an easy-to-follow tutorial?

christian
  • I have compiled JägerMonkey and played a little bit with the shell. As far as I can see, there is some kind of XML parser: `js> var note='This is a noteThis is a second note'; js> var xdoc=new XML(note); js> print(xdoc.a[0]); This is a note js> print(xdoc.a[1]); This is a second note`. But you cannot call `getElementsByTagName` or similar methods. Let's see what this parser can do ... – christian Feb 09 '11 at 19:13
  • Simon Jester has answered your question. Please accept it. Without HTML rendering it won't do you much good unless you are writing a page generator/parser/validator or a server-side application. –  Mar 31 '12 at 03:23

2 Answers

2

I suggest looking into envjs or xmljs. These are full DOM implementations written directly in ECMAScript.
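
To illustrate what "a DOM written directly in ECMAScript" can mean, here is a minimal sketch. It is not the envjs or xmljs API: `MiniElement` and everything on it are hypothetical names made up for this example, and a real library also covers parsing, attributes, events, and much more. It only assumes an engine that runs plain ECMAScript, such as the SpiderMonkey shell.

```javascript
// Hypothetical, illustration-only node type; NOT the envjs or xmljs API.
function MiniElement(tagName) {
  this.tagName = tagName;
  this.childNodes = [];
  this.textContent = "";
}

// Append a child and return it, mirroring the shape of DOM appendChild.
MiniElement.prototype.appendChild = function (child) {
  this.childNodes.push(child);
  return child;
};

// Depth-first search over all descendants, in document order.
MiniElement.prototype.getElementsByTagName = function (name) {
  var found = [];
  for (var i = 0; i < this.childNodes.length; i++) {
    var child = this.childNodes[i];
    if (name === "*" || child.tagName === name) {
      found.push(child);
    }
    found = found.concat(child.getElementsByTagName(name));
  }
  return found;
};

// Build a small tree by hand; a real library would get it from an HTML parser.
var root = new MiniElement("notes");
var first = root.appendChild(new MiniElement("a"));
first.textContent = "This is a note";
var wrapper = root.appendChild(new MiniElement("div"));
var second = wrapper.appendChild(new MiniElement("a"));
second.textContent = "This is a second note";

// Finds both <a> nodes, including the one nested inside the <div>.
var links = root.getElementsByTagName("a");
```

After this runs, `links.length` is 2 and `links[1].textContent` is "This is a second note". Nothing here needs a renderer or a network stack, which is why libraries of this kind can be loaded into an embedded engine.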

Simon Jester
0

A DOM plus a JS engine is essentially a web browser. Look into embedding WebKit or something similar.

bdonlan
  • Hmm, as I understand it, a DOM plus a JS engine plus a rendering engine plus networking is a web browser. My requirements involve no rendering and no networking (HTTP/FTP/...). – christian Feb 09 '11 at 07:48
  • The DOM contains a lot of information that can only be obtained by rendering and obtaining network resources - the exact pixel location of various elements, the size of images, etc. As such, in the general case you'd need a full browser. You can always take a full browser and remove bits from it if you like, though. But, this might be missing the point - what do you need the HTML DOM for? – bdonlan Feb 09 '11 at 07:52
  • I want to build a console-based web scraper in C. All the networking and the XML cleanup for XSLT are already done and working. But I want to add JavaScript to parse, calculate, and extract data from HTML. – christian Feb 09 '11 at 08:17
  • @christian, unfortunately, I think most people have gone the route of binding a full web-browser toolkit and just disabling the bits they don't need. It's easier that way than trying to build an HTML parser (harder than it sounds) and a DOM, then binding them to JavaScript from scratch. You could also consider, e.g., taking the Chromium source code and removing all the display rendering parts. – bdonlan Feb 09 '11 at 10:20
  • 1
    @bdonlan, OK, assuming this is the way to go. This question has been asked several times. I cannot believe no one has done this already?! I mean building some kind of firefox_light or chrome_light with a ready-to-use API ... Maybe I have to change my keywords and start a new googling session... – christian Feb 09 '11 at 11:09
  • That's roughly what WebKit _is_... :) – bdonlan Feb 09 '11 at 11:14