Marionette is a protocol for remotely controlling Mozilla browsers. Chromium has the DevTools protocol for the same purpose, and it is documented here.

Marionette has some sketchy documentation here, but is there a proper list of available commands and parameters? Can it be extracted from Mozilla sources somehow? (Like Chromium has PDL.)

By commands I mean the likes of [0,1,"WebDriver:Navigate",{"url":"http://awe.lv"}], "WebDriver:GetTitle", [0,2,"WebDriver:ExecuteAsyncScript",{"script":"alert('Hello!')"}], "WebDriver:GetWindowHandle", "WebDriver:GetWindowRect", "WebDriver:TakeScreenshot" and "WebDriver:GetPageSource". In particular, I want to observe the network traffic, as with the DevTools method Network.enable.

Are there any prefixes other than "WebDriver:" available? Can we use the Web APIs via Marionette?

MKaama

2 Answers

Played around a bit with geckodriver and Wireshark:

Start Firefox with --marionette; personally I like to add --headless --no-remote --profile $(mktemp -d), but this is up to you! Firefox then listens on port 2828 (there is a way to change this, but I'm not 100% sure how).

The Marionette protocol is as follows:

  • Each message is a length-prefixed JSON message without a trailing newline. For instance, when you connect with telnet localhost 2828, you're greeted by 50:{"applicationType":"gecko","marionetteProtocol":3}, the 50 meaning that the JSON is 50 bytes long.
  • Each message (except for the first one) is a JSON array of 4 items:
    • [0, messageId, command, body] for a request, where messageId is an int, command a string and body an object. Example (with length prefix): 31:[0,1,"WebDriver:NewSession",{}]
    • [1, messageId, error, result] for a reply. Here messageId is the id of the request being answered, and either error or result is null (depending on whether there was an error). E.g. 697:[1,1,null,{"sessionId":"d9dbe...", ..., "proxy":{}}}]
  • A full list of all commands can be found in the Marionette source code, and it seems to me that all functions there are pretty well documented. For one thing, it seems that they expose all WebDriver functions under WebDriver:*.
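
To make the framing concrete, here is a minimal shell sketch of the rules listed above. The function names marionette_frame, marionette_request and marionette_payload are made up for this illustration; they are not part of Firefox or geckodriver, and the sketch only builds and checks the strings, it does not talk to a browser.

```shell
# Frame a JSON payload as "<byte length>:<payload>", with no trailing newline.
marionette_frame() (
  payload=$1
  # wc -c counts bytes; the arithmetic expansion strips the padding
  # that some wc implementations add around the number.
  length=$(( $(printf '%s' "$payload" | wc -c) ))
  printf '%d:%s' "$length" "$payload"
)

# Build a framed request [0, messageId, command, body]; body defaults to {}.
marionette_request() (
  id=$1
  command=$2
  body=${3:-}
  [ -n "$body" ] || body='{}'
  marionette_frame "[0,${id},\"${command}\",${body}]"
)

# Strip the length prefix from one frame, failing if the declared
# length does not match the actual payload size.
marionette_payload() (
  frame=$1
  declared=${frame%%:*}
  payload=${frame#*:}
  actual=$(( $(printf '%s' "$payload" | wc -c) ))
  [ "$declared" -eq "$actual" ] || return 1
  printf '%s' "$payload"
)

marionette_request 1 "WebDriver:NewSession"
echo
# prints 31:[0,1,"WebDriver:NewSession",{}]
```

A request with a body would be built the same way, e.g. marionette_request 2 "WebDriver:Navigate" '{"url":"http://awe.lv"}', and the resulting string is what gets written to the Marionette socket.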

Update: it seems that https://bugzilla.mozilla.org/show_bug.cgi?id=1421766 also deals with the problem of finding or setting the right Marionette port. The way I now do it:

TEMPD="$(mktemp -d)"
# Port 0 asks Firefox to pick a free port by itself.
echo 'user_pref("marionette.port", 0);' > "${TEMPD}/prefs.js"
/Applications/Firefox.app/Contents/MacOS/firefox-bin --marionette --headless --no-remote --profile "${TEMPD}" &
PID=$!
MARIONETTE_PORT=""
# Poll until the Firefox process has a listening TCP socket.
while [ -z "$MARIONETTE_PORT" ]; do
  sleep 1
  MARIONETTE_PORT=$(lsof -a -p "$PID" -s TCP:LISTEN -i4 -nP | grep -oE ':[0-9]+ \(LISTEN\)' | grep -oE '[0-9]+')
done
echo "Marionette started on port $MARIONETTE_PORT"
# fg needs an interactive shell; inside a script, use `wait "$PID"` instead.
fg

(Giving port 0 makes Firefox choose a random free port. The command works on macOS and will probably need some tweaking on Linux: I think the arguments to lsof are slightly different, and GNU grep supports lookbehind/lookahead with -P, so the double grep could be replaced by a single one.)
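
As a rough sketch of collapsing the double grep into one step, the snippet below uses sed -E instead of a lookahead, since that should behave the same on macOS and Linux. The helper name listen_port and the sample line are hypothetical; the sample is only shaped like lsof output and was not captured from a real run.

```shell
# Print the port from lsof-style lines that end in ":<port> (LISTEN)".
# Only the first matching line is used.
listen_port() {
  sed -nE 's/.*:([0-9]+) \(LISTEN\).*/\1/p' | head -n 1
}

# Hypothetical sample line, shaped like `lsof -nP -i4 -s TCP:LISTEN` output.
sample='firefox 4242 user 45u IPv4 0x0 0t0 TCP 127.0.0.1:54321 (LISTEN)'
printf '%s\n' "$sample" | listen_port
# prints 54321
```

In the loop above, this would replace the two grep calls, i.e. the lsof output would be piped into listen_port.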


UPDATE 2

Since Firefox will write the Marionette port it uses to "${TEMPD}"/prefs.js, one does not even need to do "fancy" things with lsof; rather just check that file:

TEMPD="$(mktemp -d)"
echo 'user_pref("marionette.port", 0);' > "${TEMPD}/prefs.js"
/Applications/Firefox.app/Contents/MacOS/firefox-bin --marionette --headless --no-remote --profile "${TEMPD}" &
MARIONETTE_PORT=""
# Poll prefs.js until Firefox has written the port it chose (anything but 0).
while [ -z "$MARIONETTE_PORT" ]; do
  sleep 1
  MARIONETTE_PORT=$(grep 'user_pref("marionette.port"' "${TEMPD}/prefs.js" | grep -oE '[1-9][0-9]*')
done
echo "Marionette started on port $MARIONETTE_PORT"
# fg needs an interactive shell; inside a script, use `wait` instead.
fg
Claude

OK, I found a list of commands at Geckodriver, pointed to by source. But what a meager set of commands, and the documentation is also not complete! I hoped one could use all the Web APIs via Marionette.

MKaama
  • WebDriver is meant to allow browser automation by providing APIs that let one mimic human behavior, like navigating to a page, moving the mouse, entering text and clicking on elements, plus a way to check the page's reactions to those actions. That's why only selector querying from the DOM API is implemented; the rest of the APIs can be accessed by evaluating arbitrary JS on the page, which geckodriver supports. – hldev Oct 02 '21 at 00:54
  • WebDriver's intended use is web page test automation, not an alternative way to navigate the web. – hldev Oct 02 '21 at 01:01