From a Linux bash script, I want to read the structured data stored by a particular Firefox add-on called FB-Purity.

I have found a folder called .mozilla/firefox/b8eab5j0.default/storage/default/moz-extension+++37a9788c-671d-4cae-ba5c-fbdb8788499a^userContextId=4294967295/ that contains a .metadata file, which in turn contains the string moz-extension://37a9788c-671d-4cae-ba5c-fbdb8788499a. Opening that URL in Firefox shows the add-on's details, so I am pretty sure that this folder belongs to the add-on.

That folder contains an idb directory, which sounds like the Indexed Database API, a W3C standard that Firefox has apparently used since last year to store add-on data.

The idb folder only contains an empty folder and an SQLite file.

Unfortunately, the SQLite file does not contain much structured application data, but its object_data table contains a 95KB blob which probably holds the real structured data:

INSERT INTO `object_data` VALUES (1,'0pmegsjfoetupsf.742612367',NULL,NULL,
X'e08b0d0403000101c0f1ffe5a201000400ffff7b00220032003100380035003000320022003a002
2005300610074006f0072007500200055007205105861006e00690022002c00220036003100350036
[... 95KB ...]
00780022007d00000000000000');

Question: Any clue what this blob's format is? How can I extract it (using the command line or any library or Linux tool) to JSON or any other readable format?

Nicolas Raoul
  • Fairly confident that this was not how indexedDB was intended to be used and that this is a vendor implementation detail that is not specified in any spec. – Josh Feb 28 '19 at 15:09
  • @Josh: The question was not about whether it *should* be done this way. If you're trying to modify extension data while the browser is not running, accessing the DB like this is the **only** way; just because it ain't pretty doesn't mean it cannot be done or doesn't have any legitimate use-cases. – ntninja Jan 26 '20 at 07:59

1 Answer

Well, I had a fun day today figuring this out and ended up creating a Python tool that can read the data from these indexedDB database files and print them (and maybe more at some point): moz-idb-edit

To answer the technical parts of the question first:

  • Both the key (name) and the data (value) use a Mozilla proprietary format whose only documentation appears to be its source code at this time.
  • The keys use a special-purpose encoding whose rough description is available in mozilla-central/dom/indexedDB/Key.cpp – the file also contains the only known implementation. Its unique selling point appears to be that it is relatively compact while being compatible with all the possible index types websites may throw at you, as well as being in the correct binary sorting order by default.
  • The values are stored using SpiderMonkey's internal StructuredClone representation, which is also used when moving values between processes in the browser. Again, there are no docs to speak of, but one can read the source code, which fortunately is quite easy to understand. Before being added to the database, however, the generated binary is compressed on the fly using Google's Snappy compression, which “does not aim for maximum compression [but instead …] aims for very high speeds and reasonable compression” – probably not a bad idea considering that we're dealing with wasteful web content here.
  • To locate the correct indexedDB file for an extension's local storage data, one needs to resolve the extension's static ID to a so-called “internal UUID” whose value is different in every browser profile instance (to make tracking based on installed add-ons a lot harder). The mapping table for this is stored as a pref (“extensions.webextensions.uuids”) in prefs.js. The IDB path then is ${MOZ_PROFILE}/storage/default/moz-extension+++${EXT_UUID}^userContextId=4294967295/idb/3647222921wleabcEoxlt-eengsairo.sqlite
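
To make the key and value formats described above a little more concrete, here is a rough Python sketch, not code taken from moz-idb-edit. It rests on several assumptions that you should check against the Firefox source: that string keys start with the marker byte 0x30 and store each ASCII character shifted up by one, that the blob uses Snappy's raw (unframed) format, and that the decompressed StructuredClone data is a sequence of little-endian 64-bit words whose upper half is a tag. The tag constants are read off the blob prefix quoted in the question. All function names are made up for illustration, and only a single top-level string value is handled.

```python
import struct

STRING_KEY_MARKER = 0x30   # assumed type byte for string keys (see Key.cpp)
SCTAG_HEADER = 0xFFF10000  # assumed header tag, matches the blob in the question
SCTAG_STRING = 0xFFFF0004  # assumed string tag, matches the blob in the question


def decode_idb_string_key(encoded: bytes) -> str:
    """Decode a string key; only the one-byte (ASCII) case is handled."""
    if not encoded or encoded[0] != STRING_KEY_MARKER:
        raise ValueError("not a string key")
    chars = []
    for byte in encoded[1:]:
        if byte == 0:  # terminator byte
            break
        if byte >= 0x80:
            raise NotImplementedError("multi-byte characters are not handled")
        chars.append(chr(byte - 1))  # each ASCII character is stored as c + 1
    return "".join(chars)


def snappy_raw_decompress(data: bytes) -> bytes:
    """Decompress a raw Snappy stream (no framing) in pure Python."""
    pos = expected = shift = 0
    while True:  # preamble: uncompressed length as a varint
        byte = data[pos]
        pos += 1
        expected |= (byte & 0x7F) << shift
        if byte < 0x80:
            break
        shift += 7
    out = bytearray()
    while pos < len(data):
        tag = data[pos]
        pos += 1
        kind = tag & 3
        if kind == 0:  # literal run
            length = tag >> 2
            if length >= 60:  # length stored in the next 1 to 4 bytes
                extra = length - 59
                length = int.from_bytes(data[pos:pos + extra], "little")
                pos += extra
            length += 1
            out += data[pos:pos + length]
            pos += length
            continue
        if kind == 1:  # copy with an 11-bit offset
            length = 4 + ((tag >> 2) & 7)
            offset = ((tag >> 5) << 8) | data[pos]
            pos += 1
        else:  # copy with a 2-byte or 4-byte offset
            width = 2 if kind == 2 else 4
            length = (tag >> 2) + 1
            offset = int.from_bytes(data[pos:pos + width], "little")
            pos += width
        if offset == 0 or offset > len(out):
            raise ValueError("invalid copy offset")
        for _ in range(length):  # byte by byte, since copies may overlap
            out.append(out[-offset])
    if len(out) != expected:
        raise ValueError("length mismatch after decompression")
    return bytes(out)


def read_top_level_string(clone: bytes) -> str:
    """Read a StructuredClone buffer whose whole value is one string."""
    pos = 0
    data, tag = struct.unpack_from("<II", clone, pos)
    if tag == SCTAG_HEADER:  # skip the header word if present
        pos += 8
        data, tag = struct.unpack_from("<II", clone, pos)
    if tag != SCTAG_STRING:
        raise NotImplementedError("only a top-level string is handled")
    pos += 8
    length = data & 0x7FFFFFFF
    if data & 0x80000000:  # high bit set: one byte per character
        return clone[pos:pos + length].decode("latin-1")
    return clone[pos:pos + 2 * length].decode("utf-16-le")
```

Applied to the row in the question, `decode_idb_string_key(b"0pmegsjfoetupsf.742612367")` yields `oldfriendstore-631501256`, and passing the complete blob through `snappy_raw_decompress` and then `read_top_level_string` should give back the JSON text that starts with `{"218502":`. Anything other than a plain string value (objects, arrays, numbers, dates) needs the full tag handling from the SpiderMonkey source or from moz-idb-edit.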

For all practical intents and purposes you can read the value of a single storage key of any extension by downloading the project mentioned above. Basic usage is:

$ ./moz-idb-edit --extension "${EXT_ID}" --profile "${MOZ_PROFILE}" "${STORAGE_KEY}"

Where ${EXT_ID} is the extension's static ID (check its manifest.json file or look in about:support#extensions-tbody if you're unsure), ${MOZ_PROFILE} is the Firefox profile directory (also in about:support) and ${STORAGE_KEY} is the name of the key you'd like to query (unfortunately querying all keys is not supported yet).
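
If you would rather locate the database yourself, the mapping from ${EXT_ID} and ${MOZ_PROFILE} to the SQLite file can be sketched in Python roughly as follows. This is a minimal sketch and not the tool's actual code: it assumes the pref is stored in prefs.js as a single `user_pref(...)` call whose value is a JSON object wrapped in a quoted string, and the function names are invented for this example. The file name is the one quoted in the path above.

```python
import json
import re
from pathlib import Path

UUID_PREF = "extensions.webextensions.uuids"
# File name as quoted in the answer; assumed to be the same in every profile.
STORAGE_DB_NAME = "3647222921wleabcEoxlt-eengsairo.sqlite"


def extension_uuid(prefs_text: str, ext_id: str) -> str:
    """Look up the per-profile internal UUID for a static extension ID."""
    pattern = re.compile(
        r'user_pref\("' + re.escape(UUID_PREF) + r'",\s*("(?:[^"\\]|\\.)*")\s*\);'
    )
    match = pattern.search(prefs_text)
    if match is None:
        raise LookupError(f"pref {UUID_PREF} not found")
    # First load undoes the string escaping, second load parses the JSON object.
    mapping = json.loads(json.loads(match.group(1)))
    return mapping[ext_id]


def storage_db_path(profile_dir: Path, ext_uuid: str) -> Path:
    """Build the path to the extension's local-storage database file."""
    origin = f"moz-extension+++{ext_uuid}^userContextId=4294967295"
    return profile_dir / "storage" / "default" / origin / "idb" / STORAGE_DB_NAME
```

A typical call would read the pref file with `(profile_dir / "prefs.js").read_text(encoding="utf-8")`, pass the text to `extension_uuid`, and hand the result to `storage_db_path`. Since Firefox may hold the database open, it is probably safest to work on a copy of the file or with the browser closed.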

Writing data is not currently supported either.

I'll update this answer as I implement more features (or drop me an issue on the project page!).

ntninja
  • god I love you. do you know how to parse values that have `file_ids` values and corresponding files? – user4893106 Mar 22 '21 at 23:30
  • @user4893106: I haven't investigated this, simply because I had no real-world example of them being used. My guess would be that they are blob:-files and the contents are either 1:1 the blob's contents or a snappy-compressed version thereof. If you'd like help with this please open an issue on the linked repository; this isn't a great place for feature requests or other discussions of that kind. – ntninja Mar 30 '21 at 21:29
  • how can you see that the format is proprietary and at the same time say that it's in the (opensource) code? if it can be parsed by opensource code, then it can't be proprietary – user1623521 Aug 26 '21 at 05:03
  • @user1623521: Proprietary as in *not standardized*, *unspecified* or *for internal use (by Firefox/Mozilla) only*. It's not a contradiction to have an open-source **implementation** of a proprietary **format**: Would you say the format used by Adobe Photoshop (.psd) is *open* because GIMP has a semi-complete read/write implementation of it? (At least the parts covered by GIMP have an open-source implementation now obviously…) – ntninja Aug 31 '21 at 16:02
  • IMO you can't compare PSD with this because the first implementation of the latter is proprietary and GIMP's implementation only exists to try to emulate the proprietary one; anyway, I love that you created that moz-idb-edit repo, and in a completely unrelated subject, are you looking for a remote job by any chance? I got some positions you might be interested in :) – user1623521 Sep 02 '21 at 02:47
  • Thank you! Expect a pull request which extends this to the ability to read sites’ data as well, not just extensions’. And it’ll autodiscover the default profile. – mirabilos Aug 26 '23 at 16:31