I am writing a class that reads a Chrome bookmarks file (see the example below):

{
   "checksum": "452bebcad611a3faffb2c009099139e5",
   "roots": {
      "bookmark_bar": {
         "children": [ {
            "date_added": "13028719861473329",
            "id": "4",
            "name": "first bookmark",
            "type": "url",
            "url": "chrome://newtab/"
         }, {
            "children": [ {
               "children": [ {
                  "date_added": "13026904508000000",
                  "id": "7",
                  "name": "Getting Started",
                  "type": "url",
                  "url": "https://www.mozilla.org/en-GB/firefox/central/"
               } ],
               "date_added": "13028740260032410",
               "date_modified": "0",
               "id": "6",
               "name": "Bookmarks Toolbar",
               "type": "folder"
            }, {
               "children": [ {
                  "date_added": "13026904508000000",
                  "id": "9",
                  "name": "Help and Tutorials",
                  "type": "url",
                  "url": "https://www.mozilla.org/en-GB/firefox/help/"
               }, {
                  "date_added": "13026904508000000",
                  "id": "10",
                  "name": "Customise Firefox",
                  "type": "url",
                  "url": "https://www.mozilla.org/en-GB/firefox/customize/"
               }, {
                  "date_added": "13026904508000000",
                  "id": "11",
                  "name": "Get Involved",
                  "type": "url",
                  "url": "https://www.mozilla.org/en-GB/contribute/"
               }, {
                  "date_added": "13026904508000000",
                  "id": "12",
                  "name": "About Us",
                  "type": "url",
                  "url": "https://www.mozilla.org/en-GB/about/"
               } ],
               "date_added": "13028740260032410",
               "date_modified": "0",
               "id": "8",
               "name": "Mozilla Firefox",
               "type": "folder"
            }, {
               "date_added": "13026904551000000",
               "id": "13",
               "name": "Welcome to Firefox",
               "type": "url",
               "url": "http://www.mozilla.org/en-US/firefox/24.0/firstrun/"
            } ],
            "date_added": "13028740260004410",
            "date_modified": "0",
            "id": "5",
            "name": "Imported From Firefox",
            "type": "folder"
         } ],
         "date_added": "13028719626916276",
         "date_modified": "13028719861473329",
         "id": "1",
         "name": "Bookmarks bar",
         "type": "folder"
      },
      "other": {
         "children": [  ],
         "date_added": "13028719626916276",
         "date_modified": "0",
         "id": "2",
         "name": "Other bookmarks",
         "type": "folder"
      },
      "synced": {
         "children": [  ],
         "date_added": "13028719626916276",
         "date_modified": "0",
         "id": "3",
         "name": "Mobile bookmarks",
         "type": "folder"
      }
   },
   "version": 1
}

I convert the JSON to a nested dictionary, then extract the bookmark URLs under each relevant bookmark folder in my write_data method.

As there can be any number of bookmark folders and/or bookmarks nested within each folder, I want to call the write_data method within itself so that it keeps on extracting child data every time it finds a nested folder. I just can't work out how to pass the relevant child dictionaries into the same method.

I've tried building up the dictionary path as a string. I think I need to pass in a tuple or list of keys to loop through and build up the path dynamically, but I can't get it working and my poor head is wrecked!
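
For reference, the "list of keys" idea described above can be sketched roughly as follows. This is only an illustration written for Python 3: the helper name get_by_path and the tiny sample dictionary are made up for the example and are not part of the original class.

```python
from functools import reduce
from operator import getitem


def get_by_path(data, path):
    """Follow a sequence of keys (or list indexes) down into nested data."""
    # reduce applies getitem step by step: data[path[0]][path[1]]...
    return reduce(getitem, path, data)


# Small stand-in for the bookmarks structure (hypothetical sample data).
bookmarks = {
    "roots": {
        "bookmark_bar": {
            "name": "Bookmarks bar",
            "children": [{"name": "first bookmark", "type": "url"}],
        }
    }
}

path = ["roots", "bookmark_bar", "children", 0, "name"]
print(get_by_path(bookmarks, path))
```

A path like this works for reaching one known location, but it still has to be built by hand for every folder, which is why passing the child dictionary itself into the method tends to be simpler for a tree of unknown depth.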

There is a similar question, but its answer uses yield, which confused me completely, and it was not a fully working solution anyway. Please help!

import json
import sys
import codecs


class FileExtractor(object):
    def __init__(self, input_file):
        self.infile = codecs.open(input_file, encoding='utf-8')
        self.bookmark_data = json.load(self.infile)

    def write_data(self, my_key):

        for key, value in self.bookmark_data[my_key].iteritems():
            if type(self.bookmark_data[my_key][key]) is dict:
                print self.bookmark_data[my_key][key]['name']
                for subkey, subvalue in self.bookmark_data[my_key][key].iteritems():
                    if subkey == "children" and len(self.bookmark_data[my_key][key][subkey]) <> 0:
                        print "this is a child. I can't figure out how to use write_data with this"
                        #self.write_data('[my_key][key][subkey]')               
                    else:
                        print subkey, ": ",  self.bookmark_data[my_key][key][subkey]

if __name__ == "__main__":
    stuff = FileExtractor(sys.argv[1])

    stuff.write_data('roots')
CiCi
  • I didn't understand what exactly you're trying to achieve. Can you post your desired output? – georg Nov 13 '13 at 10:57
  • thanks thg435. what I am trying to get is a list of bookmarks and their attributes, name,date added etc under the heading of each folder/subfolder. I also want to list the folder attributes, name, date added etc – CiCi Nov 13 '13 at 11:18

1 Answer

Not sure if this is what you are getting at, but if you pass the dictionary object itself to write_data rather than the key, you can recurse as far down as the dictionary goes. WARNING: I haven't tested this; it's just to give you an idea.

def write_data(self, my_dict=None):
    my_dict = my_dict or self.bookmark_data['roots']
    for key, value in my_dict.items():
        if isinstance(value, dict):
            print(value['name'])
            for subkey, subvalue in value.items():
                if subkey == "children" and len(subvalue) != 0:
                    for child in subvalue:
                        self.write_data(child)
                else:
                    print("%s: %s" % (subkey, subvalue))

A better version, which recurses into both dictionaries and lists:

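def write_data(self, my_dict=None):
    # Start at the roots on the first call. Test for None explicitly so that
    # an empty dict further down does not restart the walk from the top.
    if my_dict is None:
        my_dict = self.bookmark_data['roots']
    if 'name' in my_dict:
        print(my_dict['name'])

    for key, value in my_dict.items():
        if isinstance(value, dict):
            # Nested folder or bookmark: recurse into it.
            self.write_data(value)
        elif isinstance(value, list):
            # A "children" list: recurse into each child.
            for item in value:
                self.write_data(item)
        else:
            print("%s: %s" % (key, value))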
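
If the goal is a listing grouped under each folder, one possible variation is to print a node's own attributes first and only then descend into its children, indenting by depth. The sketch below is written for Python 3 and is only an illustration: the function name walk, the output layout, and the cut-down sample data are assumptions made for the example, not part of the original class.

```python
def walk(node, depth=0):
    """Return indented lines for one folder/bookmark node and its descendants."""
    indent = "  " * depth
    lines = [indent + node.get("name", "")]
    # Print the node's own attributes before descending, so a folder's
    # attributes are not mixed in with those of its bookmarks.
    for key in sorted(node):
        if key in ("name", "children"):
            continue
        lines.append("%s  - %s: %s" % (indent, key, node[key]))
    # Only folders have "children"; bookmarks fall back to an empty list.
    for child in node.get("children", []):
        lines.extend(walk(child, depth + 1))
    return lines


# Cut-down sample shaped like one entry under "roots" (hypothetical data).
sample = {
    "name": "Bookmarks bar",
    "type": "folder",
    "id": "1",
    "children": [
        {"name": "first bookmark", "type": "url", "id": "4",
         "url": "chrome://newtab/"},
        {"name": "Imported From Firefox", "type": "folder", "id": "5",
         "children": [
             {"name": "Welcome to Firefox", "type": "url", "id": "13",
              "url": "http://www.mozilla.org/en-US/firefox/24.0/firstrun/"},
         ]},
    ],
}

lines = walk(sample)
print("\n".join(lines))
```

With the real file, something like calling walk on each dictionary value of bookmark_data['roots'] would cover the bookmark bar, other, and synced folders in turn.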
gonkan
  • Thanks so much dgalvin. I did try this but I get the following when I try to pass in the whole object: Traceback (most recent call last): File "./file_extractor.py", line 48, in <module> stuff.write_data(self.bookmark_data['roots']) NameError: name 'self' is not defined – CiCi Nov 13 '13 at 11:16
  • Maybe don't pass in self.bookmark_data['roots'] initially; just use that if my_dict is None. I'll edit the answer. You're basically trying to use self from outside the class; use stuff.bookmark_data instead so that you are referencing the instance. – gonkan Nov 13 '13 at 11:20
  • That runs fine dgalvin thanks so much for trying but for some reason it doesn't actually print out the attributes for the bookmarks under bookmark_bar – CiCi Nov 13 '13 at 11:32
  • updated with a better method of doing this, this should work for what you want – gonkan Nov 13 '13 at 11:57
  • ur a star dgalvin thank you. It works great except it jumbles up the attributes of folders and their bookmarks a bit (a reflection of how the data is stored rather than your code) so I may need to rethink what I do with this data! This has put me miles ahead though :) – CiCi Nov 13 '13 at 12:21
  • no problem, dicts aren't ordered remember, you'll get the keys in the order they come out. If you want them ordered maybe add them to a list and sort that. – gonkan Nov 13 '13 at 12:23
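
As a rough illustration of the ordering point in the last comment: json.load and json.loads accept an object_pairs_hook argument, and passing collections.OrderedDict keeps the keys in the order they appear in the file. A preferred display order can then be imposed by sorting the keys. The PREFERRED list and the order_keys helper below are hypothetical names chosen for this sketch (written for Python 3, where plain dicts also keep insertion order from 3.7 onwards).

```python
import json
from collections import OrderedDict

raw = ('{"date_added": "13028719861473329", "id": "4", '
       '"name": "first bookmark", "type": "url", "url": "chrome://newtab/"}')

# Keep the keys in the order they appear in the JSON text.
bookmark = json.loads(raw, object_pairs_hook=OrderedDict)
file_order = list(bookmark)

# Hypothetical display order; keys not listed here are placed last.
PREFERRED = ["name", "type", "url", "date_added", "id"]


def order_keys(keys):
    """Sort keys by their position in PREFERRED, unknown keys at the end."""
    return sorted(
        keys,
        key=lambda k: PREFERRED.index(k) if k in PREFERRED else len(PREFERRED),
    )


display_order = order_keys(bookmark)
for key in display_order:
    print("%s: %s" % (key, bookmark[key]))
```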