I have a web scraper that grabs info and saves it to a database. I use the following code to save data.

try: 
    base['vevo']['url']
except:
    base['vevo']['url'] = "NotGiven"
try: 
    base['vevo']['viewsLastWeek']['data']['time']
except:
    base['vevo']['viewsLastWeek']['data']['time'] = '2199-01-01'

Normally this works, but occasionally the data stream returns nothing at all for base['vevo']. The assignment in the except branch then fails with KeyError: 'vevo'.

I've been trawling through other Stack Overflow questions, but I haven't found anything about adding multiple nested keys at once like I'm trying to do. I've tried base.append('key') and base.get(), but couldn't find a reference on how to use get() several keys deep. Any ideas on how to get around this?

RknRobin
  • Have you tried `get()` with a default value, e.g. `base.get('vevo', {})`? – t.m.adam Apr 10 '17 at 01:10
  • What would I put for the {}? And how would this work for ['vevo']['viewsLastWeek']['data']['time']? I don't fully understand how to use the base.get(), but I feel like it may be the solution – RknRobin Apr 10 '17 at 01:13
  • You can also add another try/except, like `try: base['vevo']` followed by `except: base['vevo'] = "not Found vevo"` – Nikos Vita Topiko Apr 10 '17 at 01:13
  • `base.get(key, defaultvalue)` returns defaultvalue if the key does not exist; it does not add the key to the dict – Nikos Vita Topiko Apr 10 '17 at 01:15
  • Would I have to just keep assigning "NotFound" to every key up the tree then? something like `base['vevo']['viewsLastWeek'] = "NotFound"` , then `base['vevo']['viewsLastWeek']['data'] = "NotFound"` – RknRobin Apr 10 '17 at 01:17
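
As a rough sketch of what the comments above suggest, `get()` calls can be chained by passing an empty dict as the default for every intermediate level, and `setdefault()` can fill in missing levels when the keys really need to exist in the dict. The sample `base` payload here is hypothetical, and the chain assumes every intermediate value that is present is itself a dict (it would still fail on `None` or a string).

```python
# Hypothetical sample payload in which the 'vevo' section is missing entirely.
base = {"artist": "example"}

# Reading: each .get() falls back to an empty dict, so the chain never raises KeyError.
url = base.get("vevo", {}).get("url", "NotGiven")
time = (
    base.get("vevo", {})
    .get("viewsLastWeek", {})
    .get("data", {})
    .get("time", "2199-01-01")
)

# Writing: setdefault() creates each missing level and leaves existing values alone.
vevo = base.setdefault("vevo", {})
vevo.setdefault("url", "NotGiven")
data = vevo.setdefault("viewsLastWeek", {}).setdefault("data", {})
data.setdefault("time", "2199-01-01")
```

Only the final default in each chain is a placeholder value; the intermediate `{}` defaults exist just to keep the chain going.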

2 Answers

You can use defaultdict.

import collections
def new_level():
    return collections.defaultdict(new_level)
base=new_level()

This would allow you to add an arbitrary number of levels to your nested dicts:

 >>> import json
 >>> base["foo"]["bar"]["foobar"] = 42
 >>> print(json.dumps(base))
 {"foo": {"bar": {"foobar": 42}}}
Saúl Pilatowsky-Cameo
  • This seems to have gotten past the first problem, but now when I `print base['vevo']['url']` it gives me `defaultdict(<function new_level at 0x...>, {})`. It also seems to do that for all the existing values in the dictionary. Do I have to do something else to view the actual value afterwards? – RknRobin Apr 10 '17 at 01:37
  • You could use `json` to style the string representation: `print json.dumps(base)`. Or there are other solutions here http://stackoverflow.com/questions/12925052/python-and-default-dict-how-to-pprint – Saúl Pilatowsky-Cameo Apr 10 '17 at 01:40
  • Oh no, wait. What happens is that when you call `base['vevo']['url']` and it doesn't exist, it will call `new_level` and create a dictionary. If you want to check if a key exists use `"url" in base['vevo']` to avoid creating a new nested dictionary. – Saúl Pilatowsky-Cameo Apr 10 '17 at 01:46
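
The behaviour described in the last comment can be seen in a small sketch like the one below. It reuses the `new_level` factory from the answer and is written for Python 3; the sample keys and values are only illustrative.

```python
import collections
import json


def new_level():
    # Each missing key produces another nested defaultdict.
    return collections.defaultdict(new_level)


base = new_level()
base["vevo"]["viewsLastWeek"]["data"]["time"] = "2199-01-01"

# Membership tests do not trigger the default factory, so nothing is created here.
has_url = "url" in base["vevo"]

# A plain lookup of a missing key does create (and return) an empty defaultdict.
created = base["vevo"]["url"]
has_url_after = "url" in base["vevo"]

# json.dumps renders the nested defaultdicts as ordinary JSON objects.
as_json = json.dumps(base, sort_keys=True)
```

So a read such as `base['vevo']['url']` quietly inserts an empty entry, which is why checking with `in` first is the safer way to test for a key.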

So I found a solution, but it involved a change in logic instead of what I had originally tried to do.

Since I was only using the dictionary value to save to my database, I could use a placeholder variable as an in-between for the functions. See below for the working code.

try:
    v_url = base['vevo']['url']
except (KeyError, TypeError):
    # 'vevo' or 'url' is missing, or an intermediate value is not a dict
    v_url = "NotGiven"

Adding values to the existing dictionary proved to be too complicated, and this solution involves no extra packages.
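
If the same pattern is needed for many fields, the try/except could be wrapped in a small helper. This is only a sketch: `deep_get` is a hypothetical name, not a standard-library function, and the sample `base` dict and URL are made up for illustration.

```python
def deep_get(mapping, keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    current = mapping
    for key in keys:
        try:
            current = current[key]
        except (KeyError, TypeError, IndexError):
            # Missing key, or an intermediate value that cannot be indexed this way.
            return default
    return current


# Hypothetical payload that has a URL but no view statistics.
base = {"vevo": {"url": "http://example.com/video"}}

v_url = deep_get(base, ["vevo", "url"], "NotGiven")
v_time = deep_get(base, ["vevo", "viewsLastWeek", "data", "time"], "2199-01-01")
```

Like the placeholder-variable approach, this leaves `base` untouched and only computes the value to store.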

RknRobin