As beginning coders, we are working on a scraping tool in Python. It is almost finished, but now we want to write the result to a JSON file. We tried, but it does not work. Can anyone help us out?

from bs4 import BeautifulSoup
import urllib

jaren = ["2010", "2012"]
DESIRED_COLUMNS = {1, 2, 5}  # set of column indexes to keep

for Jaargetal in jaren:
    r = urllib.urlopen("http://www.nlverkiezingen.com/TK" + Jaargetal +".html").read()
    soup = BeautifulSoup(r, "html.parser")
    tables = soup.find_all("table")

    # Handle the tables inside the year loop, so every year is processed
    for table in tables:
        header = soup.find_all("h1")[0].getText()
        print header

        trs = table.find_all("tr")[0].getText()
        print '\n'
        # Print the selected columns of the first 22 rows, separated by "|"
        for tr in table.find_all("tr")[:22]:
            print "|".join([x.get_text().replace('\n', '')
                            for index, x in enumerate(tr.find_all('td'))
                            if index in DESIRED_COLUMNS])
Patrick
  • Is this your actual code? Because you have syntax and indentation issues right now. – idjaw Apr 19 '16 at 11:54
  • @idjaw I've updated the code. Now there are no errors anymore. – Patrick Apr 19 '16 at 12:09
  • You still have indentation issues. Specifically at `r = urllib.urlopen("http://www.nlverkiezingen.com/TK" + Jaargetal +".html").read()`. What should be in that for loop? Should everything underneath `for Jaargetal in jaren` be inside that loop? Please make sure the code you post is exactly the code you are running. – idjaw Apr 19 '16 at 12:10
  • I'm sorry, now the code should work. There was some experimental code left in it. – Patrick Apr 19 '16 at 12:14
  • Please look at the code carefully. It is still not indented properly. Look at `for Jaargetal in jaren:`. The code is not indented underneath that line. – idjaw Apr 19 '16 at 12:15

2 Answers

You can write JSON to a file like so:

import json
d = {"foo": "bar"}
with open("output.json", "w") as f:
    json.dump(d, f)
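
For the scraper in the question, one possible approach is to collect the selected cells into a dictionary instead of printing them, and dump that dictionary once at the end. The following is only a sketch under assumptions: it is written for Python 3, the helper names `select_columns` and `save_results` are made up for illustration, the rows are placeholder strings rather than real scraped data, and the file is written to a temporary directory so the example has no side effects.

```python
import json
import os
import tempfile

DESIRED_COLUMNS = {1, 2, 5}  # same column indexes as in the question


def select_columns(cells):
    """Keep only the cells whose index is in DESIRED_COLUMNS (hypothetical helper)."""
    return [text for index, text in enumerate(cells) if index in DESIRED_COLUMNS]


def save_results(results, path):
    """Write the collected results to `path` as JSON (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(results, f, indent=2)


# Placeholder rows standing in for the text of the <td> cells of one table.
sample_rows = [
    ["a0", "a1", "a2", "a3", "a4", "a5"],
    ["b0", "b1", "b2", "b3", "b4", "b5"],
]

# One entry per year; "2010" is used here only as an example key.
results = {"2010": [select_columns(row) for row in sample_rows]}

# Assumption: a temporary directory is used instead of the working directory.
path = os.path.join(tempfile.mkdtemp(), "output.json")
save_results(results, path)
```

In the real scraper, `sample_rows` would be replaced by the cell texts taken from each `tr`, and `results` would get one key per year in `jaren`.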
Schore

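json.dumps() is the function you need. It serializes a dictionary (or another JSON-compatible object) to a JSON-formatted str object, which you can then write to a file yourself. See the docs for the json module.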
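
A small sketch of what that looks like, assuming Python 3 and reusing the example dictionary `{"foo": "bar"}`; the variable names are only for illustration:

```python
import json

d = {"foo": "bar"}

# json.dumps returns a str; nothing is written to disk at this point.
text = json.dumps(d)

# Parsing the string again gives back an equal dictionary.
restored = json.loads(text)

print(text)  # {"foo": "bar"}
```

Note the difference from `json.dump` (without the s), which takes an open file object and writes to it directly.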

sinhayash