3

I need to get some data from the Meetup API, convert the JSON I get into a CSV, all of that in Python 3. I've never worked with JSON or Python, so I've run into some issues. Getting the data is not a problem, but I can't seem to parse the JSON object into a proper CSV file. My code, anonymized:

import json
import requests
import csv


def main():
    # initialize variables
    output_file = 'result_meetup.csv'
    api_key = "YOUR_MEETUP_API_KEY"
    params = {'key': api_key}
    url_path = 'http://api.meetup.com/pro/:URLNAME/groups'

    # get data from API, parse to JSON
    data = requests.get(url_path, params=params)
    data_parsed = data.json()
    length_data = len(data_parsed) - 1

    data_to_file = open(output_file, 'w')
    csv_writer = csv.writer(data_to_file)

    for i in range(0, length_data):
        meetup = data_parsed[i]
        csv_writer.writerow([meetup])
    data_to_file.close()


if __name__ == "__main__":
    main()

Now, as you can see, I write into the CSV as a list. This leads to a really ugly output, looking like this (again, the values in caps are anonymized):

{u'rsvps_per_event': 0.0, u'organizers': [{u'permission': u'organizer', u'name': u'NAME', u'member_id': ID}], u'repeat_rsvpers': 0, u'topics': [{u'lang': u'en_US', u'id': ID, u'urlkey': u'socialnetwork', u'name': u'Social Networking'}, {u'lang': u'en_US', u'id': ID, u'urlkey': u'education', u'name': u'Education'}, {u'lang': u'en_US', u'id': ID, u'urlkey': u'newtech', u'name': u'New Technology'}, {u'lang': u'en_US', u'id': ID, u'urlkey': u'business-referral-networking', u'name': u'Business Referral Networking'}], u'upcoming_events': 0, u'gender_female': 0.3499999940395355, u'pro_join_date': DATE, u'id': ID, u'category': [{u'shortname': u'career-business', u'sort_name': u'Career & Business', u'id': 2, u'name': u'Career & Business'}], u'city': u'CITY', u'member_count': 73, u'lon': LON, u'organizer_photo': {u'thumb_link': u'LINK.JPEG', u'base_url': u'URL', u'id': ID, u'type': u'member', u'photo_link': u'LINK.JPEG', u'highres_link': u'LINK.JPEG'}, u'average_age': 35.555599212646484, u'status': u'Active', u'description': u'DESCRIPTION' u'founded_date': DATE, lat': LAT, u'urlname': u'NAME', u'gender_male': 0.6000000238418579, u'name': u'NAME', u'country': u'Portugal', u'gender_unknown': 0.05000000074505806, u'past_events': 0, u'gender_other': 0.0, u'past_rsvps': 0}

So basically, the whole JSON object in a single CSV field, with weird 'u's, in lists and so on. However, if I don't write it as a list, I only get the fields of the JSON object, without the data, so it would just be 'Organizers', 'Name' etc, without the actual name. As I said, I am a python beginner and haven't found any libraries to help me with it, but I'm sure they exist. Any help is really appreciated, and it would be great if it was Python3 compatible.

Edit: What I would like it to look like in the end: I get multiple Meetup groups in the reply, all having the same structure as shown above. Therefore, the description of the value should be listed just once, as a header, and the values listed beneath (new lines and pipes indicating a new field of a csv sheet):

RSVPs_per_event | Organizer | ID 
5               | Tom       | 1
20              | Jack      | 2
35              | Anne      | 3
alena
  • 61
  • 1
  • 1
  • 12
  • JSON can have a heavily nested structure, a CSV cannot easily represent that. I have run your response through [jsonlint](https://jsonlint.com/) (which, although this is actually a dictionary, will give a readable format) and there's quite a bit of info in there. You'll need to show your expected output – roganjosh Sep 21 '17 at 18:28
  • Ideally, since all JSON objects in my reply have the same structure, it would be great to have the description of the value, like 'Organizer', just once in a header line, and then the values of each object listed beneath that. – alena Sep 21 '17 at 18:31
  • Something like: **Organizer** | Tom | Jack | etc, but underneath each other, instead of next to each other (can't do that formatting in a reply) – alena Sep 21 '17 at 18:31
  • Have you looked at the response in jsonlint? If that is the whole response then it is far from regular, and if that's a single unit of the response, you'll need quite a bit more detail in how that's being represented in a CSV. You should edit your question to give a proper output format, adding comments rarely adds clarity for stuff like that - I am no clearer now on your expected output than I was before. – roganjosh Sep 21 '17 at 18:34
  • Will do, thank you. And I didn't know about jsonlint, I will check it out – alena Sep 21 '17 at 18:38
  • It's not really meant for this, since you actually have a python dictionary so it will immediately throw an error, but it's still a simple way of pretty printing your response (even with showing an error) so you can actually make sense of it. – roganjosh Sep 21 '17 at 18:40

2 Answers2

0

To convert the json data to csv you need to extract keys and write them in header and then work on the values. This might help you:

data_parsed = json.loads(Data)

header = data_parsed[0].keys()
csv_writer.writerow(header)

for i in range(0,length_data)
    meetup = data_parsed[i].values()
    csv_writer.writerow([meetup])
v.coder
  • 1,822
  • 2
  • 15
  • 24
  • not sure this would work given that the value of `organizers` is a list of dictionaries – gold_cy Sep 21 '17 at 18:34
  • I tried this solution, and unfortunately, the result doesn't look much better. It takes away the '{u'rsvps_per_event': ' at the beginning, but the rest looks the same. Thank you though – alena Sep 21 '17 at 18:45
  • may be the json is not parsed correctly. Can you try `data_parsed = json.loads(Data)` – v.coder Sep 21 '17 at 18:55
  • if you can paste the output of `data_parsed`, i can modify the answer accordingly. – v.coder Sep 21 '17 at 18:58
  • If I try data_parsed = json.loads(data), I get an error: "TypeError: the JSON object must be str, bytes or bytearray, not 'Response'", that's why I did it the other way – alena Sep 22 '17 at 08:55
0

If anyone else has the same problem, I solved it, not very elegantly, but I needed the data. The JSON was too nestled and complex to parse, so now I just read the fields I need from the data.

import json
import requests
import csv


def main():
    # initialize variables
    output_file = 'result_meetup.csv'
    api_key = "YOUR_API_KEY"
    params = {'key': api_key}
    url_path = 'http://api.meetup.com/pro/:URLNAME/groups'

    # get data from API, parse to JSON
    data = requests.get(url_path, params=params)
    data_parsed = data.json()
    length_data = len(data_parsed) - 1

    data_to_file = open(output_file, 'w', newline='')
    csv_writer = csv.writer(data_to_file, delimiter=";")
    csv_writer.writerow(["id","name","city","country","member count","average age","founded_date","past_rsvps","rsvps_per_event","repeat_rsvpers","gender_unknown","gender_female","gender_male","gender_other"])

    for i in range(0, length_data):
        meetup = data_parsed[i]
        id = meetup['id']
        name = meetup['name']
        city = meetup['city']
        country = meetup['country']
        member_count = meetup['member_count']
        average_age = meetup['average_age']
        founded_date = meetup['founded_date']
        past_rsvps = meetup['past_rsvps']
        rsvps_per_event = meetup['rsvps_per_event']
        repeat_rsvpers = meetup['repeat_rsvpers']
        gender_unknown = meetup['gender_unknown']
        gender_female = meetup['gender_female']
        gender_male = meetup['gender_male']
        gender_other = meetup['gender_other']
        csv_writer.writerow([id,name,city,country,member_count,average_age,founded_date,past_rsvps,rsvps_per_event,repeat_rsvpers,gender_unknown,gender_female,gender_male,gender_other])
    data_to_file.close()


if __name__ == "__main__":
    main()
alena
  • 61
  • 1
  • 1
  • 12