I'm using a Foursquare API call to find venues associated with particular ZIP codes in the US.

I am able to generate the JSON with information, but am having trouble looping and parsing to construct a pandas dataframe.

So far:

# Query the Foursquare API for each ZIP code's coordinates and parse the JSON response

import requests

for i, series in df_income_zip_good.iterrows():
    lat = series['lat']
    lng = series['lng']
    town = series['place']
    LIMIT = 100
    radius = 1000
    url4Sqr = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        lng,
        radius,
        LIMIT)

    venues = requests.get(url4Sqr).json()
    # print results from call
    print(venues)
#https://stackoverflow.com/questions/6386308/http-requests-and-json-parsing-in-python

This works fine and produces the JSON. I've linked the output to a JSON file on GitHub: (https://github.com/adhorvitz/coursera_ibm_capstone/blob/524c6609ea8872e0c188cd373a4778caaadb1cf6/venuedatasample.json)

I am not sure how to best flatten the JSON, then loop to extract the pieces of information I want to load into a dataframe. I've tried to mess around with the following with no success.

def flatten_json(nested_json, exclude=['']):
    """Flatten json object with nested keys into a single level.
        Args:
            nested_json: A nested json object.
            exclude: Keys to exclude from output.
        Returns:
            The flattened json object if successful, None otherwise.
            Values are extracted recursively into a flat dictionary; pd.json_normalize can then be applied to the output of flatten_json to produce a pandas DataFrame.
    """
    out = {}

    def flatten(x, name='venues', exclude=exclude):
        if type(x) is dict:
            for a in x:
                if a not in exclude: flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(nested_json)
    return out
#https://towardsdatascience.com/flattening-json-objects-in-python-f5343c794b10

I then run:

for i in venues():
    json_flat_venues = flatten_json(venues)
    json_flat_venues

An error is produced stating that the 'dict' object is not callable.
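
The error most likely comes from the parentheses in `venues()`: `requests.get(...).json()` returns a plain dict, and a dict cannot be called like a function. A minimal sketch, using a made-up `sample` dict as a stand-in for the response (its contents are an assumption, not real Foursquare output):

```python
# Hypothetical stand-in for the dict returned by requests.get(url).json()
sample = {"meta": {"code": 200}, "response": {"groups": []}}

# Calling the dict with () reproduces the error from the question
try:
    for key in sample():
        pass
except TypeError as err:
    message = str(err)
    print(message)

# Iterating over the dict itself (no parentheses) walks its top-level keys
top_level_keys = [key for key in sample]
print(top_level_keys)
```

Note that the loop body in the question never uses `i`, so the loop may not be needed at all; a single call on `venues` would do the same work.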

I've also tried:

for i in venues():
    df_venues_good = pd.json_normalize(venues)
    df_venues_good

The same error is produced.

I'm a bit lost on where to go, and how to best convert the JSON into a workable DF.
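
One approach that may help is to drill down to the list of venue records before calling `pd.json_normalize`, rather than passing the whole response. The sketch below assumes the records sit under `response -> groups[0] -> items`; that path and the tiny `sample` dict are assumptions for illustration and should be checked against the real output.

```python
import pandas as pd

# Hypothetical, heavily trimmed response; the nesting is an assumption
sample = {
    "meta": {"code": 200},
    "response": {
        "groups": [
            {
                "items": [
                    {"venue": {"name": "Cafe A", "location": {"lat": 40.1, "lng": -75.2}}},
                    {"venue": {"name": "Park B", "location": {"lat": 40.2, "lng": -75.3}}},
                ]
            }
        ]
    },
}

# Drill down to the list of records first, then normalize that list
items = sample["response"]["groups"][0]["items"]
df_venues = pd.json_normalize(items)

# Nested keys become dotted column names such as "venue.location.lat"
df_venues = df_venues[["venue.name", "venue.location.lat", "venue.location.lng"]]
print(df_venues)
```

Each element of the list becomes one row, so the resulting frame has one row per venue instead of one row for the whole response.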

Thanks in advance.

-------update-----------

So I’ve tried a few things.

  1. After I referenced the page left in the comments: https://www.geeksforgeeks.org/flattening-json-objects-in-python/, I installed json_flatten (using pip), but had issues importing flatten.

  2. As an attempt at a workaround I tried to re-create the code from the website, adapted to my project. I think I made more of a mess than I cleared up.

  3. I re-ran the original "flatten_json" def (see above). I then assigned df_venues_good without the for loop statement (also above).

  4. With the for loop removed it looks like it starts to pull the first record from the JSON. However, it looks like metadata (or at least data that I'm not trying to extract).

  5. I also noticed an issue when reviewing the JSON. In my output cell (I'm using a Jupyter notebook) it looks like all of the records are retrieved (there are about 95 in all).
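
Regarding point 4, a possible explanation is that normalizing the top level of the response yields a single row: nested dicts are expanded into dotted columns, while lists (where the venue records probably live) are left untouched in one cell. The sketch below uses a made-up `sample` dict whose layout is an assumption.

```python
import pandas as pd

# Hypothetical, trimmed-down response; the real nesting may differ
sample = {
    "meta": {"code": 200, "requestId": "abc123"},
    "response": {"groups": [{"items": [{"venue": {"name": "Cafe A"}}]}]},
}

# Normalizing the top level yields a single row: dicts are expanded into
# dotted columns, but lists stay packed inside one cell
df_top = pd.json_normalize(sample)
print(df_top.columns.tolist())
print(len(df_top))

# Listing keys level by level shows where the actual records sit
print(list(sample.keys()))
print(list(sample["response"].keys()))
```

Inspecting the keys this way is a quick check on which sub-object to pass to `pd.json_normalize`.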

I then ran this to just dump the file to inspect:

JsonString = json.dumps(venues)
JsonFile = open("venuedata.json", "w")
JsonFile.write(JsonString)
JsonFile.close()

When I open the dump file (which I linked above) it doesn't look complete.
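
One possible reason the dump looks incomplete: the loop reassigns `venues` on every pass, so once the loop ends it holds only the response for the last ZIP code, while the notebook cell shows every response printed along the way. The sketch below accumulates the responses and writes them all out. `fake_fetch`, the coordinates, and the output file name are hypothetical stand-ins for the real `requests.get(...).json()` call and data.

```python
import json
import os
import tempfile


def fake_fetch(lat, lng):
    """Hypothetical stand-in for requests.get(url).json()."""
    return {"response": {"ll": [lat, lng]}}


coords = [(40.1, -75.2), (40.2, -75.3), (40.3, -75.4)]

all_venues = []  # one entry per ZIP code instead of a single overwritten variable
for lat, lng in coords:
    all_venues.append(fake_fetch(lat, lng))

# "with" closes the file even if json.dump raises
out_path = os.path.join(tempfile.gettempdir(), "venuedata_all.json")
with open(out_path, "w") as json_file:
    json.dump(all_venues, json_file, indent=2)

# Read the file back to confirm every response was written
with open(out_path) as json_file:
    reloaded = json.load(json_file)
print(len(reloaded))
```

Keeping the responses in a list also makes it straightforward to build one DataFrame per response and concatenate them afterwards.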

Any direction would be appreciated.

ADH
  • Is this what you're looking for: https://www.geeksforgeeks.org/flattening-json-objects-in-python/ – BlackFox Aug 11 '21 at 17:41
  • It very well could be - thanks for the resource. – ADH Aug 11 '21 at 18:26
  • Ya man, let me know if you have more questions so I can try and shoot for an answer for rep points – BlackFox Aug 11 '21 at 23:28
  • Thanks for that offer - yes, I certainly have a few more questions. That resource pointed in the right direction, but I'm still hitting a wall. Best way to get the new questions to you is to create a fresh question, correct (sorry, I'm a bit new to stackoverflow - learning etiquette)? – ADH Aug 12 '21 at 15:52
  • You can just edit this question – BlackFox Aug 12 '21 at 16:55
  • Update it with your insights, tests, errors and questions. – BlackFox Aug 12 '21 at 16:55
  • Put your code on GitHub and share it. I will test it for you and walk you through the results. I will show you how to test each line of code using print(). – BlackFox Aug 14 '21 at 19:54
  • For example, what happens when you do this: JsonString = json.dumps(venues), then print(JsonString)? – BlackFox Aug 14 '21 at 19:55
  • I also just tested installing and it works. Remember, for python3 you use "pip3 install json-flatten" (you can look up solutions for this issue on stackoverflow many times over) – BlackFox Aug 14 '21 at 20:03
  • Lastly, look up anaconda: you have an anaconda python install location and a local python install location. It's confusing as heck, so what I do is test everything locally with anaconda not installed, then when I know what I want to do, install anaconda and run all the steps again. Or you can just use a python venv env. (you can look up solutions for this issue on stackoverflow many times over) – BlackFox Aug 14 '21 at 20:06
  • Here is how your open() part should look: https://gist.github.com/BlackFoxgamingstudio/7421ce96e792950c79ce53499254b407. (you can look up solutions for this issue on stackoverflow many times over) – BlackFox Aug 14 '21 at 20:11
  • Thank you so much for this. Just posted the code to GitHub with the errors included in the output: [link to notebook](https://github.com/adhorvitz/coursera_ibm_capstone/blob/d2ebad4ced57b7ddee20f72b703a786605d8b652/Capstone_Project_B2B_Neighborhood_Survey_4sqr_cred_removed.ipynb). I am going to continue to read up on the above and try to mess around a little bit. I'll certainly check out pip3. I've been learning on 3 and 2.7 (two different sources, clearly). I do have anaconda3 running on my machine. Thanks again for all of this. – ADH Aug 15 '21 at 23:51
  • I'm not seeing the python files, man (or lady :) – BlackFox Aug 16 '21 at 16:59
  • Ha! Man, but appreciate the ask. Let me try this [github link](https://github.com/adhorvitz/coursera_ibm_capstone/blob/d2ebad4ced57b7ddee20f72b703a786605d8b652/Capstone_Project_B2B_Neighborhood_Survey_4sqr_cred_removed.ipynb). – ADH Aug 16 '21 at 17:08
  • Dang bro… what is the issue you're trying to solve here? I found errors that tell you that you have a problem. You need to add these errors to this question so others can help you solve them. Lastly, using a notebook is not the way to get help here. You need the file structure of individual python scripts. The reason is, you can test a script one at a time, and it forces the programmer to break down and test each part of the solution. Your real issue is: this is a course you're taking, and you should not have stack overflow coders answer your whole assignment lol (and we won't) – BlackFox Aug 16 '21 at 17:23
  • Instead, clearly state what you're doing, what's wrong and what errors you're getting… without showing us your assignment lol… think about when you're on a job… you can't show the customer's code! So you just crop out the problem area. The problem area I'm seeing is your two errors at the bottom of the long ass notebook – BlackFox Aug 16 '21 at 17:24
  • Finally, why flatten anything?! Maybe my ignorance here… but all you need to do is print the data and transform the data… right? If this was not a school assignment I would tell you to put in the time to learn pandas… it would make all this easier. Also you can learn the json library, csv library, and requests library. All these libraries can be used to transform data from an api call… some auto flatten JSON data… but really it's not a term or keyword to throw around. – BlackFox Aug 16 '21 at 17:33
  • If you can tell from my many comments, you've got a lot of studying to do and I highly doubt you will find an answer or help on stackoverflow… unless you just ask for help with the errors you're seeing… ;) – BlackFox Aug 16 '21 at 17:34
  • Noted - this is for a project to get myself introduced into this world a bit more. I'm just looking for any and all resources, guidance, etc. In addition to other resources I've been reviewing, I thought I'd try reaching out to people who seem to have a good handle on what they are doing. I put the entire thing up for context. Your point about a customer is well taken, but I'm looking to learn, not to sell (but again, point taken). I appreciate the insights and the help you're sharing above. – ADH Aug 16 '21 at 17:55

1 Answer

After 4 days of communication I think I see your real question that will get you moving forward. You need to look up and troubleshoot the two errors below. Please mark my answer as correct if you agree, and after some more troubleshooting on your own, create a new question around your insights, errors and questions about the following images.

ValueError: If using all scalar values, you must pass an index
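
pandas raises "If using all scalar values, you must pass an index" when `pd.DataFrame` receives a dict whose values are all plain scalars, which is the shape a fully flattened record has. A minimal sketch of the error and one common way around it; the `flat` dict is a made-up example, not output from the asker's notebook:

```python
import pandas as pd

# Hypothetical flattened record: every value is a scalar
flat = {"venue_name": "Cafe A", "venue_lat": 40.1}

# A dict of plain scalars gives pandas no way to infer the number of rows
try:
    pd.DataFrame(flat)
except ValueError as err:
    message = str(err)
    print(message)

# Wrapping the dict in a list tells pandas it is a single row
df_one_row = pd.DataFrame([flat])
print(df_one_row)
```

A list of several such dicts works the same way and gives one row per dict.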


Libraries you can use to help you “flatten JSON data” are pandas, requests, json, and even the csv library.

Because you are learning python, data analysis and how to work with APIs, you will find little to no more help on stackoverflow without a clearer description and examples of your technical issue.

Please continue your self-study and keep trying! You got this :)

Please let us know how the community can help with individual issues and questions as you grow :)

BlackFox