I am relatively new to Python, coming from a Stata background, and am still struggling with some core Python concepts. For example, I'm currently working on a small program that hits the US Census Bureau geocoding API to geocode some addresses, and my instinct is to loop over my CSV files, feed each one into the API call, and then name the outputs sequentially using the loop iterator. E.g.
import censusgeocode
import json
import pandas as pd
cg = censusgeocode.CensusGeocode()
for i in range(1, 3):
    k = cg.addressbatch('dta/batchfiles/split_test ' + str(i) + '.csv')
    json_string = json.dumps(k)
    test_{i} = pd.read_json(json_string)
I know the test_{i} syntax is invalid and would raise an error, but the above should give you a sense of what I am trying to do conceptually. However, I've read elsewhere (e.g. in this SO post) that dynamically naming variables like this is not a good approach in Python. Can someone advise me on what a better approach would be? Is it better to simply append all the `k`s together into one large JSON object and then transform it into a dataframe in one go? If so, how do I go about doing that?
I have hundreds of CSV files that I want to loop over, and after calling the API for each I want to append all the results together into a single dataframe. I'm not sure if that's useful context, but I'm just trying to communicate where I want to get to eventually.
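For what it's worth, here is roughly what I imagine the "collect everything into one dataframe" version would look like, in case it helps clarify the question. Since I can't hit the API in a standalone snippet, `fake_addressbatch` below is a placeholder standing in for `cg.addressbatch(...)`, which (as I understand it) returns a list of dicts per batch file:

```python
import pandas as pd

# Placeholder for the per-file API call. In my real code this would be
# cg.addressbatch('dta/batchfiles/split_test ' + str(i) + '.csv'),
# which returns a list of dicts (one per address).
def fake_addressbatch(i):
    return [{"id": i, "address": str(100 + i) + " Main St", "match": True}]

frames = []
for i in range(1, 3):
    k = fake_addressbatch(i)        # real code: k = cg.addressbatch(...)
    frames.append(pd.DataFrame(k))  # one dataframe per batch, no dynamic names

# Stitch all the per-file dataframes into a single one at the end.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Is accumulating into a list and calling `pd.concat` once at the end like this the idiomatic pattern, or is there a better way?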
Any help would be very much appreciated!