
This seems to be a common question, without any easily-digestible / easily-implementable answers. Many people reference the FCC API, but I don't know how to use an API and haven't found a simple explanation to help me in this situation. R code I can do, Python I can do (if it's simple), but it really seems like there should be some relatively simple resource for taking a .csv (or similar) with lat/long columns, and getting FIPS codes back (at the block group level, from the 2010 census).

Potential solutions (and my issues with them):

  • This GitHub repo, I believe, queries the old FCC API, which is decommissioned. Either way, when I run it on the example given it throws `Error in fromJSON(content, handler, default.size, depth, allowComments, : invalid JSON input`. Furthermore, I wonder how it would do when mapped over 16 million coordinates.
  • This SO question works great on a few rows, and I've implemented it for cases where I only need a couple thousand queries, but I've gotten the errors `Error in curl::curl_fetch_memory(url, handle = handle) : Timeout was reached: Send failure: Connection was reset` and `Error in call_geolocator_latlon(row["GE_LATITUDE_2010"], row["GE_LONGITUDE_2010"]) : Service Unavailable (HTTP 503)`, which I assume are due to my data being too big (a rough retry wrapper is sketched after this list).
  • The solution here doesn't seem like it would be best at first glance, since it involves downloading shapefiles, which just seems inefficient; but since I actually only have observations in CA it should work, except that when I change it to request 2010 block group geographies, it breaks:
    • ca <- tidycensus::get_decennial(state = "CA", geography = "block group", variables = "B00001_001", geometry = TRUE, year = 2010)
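
For reference, here is the kind of retry/throttle wrapper I have in mind around `tigris::call_geolocator_latlon()` (the wrapper name and the retry/pause settings below are just placeholders); it smooths over occasional timeouts and 503s, but is still far too slow for 16 million rows:

library(tigris)

# Retry a single lookup a few times before giving up (placeholder settings)
geocode_with_retry <- function(lat, lon, tries = 3, pause = 2) {
  for (k in seq_len(tries)) {
    out <- tryCatch(
      tigris::call_geolocator_latlon(lat, lon),
      error = function(e) NA_character_
    )
    if (length(out) == 1 && !is.na(out)) return(out)
    Sys.sleep(pause)  # back off before retrying
  }
  NA_character_
}

# The returned block FIPS is 15 digits; the first 12 are the block group
testdata$GEOID_BG <- substr(
  mapply(geocode_with_retry,
         testdata$GE_LATITUDE_2010, testdata$GE_LONGITUDE_2010),
  1, 12
)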

Ideally, I'd like to find/write a function that allows me to input the name of my dataframe and the columns that hold my latitude and longitude data, and that then adds a column with the FIPS code (at the block group level, from the 2010 census). Alternatively, somewhere I can just upload a .csv and get a .csv back would be great. Or a Python package that is easily implementable by someone with very limited Python knowledge. Etc, etc, etc.

sample dataframe (for R):

testdata <- structure(list(unique_id = c(5392085L, 14789082L, 11023930L, 4005454L, 13701322L, 10821557L, 11397828L, 15709999L, 475895L, 1546307L), GE_LATITUDE_2010 = c(38.272084, 33.013099, 39.019289, 33.992753, 32.6104, 33.717793, 34.550265, 32.842897, 33.754883, 38.461337), GE_LONGITUDE_2010 = c(-122.644619, -117.05967, -121.006352, -118.26259, -117.057227, -118.044996, -117.277502, -116.890541, -116.983093, -121.389269)), row.names = c(NA, -10L), class = "data.frame")
  • why do you think downloading shapefiles would be inefficient; have you tried it? – SymbolixAU Feb 18 '20 at 03:12
  • Downloading a shapefile of block groups in California might take a little bit, but nowhere near as long as geocoding 16 million rows. Plus it's free. It's helpful if you can be more specific than "tidycensus breaks"; luckily I work with Census data, and can notice right away that's an ACS variable number where you want a decennial one. If you don't actually need the census data, just the shapefile, download it from the Census TIGER site (or use `tigris`, which `tidycensus` calls) – camille Feb 18 '20 at 04:01
  • About the actual calculation: make a spatial object from your coordinates. I like `sf` for this. Take the shapefile of block groups (from the Census Bureau) and do a spatial overlay. If you no longer need the spatial data, just the ID, coords, and BG FIPS, drop the rest – camille Feb 18 '20 at 04:04
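
To make that overlay concrete, here is a minimal R sketch of the approach described in the comments. It assumes the `tigris` and `sf` packages (recent `tigris` versions return `sf` objects directly) and that `GEOID10` is the block-group FIPS field in the 2010 TIGER file:

library(tigris)
library(sf)

# 2010 block group boundaries for California (a one-time download)
ca_bg <- tigris::block_groups(state = "CA", year = 2010)

# Turn the question's testdata into an sf point object (lon/lat, WGS84),
# keeping the original columns
pts <- sf::st_as_sf(
  testdata,
  coords = c("GE_LONGITUDE_2010", "GE_LATITUDE_2010"),
  crs = 4326, remove = FALSE
)

# Match each point to the block group polygon it falls in;
# GEOID10 (assumed field name) is the 12-digit block-group FIPS
pts <- sf::st_join(
  sf::st_transform(pts, sf::st_crs(ca_bg)),
  ca_bg["GEOID10"]
)

result <- sf::st_drop_geometry(pts)  # unique_id, coords, and GEOID10

Because the join runs locally against one downloaded shapefile, it scales to millions of points far better than per-row API calls.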

1 Answer


If I understand your question correctly, you have lat and lon data and you want the FIPS codes associated with the coordinates.

To do that with Python, you can do the following.

Your sample df:

import pandas as pd

# same values as the R testdata above (plain integers; the R 'L' suffix isn't needed in Python)
unique_id = [5392085, 14789082, 11023930, 4005454, 13701322, 10821557,
             11397828, 15709999, 475895, 1546307]
GE_LATITUDE_2010 = [38.272084, 33.013099, 39.019289, 33.992753, 32.6104, 33.717793,
                    34.550265, 32.842897, 33.754883, 38.461337]
GE_LONGITUDE_2010 = [-122.644619, -117.05967, -121.006352, -118.26259, -117.057227,
                     -118.044996, -117.277502, -116.890541, -116.983093, -121.389269]

df = pd.DataFrame()
df['unique_id'] = unique_id
df['GE_LATITUDE_2010'] = GE_LATITUDE_2010
df['GE_LONGITUDE_2010'] = GE_LONGITUDE_2010

df


import requests
import pandas as pd

def get_fips_num(df):
    """Query the FCC Area API for each row and return a data frame of
    unique_id and the block FIPS code."""
    df_1 = df[['GE_LONGITUDE_2010', 'GE_LATITUDE_2010', 'unique_id']]
    fips_lst = []
    unique_id = []
    for lon, lat, uid in df_1.itertuples(index=False):
        try:
            link = 'https://geo.fcc.gov/api/census/area?lat={0}&lon={1}&format=json'.format(lat, lon)
            response = requests.get(link).json()
            # 'block_fips' is the 15-digit block code; its first 12 digits are the block group
            x = response['results'][0]['block_fips']
            if len(x) != 0:
                fips_lst.append(x)
                unique_id.append(uid)
        except Exception as error:
            print("error: " + str(error))

    df_result = pd.DataFrame()
    df_result['unique_id'] = unique_id
    df_result['fips'] = fips_lst
    return df_result

When you run the code on your df, you should get the data frame below:

    get_fips_num(df)

[Screenshot of the resulting data frame: https://i.stack.imgur.com/dERnA.png]