I am working on a project for a third party, using census data to supplement their own data and gain better insights. I'm still in the data carpentry phase and have run into a roadblock. I know how to do the following in R, but I'm required to use Python 3 because of some machine learning packages I'd like to run later. My main dataframe (cleankc_zip) is 9 columns by 9,062 rows, and zipcode is the column I want to subset on.
I have a list of zip codes and I'd like to "subset" my data like I'd do in R.
The list n is set up like this: n = [int64_1, int64_2, ..., int64_n]
I've tried using .loc and .iloc to filter the data, like this: zip_ksmo = cleankc_zip.loc[cleankc_zip['zipcode'] == n]
where zip_ksmo is the variable I'd like to store the new data in, cleankc_zip is the data I'm trying to subset, and n is my list of zip codes, as mentioned above.
Upon running the code, I get this error: ValueError: Lengths must match to compare
Basically, I would just like to subset cleankc_zip to include only the zip codes contained in my list n. I'm not very proficient in Python, which is why I'm stuck.
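For reference, here is a minimal sketch with made-up data. The zip codes and the population column are placeholders, not my real data, and the lengths are chosen to differ on purpose. It reproduces the error from my attempt and then shows the result I'm after using Series.isin, which as far as I can tell is the pandas counterpart of R's %in%; I'm assuming that is the right tool here.

```python
import pandas as pd

# Made-up stand-in for cleankc_zip; the values are placeholders.
cleankc_zip = pd.DataFrame({
    "zipcode": [64101, 64102, 66101, 64105, 66209],
    "population": [100, 200, 300, 400, 500],
})
n = [64101, 66209]  # zip codes to keep

# My attempt: == compares element by element, so a 5-row column
# against a 2-item list raises ValueError.
try:
    cleankc_zip.loc[cleankc_zip["zipcode"] == n]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# What I want: keep the rows whose zipcode appears anywhere in n.
# isin builds a boolean mask with one entry per row.
zip_ksmo = cleankc_zip.loc[cleankc_zip["zipcode"].isin(n)]
print(zip_ksmo)
```

With this toy data, zip_ksmo should contain only the rows for 64101 and 66209.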
On a side note, once I get past here, I'll be good to go.