This has probably been asked before, but I'm short on time and not sure how to word the problem well enough to search for it.
I have a large data frame, and I noticed that one of its columns contains some malformed values.
head(testCA_extract[5])
ZIP_CODE
1 94801
2 94801
3 928034250
4 92714
5 95054
6 94565
That column comes from this data frame:
> head(testCA_extract[2:6])
REPORTING_YEAR STATE_COUNTY_FIPS_CODE COUNTY_NAME ZIP_CODE CITY_NAME
1 1990 06013 CONTRA COSTA 94801 RICHMOND
2 1990 06013 CONTRA COSTA 94801 RICHMOND
3 1990 06059 ORANGE 928034250 ANAHEIM
4 1990 06059 ORANGE 92714 IRVINE
5 1990 06085 SANTA CLARA 95054 SANTA CLARA
6 1990 06013 CONTRA COSTA 94565 PITTSBURG
For anyone unfamiliar, ZIP codes are supposed to be exactly 5 digits. I'm not sure why some have extra digits, but the first 5 digits appear to be the correct ZIP code regardless of the total length.
So I need to either select only the first 5 digits, or truncate the variable to its first 5 digits and discard the rest. Then I need the result to go back into its proper row and column in the data frame.
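To make the goal concrete, here is a minimal sketch of one way this might be done in base R: convert the column to text, keep the first 5 characters with substr(), and assign the result back to the same column so every value stays in its original row. The toy data frame below is an assumption that only mimics the rows shown above, and zip_chr is a hypothetical helper name. The sketch also assumes no leading zeros have been lost, which holds for these California ZIP codes. The 9-digit values are likely ZIP+4 codes stored without the hyphen.

```r
# Toy stand-in for testCA_extract, using the values shown in the question
testCA_extract <- data.frame(
  ZIP_CODE  = c(94801, 94801, 928034250, 92714, 95054, 94565),
  CITY_NAME = c("RICHMOND", "RICHMOND", "ANAHEIM", "IRVINE",
                "SANTA CLARA", "PITTSBURG")
)

z <- testCA_extract$ZIP_CODE

# Convert to text; for numeric input, avoid scientific notation and padding
zip_chr <- if (is.numeric(z)) {
  format(z, scientific = FALSE, trim = TRUE)
} else {
  as.character(z)
}

# Keep only the first 5 characters and write them back in place
testCA_extract$ZIP_CODE <- substr(zip_chr, 1, 5)

head(testCA_extract)
```

The column ends up as character rather than numeric, which is usually the safer type for ZIP codes because it preserves any leading zeros in other states' data.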