I am trying to process very large files (10,000+ observsstions) where zip codes are not easily formatted. I need to convert them all to just the first 5 digits, and here is my current code:
def makezip(frame, zipcol):
i = 0
while i < len(frame):
frame[zipcol][i] = frame[zipcol][i][:5]
i += 1
return frame
frame is the dataframe, and zipcol is the name of the column containing the zip codes. Although this works, it takes a very long time to process. Is there a quicker way?