I am trying to import CSV data hosted on GitHub with pandas. It mostly works, except that the "ZIP" column is not imported with all of its digits. A zip code should have 5 digits, but 1 or 2 digits at the front are being dropped. Why?!

I want to import this data:

import pandas as pd
coords = pd.read_csv('https://gist.githubusercontent.com/erichurst/7882666/raw/5bdc46db47d9515269ab12ed6fb2850377fd869e/US%2520Zip%2520Codes%2520from%25202013%2520Government%2520Data')
coords.head(5)

For some reason it looks like this, but the zip SHOULD be 00601:

    ZIP     LAT         LNG
0   601     18.180555   -66.749961
rafaelc
Dr.Data

1 Answer

pandas automatically infers the dtype of each column, and it assigns an integer dtype to the ZIP column because that column contains only digits.
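
To see the inference happen without downloading anything, here is a minimal sketch using a small in-memory sample. The column layout is assumed to match the question's file; the first row is the one shown in the question, and the other two rows are made-up illustrative values, not taken from the gist.

```python
import io
import pandas as pd

# Hypothetical sample in the ZIP,LAT,LNG layout from the question.
# Only the first data row comes from the question; the rest are illustrative.
sample = io.StringIO(
    "ZIP,LAT,LNG\n"
    "00601,18.180555,-66.749961\n"
    "00602,18.25,-67.10\n"
    "10001,40.75,-73.99\n"
)

# No dtype given, so pandas infers one per column.
inferred = pd.read_csv(sample)

print(inferred["ZIP"].dtype)     # an integer dtype, since every value is all digits
print(inferred["ZIP"].tolist())  # leading zeros are gone once parsed as integers
```

An integer has no notion of leading zeros, so 00601 and 601 are the same value once the column has been parsed as numbers.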

You have to state explicitly that the values are strings; otherwise 00601 will be read as just 601.

You can do that with the dtype argument of read_csv:

pd.read_csv(file, dtype={'ZIP': str})
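
As a runnable sketch of the same idea, the snippet below reads a small in-memory sample instead of the real file (the second row is a made-up illustrative value). It also shows one possible way to repair a column that was already read as integers, by padding with `str.zfill`; that second part is an addition for illustration and assumes every code is meant to be exactly 5 digits.

```python
import io
import pandas as pd

# Hypothetical sample; the first data row is from the question, the second is illustrative.
csv_text = (
    "ZIP,LAT,LNG\n"
    "00601,18.180555,-66.749961\n"
    "10001,40.75,-73.99\n"
)

# Reading ZIP as str keeps the text exactly as it appears in the file.
coords = pd.read_csv(io.StringIO(csv_text), dtype={"ZIP": str})
print(coords["ZIP"].tolist())

# If the column was already parsed as integers, left-pad it back to 5 characters.
already_int = pd.read_csv(io.StringIO(csv_text))
restored = already_int["ZIP"].astype(str).str.zfill(5)
print(restored.tolist())
```

Passing dtype at read time is usually the safer choice, because padding afterwards only works when the intended width is known in advance.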
rafaelc