It seems the default for pd.read_csv() is to read in the column names as str. I can't find this behavior documented, so I don't know where to change it.

Is there a way to tell read_csv() to read in the column names as integer?

Or maybe the solution is specifying the datatype when calling pd.DataFrame.to_csv(). Either way, at the time of writing to csv, the column names are integers and that is not preserved on read.

The code I'm working with is loosely related to this (credit):

import pandas as pd

df = pd.DataFrame(index=pd.MultiIndex.from_arrays([[], []]))
for row_ind1 in range(3):
    for row_ind2 in range(3, 6):
        for col in range(6, 9):
            entry = row_ind1 * row_ind2 * col
            df.loc[(row_ind1, row_ind2), col] = entry

df.to_csv("df.csv")

dfr = pd.read_csv("df.csv", index_col=[0, 1])
print(dfr.loc[(0, 3), 6])       # KeyError
print(dfr.loc[(0, 3), "6"])     # No KeyError
young_souvlaki
  • "at the time of writing to the csv, the column names are integers and that is not preserved on read." It is a text file so there's no way to preserve data types. I looked at the documentation for `read_csv()` and didn't see anything that could be helpful, so you may want to look at other formats of writing the DataFrame which would preserve data types. – mechanical_meat Oct 26 '21 at 22:33
  • Does this answer your question? [converting column names to integer with read\_csv](https://stackoverflow.com/questions/37243551/converting-column-names-to-integer-with-read-csv) – bers Mar 03 '22 at 12:31
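
To illustrate the first comment's suggestion of a type-preserving format, here is a minimal sketch using pickle. The tiny two-row frame and the file name `df.pkl` are made up for the example and are not from the question; pickle is just one such format, and pickle files should only be loaded from sources you trust.

```python
import os
import tempfile

import pandas as pd

# Hypothetical small frame with integer column labels and a two-level row index.
df = pd.DataFrame(
    {6: [0, 18], 7: [0, 21]},
    index=pd.MultiIndex.from_tuples([(0, 3), (1, 3)]),
)

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "df.pkl")  # assumed file name
    df.to_pickle(path)
    restored = pd.read_pickle(path)

# The column labels come back as integers, so integer lookups work.
print(restored.columns.tolist())
print(restored.loc[(1, 3), 6])
```

Unlike the CSV round trip, no conversion step is needed afterwards because the column labels are stored together with their type.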

1 Answer

My temporary solution is:

dfr.columns = dfr.columns.map(int)
young_souvlaki
  • You could also (if the columns are known) overwrite the existing ones by specifying the names `dfr = pd.read_csv("df.csv", index_col=[0, 1], header=0, names=[6, 7, 8])`. However, converting to int after reading probably is the best solution. – Henry Ecker Oct 26 '21 at 22:55
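
For reference, a self-contained sketch of the round trip with the conversion from the answer applied. It builds the frame directly instead of with the nested loops from the question, and reads from an in-memory string (`io.StringIO`) rather than `df.csv`; both are choices made for this example only.

```python
import io

import pandas as pd

# Same shape as the question's data: two-level row index, integer column labels.
index = pd.MultiIndex.from_product([range(3), range(3, 6)])
df = pd.DataFrame(
    [[i * j * c for c in range(6, 9)] for i, j in index],
    index=index,
    columns=range(6, 9),
)

# to_csv() with no path returns the CSV text instead of writing a file.
csv_text = df.to_csv()

dfr = pd.read_csv(io.StringIO(csv_text), index_col=[0, 1])
labels_before = dfr.columns.tolist()  # column labels arrive as strings

# Convert the labels back to integers after reading.
dfr.columns = dfr.columns.map(int)
labels_after = dfr.columns.tolist()

print(labels_before, labels_after)
print(dfr.loc[(2, 5), 8])  # integer lookup no longer raises KeyError
```

Note that `map(int)` assumes every column label is an integer string; a frame with mixed labels would raise a ValueError and would need a more selective conversion.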