How to prevent integers from being converted to floats when converting a data frame to a list?

Question

I have a .csv file with 5 columns of data. The first four columns have no decimal points, while the last column does.

When I import this data into my script using "pd.read_csv", the data imports correctly, with the first 4 numbers as integers and the last as a float, like this:

1,1,10,0,1.0

1,1,11,0,0.6

1,1,12,0,0.0

But I need to convert this data into a list, and when I do, all the numbers are converted to floats. I do not want this. The first four values need to be integers.

This is my current code. After the data is converted to a list, every number in the list is a float:

import pandas as pd

data_file_name = r'C:\Users\username\Desktop\FileName.csv'

data = pd.read_csv(data_file_name)  # This part works and the dtypes are correct: the first 4 columns are integers
data2 = data.values.tolist()  # Here everything gets converted to a float, even columns defined as int in the df

This results in a list with the data formatted like this:

[[1.0, 1.0, 10.0, 0.0, 1.0], [1.0, 1.0, 11.0, 0.0, 0.6], [1.0, 1.0, 12.0, 0.0, 0.0]]

When I need it to be formatted like this:

[[1, 1, 10, 0, 1.0], [1, 1, 11, 0, 0.6], [1, 1, 12, 0, 0.0]]

What can I do?

I've tried:

[int(i,10) for i in data]

But this returns this error:

ValueError: invalid literal for int() with base 10: 'Month'
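
For context, a minimal sketch of why this attempt fails: iterating directly over a DataFrame yields its column labels, not its rows, so `int()` receives the header string `'Month'`. The frame below is a hypothetical stand-in; only the `Month` header comes from the error message, and the `Value` column name is made up.

```python
import pandas as pd

# Hypothetical two-column frame; "Month" is the only header known from the question.
df = pd.DataFrame({"Month": [1, 1], "Value": [1.0, 0.6]})

# Iterating over a DataFrame walks the column labels, not the data.
labels = [i for i in df]
print(labels)

# So the list comprehension from the question calls int("Month", 10).
try:
    [int(i, 10) for i in df]
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```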

Look at the [dtype](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) argument in `pd.read_csv` — Leo, Sep 27 '19 at 21:32
I've added in a new step I left out - the csv reading is working fine and the datatypes there are correct, but a decimal is added to the list even for the columns that are defined as ints in the df. — Anna, Sep 27 '19 at 22:01
I found the answer to my question here, and this question also is asking exactly what I intended to ask: https://stackoverflow.com/questions/34838378/dataframe-values-tolist-datatype — Anna, Sep 30 '19 at 19:15

score 1 · Answer 1 · answered Sep 27 '19 at 21:36

Use the dtype argument to control the datatypes.

pd.read_csv(data_file_name, dtype={0: "int64", 1: "int64", 2: "int64", 3: "int64", 4: "float64"})

Barmar

Thank you, but this does not solve my issue. I've updated the question to be more clear. My data imports just fine, with the first 4 numbers importing as integers, and the last as a float. It's only once this data is converted into a list that all numbers are converted to float. – Anna Sep 30 '19 at 17:48
See https://stackoverflow.com/questions/34838378/dataframe-values-tolist-datatype – Barmar Sep 30 '19 at 19:18
Thanks! I found that answer too, and I added it in the comment above. That answered my question. – Anna Oct 01 '19 at 01:55
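
For readers who land here, a minimal sketch of one common way around this. The cause is that `.values` packs the whole frame into a single NumPy array, and a NumPy array has one dtype, so the integer columns are upcast to `float64` alongside the float column. Casting the frame to `object` before taking `.values` should let each cell keep its own type. The CSV text and the column names other than `Month` are assumptions made up for illustration, not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV in the question; only "Month" is a
# known header, the other column names are made up.
csv_text = """Month,Day,Hour,Flag,Value
1,1,10,0,1.0
1,1,11,0,0.6
1,1,12,0,0.0
"""
data = pd.read_csv(io.StringIO(csv_text))

# .values packs the frame into one NumPy array with a single dtype,
# so the int64 columns are upcast to float64.
all_floats = data.values.tolist()

# Casting to object first lets every cell keep its own type.
mixed = data.astype(object).values.tolist()

# Alternative: build each row from plain tuples, one per DataFrame row.
mixed_alt = [list(row) for row in data.itertuples(index=False, name=None)]

print(all_floats)
print(mixed)
```

The `object` cast trades NumPy's compact storage for per-cell Python objects, which is usually fine for a small table that is about to become a plain list anyway.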
