How to prevent integers from being converted to floats when converting a data frame to a list?

I have a .csv file with 5 columns of data. The first four columns have no decimal points, while the last column does.

When I import this data into my script using "pd.read_csv", the data imports correctly, with the first 4 numbers as integers and the last as a float, like this:

1,1,10,0,1.0

1,1,11,0,0.6

1,1,12,0,0.0

But I need to convert this data into a list, and when I do, all the numbers are converted to floats. I do not want this; the first four values need to stay integers.

This is my current code. After the data is converted to a list, every number in the list is a float:

import pandas as pd

data_file_name = r'C:\Users\username\Desktop\FileName.csv'

data = pd.read_csv(data_file_name)  # This part works and the dtypes are correct: the first 4 columns are integers
data2 = data.values.tolist()  # Here everything gets converted to float, even columns defined as int in the df

This results in a list with the data formatted like this:

[[1.0, 1.0, 10.0, 0.0, 1.0], [1.0, 1.0, 11.0, 0.0, 0.6], [1.0, 1.0, 12.0, 0.0, 0.0]]

When I need it to be formatted like this:

[[1, 1, 10, 0, 1.0], [1, 1, 11, 0, 0.6], [1, 1, 12, 0, 0.0]]

What can I do?

I've tried:

[int(i,10) for i in data]

But this returns this error:

ValueError: invalid literal for int() with base 10: 'Month'
Anna
  • Look at the [dtype](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) argument in `pd.read_csv` – Leo Sep 27 '19 at 21:32
  • @Prune How does that duplicate help with `read_csv`? – Barmar Sep 27 '19 at 21:35
  • The conversion logic is the critical part. – Prune Sep 27 '19 at 21:38
  • I've added in a new step I left out - the csv reading is working fine and the datatypes there are correct, but a decimal is added to the list even for the columns that are defined as ints in the df. – Anna Sep 27 '19 at 22:01
  • I found the answer to my question here, and this question also is asking exactly what I intended to ask: https://stackoverflow.com/questions/34838378/dataframe-values-tolist-datatype – Anna Sep 30 '19 at 19:15

1 Answer

Use the dtype argument to control the datatypes.

pd.read_csv(data_file_name, dtype={0: "int64", 1: "int64", 2: "int64", 3: "int64", 4: "float64"})
Barmar
  • Thank you, but this does not solve my issue. I've updated the question to be more clear. My data imports just fine, with the first 4 numbers importing as integers and the last as a float. It's only once this data is converted into a list that all numbers are converted to float. – Anna Sep 30 '19 at 17:48
  • See https://stackoverflow.com/questions/34838378/dataframe-values-tolist-datatype – Barmar Sep 30 '19 at 19:18
  • Thanks! I found that answer too, and I added it in the comment above. That answered my question. – Anna Oct 01 '19 at 01:55
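
For readers who land here: the behaviour described in the question comes from `data.values`, which builds a single NumPy array with one common dtype. With four integer columns and one float column, that common dtype is float64, so `tolist()` returns floats everywhere. Below is a minimal sketch of two ways to keep the per-column types. It is not necessarily the exact code from the linked question. The inline CSV text and the column names `A`, `B`, `D`, `E` are hypothetical stand-ins for the real file; only `Month` is taken from the error message in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file in the question.
# Column names other than 'Month' are assumptions for illustration.
csv_text = """A,B,Month,D,E
1,1,10,0,1.0
1,1,11,0,0.6
1,1,12,0,0.0
"""
data = pd.read_csv(io.StringIO(csv_text))

# .values produces one array with a single dtype (float64 here),
# so every element comes back as a float.
as_floats = data.values.tolist()

# Option 1: iterate row by row; each value keeps its own column's type.
rows = [list(row) for row in data.itertuples(index=False, name=None)]

# Option 2: cast to object first so the array holds Python objects
# instead of being upcast to a common numeric dtype.
rows_alt = data.astype(object).values.tolist()

print(rows)  # [[1, 1, 10, 0, 1.0], [1, 1, 11, 0, 0.6], [1, 1, 12, 0, 0.0]]
```

Option 1 avoids creating an intermediate object-dtype copy of the frame, while Option 2 stays closest to the original `data.values.tolist()` call. Either should work for a small file like this one.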