5

I am reading a CSV file into pandas:

import pandas as pd

df = pd.read_csv('file.csv')

However, I notice that the column order is not preserved. I can't find anything in the docs that explains how to keep the column order when reading in the CSV file.

slaw
  • Can you post raw input data to support this claim? I've never experienced a situation where the column order is not preserved – EdChum Jun 30 '15 at 07:34
  • I experienced what I thought was this problem but realised I was comparing against an array initialised with a dict e.g. `pd.DataFrame({'a': [1], 'b': [2]})` ... the dict not the DataFrame being the source of the non-determinism – joel Jul 09 '19 at 13:12
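One quick way to check the behaviour discussed in these comments is to parse a small in-memory CSV and compare the resulting columns with the header line. This is only a minimal sketch: the data and the `csv_text` name are made up for illustration and are not from the original question.

```python
import io

import pandas as pd

# Hypothetical CSV content; the header order is deliberately not alphabetical.
csv_text = "y,a,x,b\n1,2,3,4\n5,6,7,8\n"

df = pd.read_csv(io.StringIO(csv_text))

# Compare the DataFrame's column order with the order in the header line.
header_order = csv_text.splitlines()[0].split(",")
print(list(df.columns) == header_order)
```

If this prints `True` for your own file's header as well, the reordering is probably introduced somewhere other than `read_csv`, for example in whatever the result is being compared against.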

2 Answers

0

`read_csv` converts tabular data into a `DataFrame` object. Since a `DataFrame` is built from a kwargs dictionary, order is not preserved.

Sources: the `DataFrame` and `read_csv` documentation.

Leb
-2

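You can pass a list of the columns you want to `usecols`, then index the result with the same list to put them in the order you would like:

columns = ['a', 'x', 'b', 'y']
# usecols only selects columns; indexing with [columns] applies the order
df = pd.read_csv('file.csv', usecols=columns)[columns]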
Vlad Bezden
  • one might want to infer the column names from the csv, and retain the ordering in the csv – joel Jul 09 '19 at 10:54
  • [Documentation of `usecols`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html#pandas-read-csv) explicitly says that "Element order is ignored, so usecols=[0, 1] is the same as [1, 0]." – MaxPowers Jan 06 '23 at 15:19
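The documented behaviour quoted in the last comment can be checked with a small in-memory CSV. This is only a sketch with made-up data (`csv_text` and the column names are illustrative, not from the question): `usecols` selects the columns but leaves them in file order, and indexing with the same list afterwards applies the requested order.

```python
import io

import pandas as pd

# Hypothetical CSV content; in the file the columns appear as a, b, x, y.
csv_text = "a,b,x,y\n1,2,3,4\n5,6,7,8\n"

# Requested order differs from the file order.
columns = ["y", "a", "x"]

# usecols filters the columns but does not reorder them.
selected = pd.read_csv(io.StringIO(csv_text), usecols=columns)
print(list(selected.columns))

# Indexing with the list puts the columns in the requested order.
reordered = selected[columns]
print(list(reordered.columns))
```

When the column names should be inferred from the file itself, simply omitting `usecols` keeps every column in the order of the header line.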