There are a number of questions here on SO about pandas not respecting the order of the columns when reading/writing a csv file, some of them dating back 5 years (!):
- Preserving column order in Python Pandas DataFrame
- Python Pandas read_csv Keep Column Order
- Keeping columns in the specified order when using UseCols in Pandas Read_CSV
According to this answer, this "bug" was fixed in version 0.19.0, but I'm running Python 3.6.4 and pandas 0.22.0 and I still encounter the issue.
Is this a bug that's been around for years, or is this just how pandas works? If so, what's the reasoning behind not preserving column order?
You can reproduce the issue with this csv file and the following code:
import pandas as pd

df = pd.read_csv(
    "test.csv", usecols=('Author', 'Title', 'Abstract Note', 'Url'))
print(df)
Notice that 'Url' is not positioned last in df, as it should be.
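For anyone without the file, here is a minimal self-contained sketch of the same situation. The inline CSV is made up (the column layout, the extra 'Year' column, and the values are assumptions, not the contents of the real test.csv); the only thing that matters is that 'Url' appears before 'Title' in the file. As far as I can tell, read_csv returns the selected columns in file order regardless of the order given in usecols, and reselecting the columns afterwards seems to be the usual workaround.

```python
import io

import pandas as pd

# Hypothetical stand-in for test.csv: made-up data in which 'Url'
# comes before 'Title' (an assumption about the real file's layout).
csv_text = (
    "Author,Url,Title,Year,Abstract Note\n"
    "Doe,http://example.com/a,First,2017,Some abstract\n"
    "Roe,http://example.com/b,Second,2018,Another abstract\n"
)

wanted = ["Author", "Title", "Abstract Note", "Url"]

# usecols selects which columns to keep; it does not reorder them.
df = pd.read_csv(io.StringIO(csv_text), usecols=wanted)
print(list(df.columns))  # file order, not the order of `wanted`

# Workaround: reselect the columns in the desired order after reading.
df_ordered = df[wanted]
print(list(df_ordered.columns))  # same order as `wanted`
```

The same reselection can be chained directly onto the read_csv call, e.g. `pd.read_csv(path, usecols=wanted)[wanted]`.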