
I imported a CSV file that has 20 columns, and used the following to create a DataFrame containing only the columns I need:

import pandas as pd

df = pd.read_csv("./data/employees.csv",
    usecols=['ID', 'FIRSTNAME', 'LASTNAME', 'SALARY'],
    index_col=['ID'])

However, when I display df, the columns are not in the order I specified in usecols. Note that I also used index_col to make ID the index. How can I choose the column order at the time I import the data?

IamWarmduscher
  • This post might be useful - https://stackoverflow.com/questions/40024406/keeping-columns-in-the-specified-order-when-using-usecols-in-pandas-read-csv – Sajan Mar 23 '20 at 15:31
  • That was in the documentation but it didn't work because I used index_col to use the ID as the index of the data frame. – IamWarmduscher Mar 23 '20 at 15:38
  • Try this - `df = pd.read_csv('./data/employees.csv', usecols=['ID', 'FIRSTNAME', 'LASTNAME', 'SALARY'], index_col=['ID'])[['FIRSTNAME', 'LASTNAME', 'SALARY']]` – Sajan Mar 23 '20 at 15:48
  • It looks like the column that you specify in index_col should not be included in the column list that you select at the end. Thanks. – IamWarmduscher Mar 23 '20 at 16:35
  • Yes, that is correct. You are welcome! – Sajan Mar 23 '20 at 17:08
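
For anyone who wants to try the approach from the comments without the original file, here is a minimal sketch. It assumes a small in-memory CSV (the data, the extra DEPT column, and the `wanted` variable name are hypothetical, made up for illustration). usecols only filters columns and keeps them in file order, so the reordering is done by selecting the columns again after reading, leaving out the index column:

```python
import io
import pandas as pd

# Hypothetical stand-in for ./data/employees.csv.
# The columns are deliberately in a different order than the one we want.
csv_text = """SALARY,LASTNAME,DEPT,ID,FIRSTNAME
50000,Smith,Sales,1,Ann
62000,Jones,IT,2,Bob
"""

# Desired column order. ID is left out because it becomes the index.
wanted = ['FIRSTNAME', 'LASTNAME', 'SALARY']

df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=['ID'] + wanted,   # filters columns, does not reorder them
    index_col='ID',
)[wanted]                      # reorder by selecting in the desired order

print(df)
```

Including 'ID' in the final selection would raise a KeyError, since after index_col it is the index and no longer a regular column.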

0 Answers