I am trying to rename selected columns (say, the last two columns) in my data frame using `iloc` and `df.columns`, but it does not work and I can't figure out why. Here is a toy example of what I want to achieve:

import pandas as pd

d = {'one': list(range(5)),
     'two': list(range(5)),
     'three': list(range(5)),
     'four': list(range(5)),
     'five': ['a', 'b', 'c', 'd', 'e'],
     'six': ['a', 'b', 'c', 'd', 'e']
    }
df = pd.DataFrame(d)

df.iloc[:,-2:].columns = ['letter_1','letter_2']

df.columns

But I keep getting the original column names:

Index(['one', 'two', 'three', 'four', 'five', 'six'], dtype='object')

What am I missing?

Justyna
  • `df.columns=list(df.iloc[:,:-2].columns)+['letter_1','letter_2']` – anky Jun 27 '19 at 11:31
  • NOTICE: "the two last columns" of a df created from a `dict` is not well defined in Python versions earlier than 3.7 (or 3.6 as a CPython implementation detail), as dicts prior to that did not guarantee the order of keys. – Adam.Er8 Jun 27 '19 at 11:34
  • @Adam.Er8 I thought that once I convert a dictionary to data frame the locations of columns are fixed. If that is not the case, how can `iloc` work? – Justyna Jun 27 '19 at 11:36
  • it won't work on `df.columns` for a df created from a dict in Python 3.5 and earlier. Specifically in CPython 3.5, you'll get the columns sorted by alphabetical order – Adam.Er8 Jun 27 '19 at 11:39
  • @anky_91 That worked! I guess I can't change the names by subsetting the data frame like I tried, but overwriting the full list of names works just fine. Thank you! – Justyna Jun 27 '19 at 11:39
  • @Furqan Hashim I saw the question you linked as possible duplicate but as far as I can tell all answers there were relying on using names in one way or another and I needed a solution that uses only columns location (unless I missed something). – Justyna Jun 27 '19 at 11:44
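
A minimal sketch of what the comments above describe, assuming Python 3.7+ so that the dict key order is preserved. `df.iloc[:, -2:]` returns a new DataFrame object, so assigning to its `.columns` renames only that temporary object and leaves `df` untouched. Assigning a complete list of names back to `df.columns` does change `df`. The variable names `subset`, `unchanged`, and `renamed` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'one': list(range(5)),
    'two': list(range(5)),
    'three': list(range(5)),
    'four': list(range(5)),
    'five': ['a', 'b', 'c', 'd', 'e'],
    'six': ['a', 'b', 'c', 'd', 'e'],
})

# iloc returns a new DataFrame; renaming its columns does not touch df.
subset = df.iloc[:, -2:]
subset.columns = ['letter_1', 'letter_2']
unchanged = list(df.columns)

# Build the full list of names by position and assign it to df itself.
new_names = ['letter_1', 'letter_2']
df.columns = list(df.columns[:-len(new_names)]) + new_names
renamed = list(df.columns)

print(unchanged)
print(renamed)
```

This approach uses column positions only, so it should behave the same for any data frame that has at least `len(new_names)` columns.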

1 Answer

Just use `df.rename`:

import pandas as pd

d = {'one': list(range(5)),
     'two': list(range(5)),
     'three': list(range(5)),
     'four': list(range(5)),
     'five': ['a', 'b', 'c', 'd', 'e'],
     'six': ['a', 'b', 'c', 'd', 'e']
     }

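new_names = ['letter_1', 'letter_2']
# Map the last len(new_names) keys of d to the new names.
# Note: index=str also converts the row index labels to strings.
df = pd.DataFrame(d).rename(index=str, columns=dict(zip(list(d.keys())[-len(new_names):], new_names)))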

print(df.columns)

Output:

Index(['letter_1', 'four', 'one', 'letter_2', 'three', 'two'], dtype='object')

Adam.Er8
  • That won't work - it requires me to use the names of the columns I want to rename and there are quite a few of them - that's why I am trying to use location-based renaming. I want to rename a couple of the last columns, which have long names, in multiple data frames, so I don't want to use the names directly. – Justyna Jun 27 '19 at 11:33
  • @Justyna then what about `df.rename(columns=dict(zip(df.columns[-2:],['letter_1','letter_2'])))` – anky Jun 27 '19 at 11:36
  • OK, edited my answer, this will infer the number of columns to replace from the length of the `new_names` list – Adam.Er8 Jun 27 '19 at 11:36
  • @Adam.Er8 The slice should be `[-len(new_names):]` -- then it will do what I need. Thank you! – Justyna Jun 27 '19 at 11:41
  • Oh, correct... my bad :) – Adam.Er8 Jun 27 '19 at 11:42
  • I would also use `df_main.columns` rather than `d.keys()` because the data frame I work with is modified after I import the raw dictionary data. – Justyna Jun 27 '19 at 12:20
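
The last comment's suggestion can be sketched as a small helper that reads the old names from the data frame itself rather than from the original dict. The function name `rename_last_columns` and the shortened example frame are hypothetical and only for illustration. The column order shown assumes Python 3.7+.

```python
import pandas as pd


def rename_last_columns(df, new_names):
    """Return a copy of df whose last len(new_names) columns are renamed."""
    # Look up the current labels by position, then rename by label.
    old_names = df.columns[-len(new_names):]
    return df.rename(columns=dict(zip(old_names, new_names)))


df_main = pd.DataFrame({
    'one': list(range(5)),
    'four': list(range(5)),
    'five': ['a', 'b', 'c', 'd', 'e'],
    'six': ['a', 'b', 'c', 'd', 'e'],
})
df_main = rename_last_columns(df_main, ['letter_1', 'letter_2'])
print(list(df_main.columns))
```

Because `rename` matches on labels, an earlier column that shares a label with one of the last columns would be renamed too. If duplicate labels are possible, assigning a full list to `df.columns` is the safer choice.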