2

Let say I have an excel sheet like this,

enter image description here

If I read this file in pandas, I can get Column1, Column2, Column3 as headers.

However, I want to know/create an output possibly a dictionary that is like this,

{Column1: 'A', Column2: 'B', Column3: 'C'}

The reason is I have another dictionary from master mapping file (that already had the references for each column done manually) that has all the references to each Column like this,

{Column1: 'A', Column2: 'B', Column3: 'C', Column4: 'D'}

This way, I can cross check keys and values and then if there is any mismatch, I can identify those mismatches. How can I get the original column name such as A for Column1 etc.. while reading a file into pandas?? Any ideas??

jpp
  • 159,742
  • 34
  • 281
  • 339
user9431057
  • 1,203
  • 1
  • 14
  • 28
  • There is `usecols` in this `pandas` [link](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html) But does not help much in my case. – user9431057 Jul 09 '18 at 16:37

2 Answers2

1

You can use dict with zip to map column names to letters. Assumes you have a maximum of 26 columns.

from string import ascii_uppercase

df = pd.DataFrame(np.arange(9).reshape(3, 3),
                  columns=['Column1', 'Column2', 'Column3'])

d = dict(zip(df.columns, ascii_uppercase))

print(d)

{'Column1': 'A', 'Column2': 'B', 'Column3': 'C'}

For more than 26 columns, you can adapt the itertools.product solution available in Repeating letters like excel columns?

jpp
  • 159,742
  • 34
  • 281
  • 339
  • 1
    Nice way to think about this. I had no idea `ascii_uppecase` existed! – user9431057 Jul 09 '18 at 17:12
  • I was reading `ascii_uppercase`, my understanding is I am stuck at 26 columns for now unless, there is a way to concatenate `ascii_uppercase` characters to go more than 26 columns. i.e. In excel, after `Z`, it is `AA`, `AB` etc.. – user9431057 Jul 09 '18 at 17:51
0

You can use the Panadas rename method to replace the dataframe column names using your existing mapping dictionary:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rename.html

import pandas as pd

df = pd.DataFrame({'Column1': [1, 2], 'Column2': [3, 4], 'Column3': [5, 6]})

existing_mapping = {'Column1': 'A', 'Column2': 'B', 'Column3': 'C', 'Column4': 'D'}

df = df.rename(columns=existing_mapping) 
EnTwiZle
  • 23
  • 4