import pandas as pd
import numpy as np

data = {'Type_of_Institution': [np.nan, 'Institution of Nation Importance', 'Institution of Nation Importance', 'Institution of Nation Importance', 'Private Stand Alone'],
        'City': [np.nan, 'Bangalore', 'Kozhikode', 'Mumbai', 'Navi Mumbai'],
        'State': [np.nan, 'Karnataka', 'Kerala', 'Maharashtra', 'Maharashtra'],
        'Percent_Placed': [np.nan, 100.0, 89.0, 99.0, 75.0],
        'Percent_Placed_Sector': ['MNC', 50, 30, 40, 10],
        'Unnamed: 6': ['FMCG', np.nan, 20, 20, 20],
        'Unnamed: 7': ['Fintech', np.nan, 10, 10, 20],
        'Unnamed: 8': ['Consultancy', 40, 10, 20, 10],
        'Unnamed: 9': ['Financial', 10, 20, 10, 20],
        'Unnamed: 10': ['Technology', np.nan, 10, np.nan, 10],
        'Unnamed: 11': ['Others', np.nan, np.nan, np.nan, 10],
        'Salary': ['Average',  6.11, 5.46, 5.82, 3.6],
        'Unnamed: 13': ['Median', 6, 5, 5.96, 2.8],
        'Unnamed: 14': ['Max', 16, 15.2, 14, 14],
        'Unnamed: 15': ['Min', 3.4, 2.5, 2.84, 1.8]}


df = pd.DataFrame(data)

                Type_of_Institution         City        State  Percent_Placed Percent_Placed_Sector Unnamed: 6 Unnamed: 7   Unnamed: 8 Unnamed: 9 Unnamed: 10 Unnamed: 11   Salary Unnamed: 13 Unnamed: 14 Unnamed: 15
0                               NaN          NaN          NaN             NaN                   MNC       FMCG    Fintech  Consultancy  Financial  Technology      Others  Average      Median         Max         Min
1  Institution of Nation Importance    Bangalore    Karnataka           100.0                    50        NaN        NaN           40         10         NaN         NaN     6.11           6          16         3.4
2  Institution of Nation Importance    Kozhikode       Kerala            89.0                    30         20         10           10         20          10         NaN     5.46           5        15.2         2.5
3  Institution of Nation Importance       Mumbai  Maharashtra            99.0                    40         20         10           20         10         NaN         NaN     5.82        5.96          14        2.84
4               Private Stand Alone  Navi Mumbai  Maharashtra            75.0                    10         20         20           10         20          10          10      3.6         2.8          14         1.8

This is the data. I believe a loop can be used to make the change: I tried to write one, but it threw lots of errors. The dataframe is shown above.

Desired Output

                Type_of_Institution         City        State  Percent_Placed Percent_Placed_Sector  FMCG  Fintech  Consultancy  Financial  Technology  Others   Salary  Median   Max   Min
0                               NaN          NaN          NaN             NaN                   MNC  FMCG  Fintech  Consultancy  Financial  Technology  Others  Average  Median   Max   Min
1  Institution of Nation Importance    Bangalore    Karnataka           100.0                    50   NaN      NaN           40         10         NaN     NaN     6.11       6    16   3.4
2  Institution of Nation Importance    Kozhikode       Kerala            89.0                    30    20       10           10         20          10     NaN     5.46       5  15.2   2.5
3  Institution of Nation Importance       Mumbai  Maharashtra            99.0                    40    20       10           20         10         NaN     NaN     5.82    5.96    14  2.84
4               Private Stand Alone  Navi Mumbai  Maharashtra            75.0                    10    20       20           10         20          10      10      3.6     2.8    14   1.8
Trenton McKinney
Madfz

1 Answer

  • Loop through the columns, check whether each column name contains 'Unnamed', and then use pandas.DataFrame.rename with .loc to replace that name with the value from row 0.
  • The following code uses a for-loop, because a list comprehension would set the new column name as a side effect, which is unpythonic.
for col in df.columns:
    if 'Unnamed' in col:
        df.rename(columns={col: df.loc[0, col]}, inplace=True)
  • Alternatively, use a dict-comprehension to create a dict with all the names to change, then rename them in a single call.
new_names = {col: df.loc[0, col] for col in df.columns if 'Unnamed' in col}

# update the names
df = df.rename(new_names, axis=1)
  • Alternatively, assign a new list to df.columns, which can use a list-comprehension.
  • Keep the existing column name, col, if 'Unnamed' isn't in the current name, otherwise replace col with the value from the first row of col, df.loc[0, col].
df.columns = [col if 'Unnamed' not in col else df.loc[0, col] for col in df.columns]
  • If row 0 is no longer needed after renaming, df = df.drop(0, axis=0).reset_index(drop=True) will remove row 0 and reset the index.
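The steps above can be combined into one small function. This is only a sketch under a few assumptions: the helper name promote_unnamed_headers and its marker parameter are made up for illustration, the sub-headers are assumed to sit in the row labelled 0, and the frame below is a cut-down stand-in for the data in the question.

```python
import numpy as np
import pandas as pd

# Cut-down stand-in for the question's data: row 0 holds the sub-headers
df = pd.DataFrame({
    'City': [np.nan, 'Bangalore', 'Mumbai'],
    'Percent_Placed_Sector': ['MNC', 50, 40],
    'Unnamed: 6': ['FMCG', np.nan, 20],
    'Salary': ['Average', 6.11, 5.82],
    'Unnamed: 13': ['Median', 6, 5.96],
})


def promote_unnamed_headers(frame, marker='Unnamed'):
    """Hypothetical helper: rename columns whose name contains `marker`
    to the value found in row 0, then drop row 0 and reset the index."""
    # Map each placeholder column name to the value stored in its first row
    new_names = {col: frame.loc[0, col] for col in frame.columns if marker in col}
    # rename returns a new frame, so the caller's frame is left unchanged
    renamed = frame.rename(columns=new_names)
    # Remove the old sub-header row and renumber the remaining rows from 0
    return renamed.drop(0, axis=0).reset_index(drop=True)


result = promote_unnamed_headers(df)
print(result.columns.tolist())
# ['City', 'Percent_Placed_Sector', 'FMCG', 'Salary', 'Median']
```

Columns such as Percent_Placed_Sector and Salary keep their original names, which matches the desired output in the question. Skip the drop step if row 0 should stay in the frame.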
Trenton McKinney