
Super newbie Python/Pandas question.

I am trying to read a folder of Excel workbooks with multiple sheets, extract the column headers as a list, and add each list as a new column in a pandas DataFrame. Here's the code I have so far:

import pandas as pd
import numpy as np

filepath = '/content/data.xlsx'

workbook = pd.read_excel(filepath, None, nrows=0)

variables_frame = []

for sheet_name, sheet in workbook.items():
  variables = sheet.columns
  variables_list = list(variables)
  variables_frame = pd.DataFrame.insert(sheet+1, sheet_name, [variables_list])
  print(variables_frame)

However, I get the error "TypeError: insert() missing 1 required positional argument: 'value'" when I try to run this. Any ideas why?

Additionally, if this is not the right way to go about this I'd appreciate any more general feedback. Thank you!

Sourav Dutta

1 Answer

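The problem is that you are calling `insert` directly on the class `pd.DataFrame` instead of on a DataFrame instance. Called that way, your first argument is bound to `self`, every other argument shifts one position to the left, and `value` ends up missing. A minimal illustration: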

class A:
    def insert(self, x):
        print(f'insert {x}')

# Right
A().insert(3)

# Wrong: 3 is bound to self, so x is missing and Python raises a TypeError
A.insert(3)
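
To make the argument shift concrete, here is a small sketch that reuses the toy class `A`. As my own tweak for illustration, `insert` returns the string instead of printing it. It shows that calling through the class only works if you pass the instance yourself, and that leaving it out produces the same kind of `TypeError` as in the question.

```python
class A:
    def insert(self, x):
        # Return instead of print so the result can be inspected
        return f'insert {x}'

a = A()

# Calling on an instance: Python passes `a` as self automatically
bound_result = a.insert(3)

# Calling on the class: you have to supply the instance yourself
unbound_result = A.insert(a, 3)

# Calling on the class without an instance: 3 is taken as self, x is missing
try:
    A.insert(3)
except TypeError as exc:
    error_message = str(exc)

print(bound_result, unbound_result)
print(error_message)
```

The same reasoning applies to `pd.DataFrame.insert(...)`: the first value passed is treated as the DataFrame itself.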

You may want something like this instead:

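variables_frame = pd.DataFrame()
for sheet_name, sheet in workbook.items():
  variables = sheet.columns
  variables_list = list(variables)
  # loc must be an integer column position, so append at the end
  loc = len(variables_frame.columns)
  # Wrapping the list stores all headers of one sheet in a single cell
  variables_frame.insert(loc, sheet_name, [variables_list])
  print(variables_frame)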
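
As for the more general feedback asked for in the question, one possible alternative (a sketch, not the only way) is to skip `insert` and build the frame in one go from a dict of Series, so each sheet becomes a column and shorter header lists are padded with NaN. The sheet names and headers below are made up, and the `workbook` dict is a stand-in for what `pd.read_excel(filepath, sheet_name=None, nrows=0)` returns, so the snippet runs without an Excel file.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel(filepath, sheet_name=None, nrows=0):
# a dict mapping sheet names to empty DataFrames that only carry headers
workbook = {
    "sheet_a": pd.DataFrame(columns=["id", "name", "price"]),
    "sheet_b": pd.DataFrame(columns=["id", "date"]),
}

# One Series of header names per sheet
headers = {
    sheet_name: pd.Series(list(sheet.columns), dtype="object")
    for sheet_name, sheet in workbook.items()
}

# Series of different lengths are aligned on their index; gaps become NaN
variables_frame = pd.DataFrame(headers)
print(variables_frame)
```

For a folder of workbooks, the same dict could be filled inside a loop over the files, using a key such as file name plus sheet name.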
Ynjxsjmh
  • Thanks so much for the reply! I want to use the pandas `pandas.DataFrame.insert` method though, and I think your answer is using the default Python `.insert`. Correct me if I'm wrong. – traitortots Jun 03 '22 at 18:12
  • @CalebSmith `insert` is a method defined in class [`DataFrame`](https://github.com/pandas-dev/pandas/blob/v1.4.2/pandas/core/frame.py#L4381-L4445) just like the example `insert` is a method defined in class `A`. – Ynjxsjmh Jun 03 '22 at 18:13