-2

I have 3 folders of Excel data and was asked to create a machine learning model using that data. The problem is that the data does not have headers.

How can I import all of those folders of data in Python?

desertnaut
  • Can you show us the folder structure, the data structure, and possibly if you've tried anything? – CumminUp07 Nov 12 '22 at 15:15
  • Thanks for your comment. Actually, I haven't tried anything. I was provided 3 folders containing Excel files. Every folder has 20 Excel files, and none of them has column names, which makes it difficult for me to know which variables to use. – Sounak Sarkar Nov 12 '22 at 15:23

1 Answer

0

Python won't tell you the names of the columns. What Python can do is help you import and/or concatenate all of the Excel files easily.

To import them in bulk:

import os
import pandas as pd

# Set the source directory
source_directory = "xx"

# List files in a specific folder
os.listdir(source_directory)

# Read every file into a dict keyed by "df_" + name of file
# (header=None because the data has no header row)
dfs = {}
for file in os.listdir(source_directory):
    file_name = os.path.splitext(file)[0]
    dfs["df_" + file_name] = pd.read_excel(os.path.join(source_directory, file), header=None)

You could also wrap this in an outer loop to read the files in every directory you need.
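One way the outer loop over several folders might look is sketched below. The `load_folders` helper and the folder names in the usage comment are hypothetical, and `header=None` is assumed because the files have no header row. The `reader` argument defaults to `pd.read_excel` and is only a parameter so the helper can be tried on other file types.

```python
from pathlib import Path

import pandas as pd


def load_folders(folders, reader=pd.read_excel, pattern="*.xlsx"):
    """Read every matching file in each folder into a nested dict.

    Returns {folder_name: {file_stem: DataFrame}}.
    """
    data = {}
    for folder in folders:
        folder = Path(folder)
        data[folder.name] = {
            # header=None keeps the first row as data instead of column names
            path.stem: reader(path, header=None)
            for path in sorted(folder.glob(pattern))
        }
    return data


# Hypothetical usage; replace the folder names with your own:
# data = load_folders(["folder_1", "folder_2", "folder_3"])
# data["folder_1"]["some_file"].head()
```

Keeping the frames in a dict keyed by folder and file name makes it easy to see where each table came from before deciding which ones to combine.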

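In case you need to concatenate all of the Excel files, supposing they have the same structure, you could use pandas concat (DataFrame.append was removed in pandas 2.0, and it never modified df in place anyway). It would be something like this:

# Read every file (header=None because the data has no header row)
frames = [
    pd.read_excel(os.path.join(source_directory, file), header=None)
    for file in os.listdir(source_directory)
]
# Stack the frames vertically and renumber the rows
df = pd.concat(frames, ignore_index=True)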

Regarding how to add a header row on the files, here is an answer
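As a minimal sketch of adding headers once a file is loaded, you can assign a list of names to `df.columns`. The column names and the sample values below are hypothetical placeholders; the real names have to come from whoever supplied the data.

```python
import pandas as pd

# Stand-in for a headerless sheet loaded with pd.read_excel(path, header=None);
# the values are made up for illustration.
df = pd.DataFrame([[1.0, 2.0, 0], [3.0, 4.0, 1]])

# Hypothetical column names: one name per column, in order.
column_names = ["feature_1", "feature_2", "target"]
if len(column_names) != df.shape[1]:
    raise ValueError("Need exactly one name per column")
df.columns = column_names
```

The same names can also be passed at load time, for example `pd.read_excel(path, header=None, names=column_names)`.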

  • 1
    Thank you so much for the answer. I was having real difficulty solving this problem, and I guess this can help me solve it. – Sounak Sarkar Nov 12 '22 at 16:15