0

I cannot load multiple Excel files from a directory into a single DataFrame. I have tried two different ways and neither works.

The code below gives me the error shown at the end.

How can I solve the problem? It does find the files when it creates the list, but then it cannot open them to load into the DataFrame. Any hints?

import pandas as pd
import os
import glob
import xlrd

cwd = os.getcwd()
cwd

path = '/Users/giovanni/Desktop/news media'
files = os.listdir(path)
files


files_xls = [f for f in files if f[-3:] == 'lsx']
files_xls

df = pd.DataFrame()

for f in files_xls:
    data = pd.read_excel(f)
    df = df.append(data)

FileNotFoundError: [Errno 2] No such file or directory: 'NOV.xlsx'
Nihal
  • Are you using Windows? – Nihal Mar 18 '19 at 11:41
  • You need to provide a little bit more information: are all the Excel files in the same format (i.e. same sheet name with data, same columns)? – Nidal Mar 18 '19 at 11:42
  • check this, it serves your purpose better: https://stackoverflow.com/questions/28669482/appending-pandas-dataframes-generated-in-a-for-loop – anky Mar 18 '19 at 11:43
  • I am using a Mac. All the files are in the same format. It does find all the files in the directory, so when I run files_xls all the names appear as output, but then it gives me this error – jonny Bravo Mar 18 '19 at 11:48
  • Please change the spelling of your file extension, which should be 'xls', not 'lsx' – Ghanshyam Savaliya Mar 18 '19 at 11:55
  • @Nidal all the Excel files are in the same format, same sheet name with same data and same columns – jonny Bravo Mar 18 '19 at 11:55
  • @Ghanshyam that's only to find the list of files that I want to load – jonny Bravo Mar 18 '19 at 11:56
  • @jonny Bravo, set path = os.getcwd() and then run the code (make sure that all your required files are available in your cwd). Currently, the code is searching for your files in the default directory. – Ghanshyam Savaliya Mar 18 '19 at 12:17

2 Answers

2

Try this:

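import os
import glob
import pandas as pd

path = '/Users/giovanni/Desktop/news media'
frames = []
# glob.glob returns paths that already include the directory,
# so read_excel can locate each file regardless of the cwd
for file in glob.glob(os.path.join(path, '*.xlsx')):
    data = pd.read_excel(file)
    print(data)
    frames.append(data)
# DataFrame.append was removed in pandas 2.0; pd.concat combines the frames
# (it raises ValueError if no files matched)
df = pd.concat(frames)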
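
On newer pandas versions (2.0 and later) `DataFrame.append` is no longer available, so the frames are usually collected in a list and combined with `pd.concat`. The following is only a minimal sketch of how the combined index behaves; the two small in-memory frames and their column names are made up for illustration and stand in for whatever `pd.read_excel` returns for each workbook.

```python
import pandas as pd

# Hypothetical data: two small frames standing in for the sheets
# that pd.read_excel would return for each workbook.
nov = pd.DataFrame({"title": ["a", "b"], "views": [10, 20]})
dec = pd.DataFrame({"title": ["c"], "views": [30]})

frames = [nov, dec]

# Default: every frame keeps its own row labels, so labels can repeat.
stacked = pd.concat(frames)
print(list(stacked.index))

# ignore_index=True renumbers the rows of the combined frame from 0.
renumbered = pd.concat(frames, ignore_index=True)
print(list(renumbered.index))
```

If the row labels of the individual files carry no meaning, `ignore_index=True` is probably the more convenient choice, since the result then has one unique label per row.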
Loochie
1

Replace your final loop with:

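frames = []
for f in files_xls:
    # os.listdir returns bare file names, so prepend the directory
    full_path = os.path.join(path, f)
    frames.append(pd.read_excel(full_path))
# DataFrame.append was removed in pandas 2.0; pd.concat combines the frames
df = pd.concat(frames)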
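
To see why joining the directory matters, here is a small sketch using only the standard library. It assumes a throwaway temporary directory in place of the real folder and an empty placeholder file named `NOV.xlsx`; both are stand-ins for illustration, not the asker's actual data.

```python
import os
import tempfile

# A temporary directory stands in for '/Users/giovanni/Desktop/news media'.
with tempfile.TemporaryDirectory() as path:
    # Empty placeholder file; its contents do not matter for this check.
    open(os.path.join(path, "NOV.xlsx"), "w").close()

    names = os.listdir(path)  # bare file names, without the directory part
    name = names[0]

    # A bare name is resolved against the current working directory ...
    resolved_from_cwd = os.path.abspath(name)
    # ... while the file actually lives inside `path`.
    full_path = os.path.join(path, name)

    bare_name_has_dir = os.path.dirname(name) != ""
    full_path_exists = os.path.exists(full_path)
    same_location = (
        os.path.realpath(resolved_from_cwd) == os.path.realpath(full_path)
    )

print(names)
print(bare_name_has_dir, full_path_exists, same_location)
```

Because `os.listdir` returns names only, passing them straight to `pd.read_excel` makes pandas look in the working directory, which is consistent with the `FileNotFoundError` for `'NOV.xlsx'` in the question.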
Nidal