
I am learning Python on the go and using pandas for the first time as well. I have a directory with about 50 Excel workbooks that I am trying to combine into one.

import openpyxl
import pandas as pd
import numpy as np
import glob
import os
import sys

#path = "\\\\mtrjesmith\\Service Parts Photography Project\\STERISForms"
files = os.listdir("\\\\mtrjesmith\\Service Parts Photography Project\\STERISForms")
outf = "C:\\Python27\\Scripts\\steris_forms\\compiled.xls", "w+b"
#print(files)

frame = [x.parse(x.sheet_names[0], header=None,index_col=None) for x in files]
frame[1:] = [df[1:] for df in frame[1:]]
combined = pd.concat(frame)
combined.to_excel("C:\\Python27\\Scripts\\steris_forms\\compiled.xls", "w+b", header=False, index=False)

I get the following error:

Traceback (most recent call last):
File "C:\Python27\Scripts\steris_forms.py", line 18, in <module>
    frame = [x.parse(x.sheet_names[0], header=None,index_col=None) for x in files]
AttributeError: 'str' object has no attribute 'parse'

What can I do to solve this? Any other feedback would be greatly appreciated.

martineau
Jason Kral
  • The problem is with `x.parse`: `x` is a string and does not have a parse method. – brianpck Dec 21 '16 at 18:43
  • What are you expecting `x.parse` to do? – Daniel Roseman Dec 21 '16 at 18:44
  • try pd.ExcelFile.parse – Steve Dec 21 '16 at 18:45
  • I am wondering about this line -- outf = "C:\\Python27\\Scripts\\steris_forms\\compiled.xls", "w+b" – Dinesh Pundkar Dec 21 '16 at 18:46
  • The goal is to parse the xls sheets for the data, pass it along, then concatenate it and write it to a new xls. – Jason Kral Dec 21 '16 at 18:47
  • @JasonKral - Check this - http://stackoverflow.com/questions/25400240/using-pandas-combining-merging-2-different-excel-files-sheets – Dinesh Pundkar Dec 21 '16 at 18:49
  • @DineshPundkar I will check it out and see if I can figure it out. – Jason Kral Dec 21 '16 at 18:52
  • @Steve that worked, but then I get the same error on '(x.sheet_name[0]' – Jason Kral Dec 21 '16 at 18:53
  • @DineshPundkar I followed the link you provided and used the same code, but I keep getting an IO permissions error on the line 'date = pd.read_excel(f, "Sheet2")'. I verified I have full permissions on the directory. I also tried setting up a test folder with full permissions and I get the same error. The permissions error confuses me even more than my original issue. – Jason Kral Dec 22 '16 at 16:12
  • @JasonKral -- Before executing the code, please make sure that the Excel sheet you are trying to read is closed. – Dinesh Pundkar Dec 22 '16 at 16:26
  • @DineshPundkar the first file in the directory was, in fact, open. Unfortunately even after I close it I get the same permissions error. And just to confirm, I verified I had NO Excel files open, and also made sure I closed all folders as well... – Jason Kral Dec 22 '16 at 16:41
  • @DineshPundkar I found the issue with the permissions. I had the variable "path" to be the file path of the directory that contains the spreadsheets since I need to parse them all. If I set it to just look at one sheet in the path I have no issues. How can I load the entire directory and look at it that way, since I have many files? – Jason Kral Dec 22 '16 at 17:58

2 Answers


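`os.listdir` returns the file names as plain strings, and a string has no `parse` method. Pass each name to `pd.read_excel` instead: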

frame = [pd.read_excel(x, header=None, index_col=None) for x in files]
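
A fuller sketch of the whole job, in case it helps. It assumes a reasonably recent pandas with an Excel engine such as openpyxl installed. The helper names `list_workbooks` and `combine_frames`, the folder path, and the output file name are all hypothetical. Two details differ from the question's code: `glob` is used because `os.listdir` returns bare file names rather than full paths, and `to_excel` is given only a path, since its second positional argument is the sheet name, not a file mode such as "w+b".

```python
import glob
import os

import pandas as pd


def list_workbooks(folder, pattern="*.xls*"):
    """Return the sorted full paths of the workbooks found in `folder`."""
    # glob yields full paths, so each one can be handed straight to read_excel.
    paths = glob.glob(os.path.join(folder, pattern))
    # Skip Excel's "~$..." lock files, which exist while a workbook is open.
    return sorted(p for p in paths if not os.path.basename(p).startswith("~$"))


def combine_frames(frames):
    """Stack the frames, dropping the first (header) row of all but the first."""
    frames = list(frames)
    trimmed = frames[:1] + [df.iloc[1:] for df in frames[1:]]
    return pd.concat(trimmed, ignore_index=True)


if __name__ == "__main__":
    folder = r"C:\path\to\workbooks"  # hypothetical location, adjust as needed
    frames = [pd.read_excel(p, header=None) for p in list_workbooks(folder)]
    if frames:  # pd.concat raises on an empty list, so guard against it
        combined = combine_frames(frames)
        # "compiled.xlsx" is a placeholder output name.
        combined.to_excel("compiled.xlsx", header=False, index=False)
```

Reading one file at a time this way also avoids handing the directory itself to `pd.read_excel`, which expects the path of a single workbook.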
Steve
  • that cleared up the issue, but now getting another: TypeError: unbound method parse() must be called with ExcelFile instance as first argument (got str instance instead) on the same line. I assume it is seeing "x" as a str... – Jason Kral Dec 22 '16 at 16:09
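
The TypeError in the comment above is what happens when `parse` is called on the `ExcelFile` class with a string instead of on an instance. If the `parse` / `sheet_names` style from the question is preferred over `pd.read_excel`, a minimal sketch looks like this (the function name `read_first_sheet` is made up for illustration):

```python
import pandas as pd


def read_first_sheet(path):
    """Read the first sheet of one workbook, treating no row as a header."""
    # parse() and sheet_names belong to an ExcelFile *instance*,
    # so build one from the path before calling them.
    with pd.ExcelFile(path) as workbook:
        first = workbook.sheet_names[0]
        return workbook.parse(first, header=None, index_col=None)
```

Each entry of `files` would then be passed through this function, provided the entries are full paths to individual workbooks.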

You just need to import `parse`, which is a submodule of the `urllib` library:

import urllib.parse
sinammd