0

Have read all of the threads on this but I'm still hitting a dead end. Just trying to take all the csv's in a directory and add them as new sheets to a new xlsx workbook. Here's what I've got:

import xlwt, csv, os, glob

def make_excel_workbook(path):
    wb = xlwt.Workbook()
    for filename in os.listdir(folder_path):
        if filename.endswith('.csv'):
            ws = wb.add_sheet(os.path.splitext(filename)[0])
            with open('{}\\{}'.format(folder_path, filename), 'rb') as csvfile:
                reader = csv.reader(csvfile, delimiter=',')
                for rowx, row in enumerate(reader):
                    for colx, value in enumerate(row):
                        ws.write(rowx, colx, value)
    return wb

csvDir = "C:\\Temp\\Data\\outfiles"
outDir = "C:\\Temp\\Data\\output"

os.chdir(csvDir)
csvFileList = []
searchTerm = "character string"

for file in glob.glob('*.csv'):
    csvFileList.append(file)

for i in csvFileList: # search a set of extant csv files for a string and make new csv files filtered on the search term
    csv_file = csv.reader(open(i, 'rb'), delimiter=',')
    rowList = []
    for row in csv_file:
        for field in row:
            if searchTerm in field:
                rowList.append(row)
    outputCsvFile = os.path.join(rootDir, i)
    with open(outputCsvFile, 'wb') as newCsvFile:
        wr = csv.writer(newCsvFile, quoting=csv.QUOTE_ALL)
        wr.writerows(rowList)

So far, it works, and creates the new csv files from the original, much larger ones. Here's where it breaks:

if __name__ == '__main__':
    xls = make_excel_workbook(outDir)
    xls_name = "My_Team_Tasks"
    xls.save('{}\\{}{}.'format(outDir, xls_name, '.xls'))
    print('{}\\{}{} saved successfully'.format(outDir, xls_name, '.xls'))

when it gets to xls.save, it gives me the following error:

Update: here's the entire traceback:

Traceback (most recent call last):
    File"M:/Testing/scripts/csv_parse.py", line 44, in <module>
        xls.save('{}\\{}{}'.format(rootDir, xls_name, '.xls'))
    File "C:\Python27\ArcGIS10.4\lib\site-packages\xlwt\Workbook.py", line 696, in save
        doc.save(filename_or_stream, self.get_biff_data())
    File "C:\Python27\ArcGIS10.4\lib\site-packages\xlwt\Workbook.py", line 660, in get_biff_data
        shared_str_table   = self.__sst_rec()
    File "C:\Python27\ArcGIS10.4\lib\site-packages\xlwt\Workbook.py", line 662, in __sst_rec
        return self.__sst.get_biff_record()
    File "C:\Python27\ArcGIS10.4\lib\site-packages\xlwt\BIFFRecords.py", line 77, in get_biff_record
        self._add_to_sst(s)
    File "C:\Python27\ArcGIS10.4\lib\site-packages\xlwt\BIFFRecords.py", line 92, in _add_to_sst
        u_str = upack2(s, self.encoding)
    File "C:\Python27\ArcGIS10.4\lib\site-packages\xlwt\UnicodeUtils.py", line 50, in upack2
        us = unicode(s, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 69: ordinal not in range (128)
martineau
  • 119,623
  • 25
  • 170
  • 301
auslander
  • 507
  • 2
  • 5
  • 14

1 Answers1

0

Do you know how the input CSV files are encoded? It appears from the error message to be unicode?

You can try:

wb = xlwt.Workbook(encoding='utf-8')

Failing that, as per this answer (xlwt module - saving xls unicode error) it seems another possible way to get around this issue is to encode your text into unicode before writing out.

ws.write(rowx, colx, value.decode('utf-8'))

Again, it depends on how your inputs are encoded.

Qichao Zhao
  • 191
  • 7
  • Mr. Quichao, the second suggestion, writing the rows out as UTF-8, worked perfectly! I am now getting exactly the output I expect. Many thanks. – auslander Nov 28 '16 at 15:05