0

I'm new to groovy, and I've used the ExcelBuilder code referenced below to iterate through an excel spreadsheet to grab data. Is there an easy to write data as I iterate?

For example, row 1 might have data like this (CSV):

value1,value2

And after I iterate, I want it to look like this:

value1,value2,value3

http://www.technipelago.se/content/technipelago/blog/44

Adam Zalcman
  • 26,643
  • 4
  • 71
  • 92
Scott Offen
  • 6,933
  • 3
  • 21
  • 24
  • It shouldn't be hard to enhance the builder so that it updates data in a spreadsheet in place. But it depends on what exactly you want to do. Do you want to introduce a new column ? Please provide more details. – Kai Sternad Jan 29 '12 at 10:06
  • I'm sure that you can and would be surprised if you need to enhance the builder to achieve it. Could you post some code and we'll see what we can do? – Bob Kuhar Jan 29 '12 at 21:30
  • I don't actually have any code other than what is posted on that blog. I'll try to mock something up and post it for comment. – Scott Offen Jan 30 '12 at 23:13

1 Answers1

5

Yes, this can be done! As I got into the guts of it, I realized that the problem I was trying to solve wasn't the same as trying to read and write from the same file at the same time, but rather the excel data was stored in an object that I could freely manipulate whenever I wanted. So I added methods specific to my needs - which may or many not meet the needs of anyone else - and I post them here for smarter people to pick apart. At the end of it all, it is now doing what I want it to do.

I added a cell method that takes an index (number or label) and a value which will update a cell for the current row in context (specifically while using .eachLine()), and a .putRow() method that adds a whole row to the spreadsheet specified. It also handles Excel 2003, 2007, and 2010 formats. When files, sheets, or cells referenced don't exist, they get created. Since my source spreadsheets often have formulas and charts ready to display the data I'm entering, the .save() and .saveAs() methods call .evaluateAllFormulaCells() before saving.

To see the code I started with and examples of how it works, check out this blog entry

Note that both the .save() and .saveAs() methods reload the workbook from the saved file immediately after saving. This is a workaround to a bug in Apache POI that doesn't seem to be fixed yet (see Exception when writing to the xlsx document several times using apache poi).

import groovy.lang.Closure;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Map;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.ss.usermodel.DateUtil;
import org.apache.poi.xssf.usermodel.XSSFFormulaEvaluator;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFDateUtil;
import org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator;

class Excel
{
    def workbook;
    def sheet;
    def labels;
    def row;
    def infilename;
    def outfilename;

    Excel(String fileName)
    {
        HSSFRow.metaClass.getAt = {int index ->
            def cell = delegate.getCell(index);
            if(! cell)
            {
                return null;
            }

            def value;

            switch (cell.cellType)
            {
                case HSSFCell.CELL_TYPE_NUMERIC:
                    if(HSSFDateUtil.isCellDateFormatted(cell))
                    {
                        value = cell.dateCellValue;
                    }
                    else
                    {
                        value = new DataFormatter().formatCellValue(cell);
                    }
                    break;
                case HSSFCell.CELL_TYPE_BOOLEAN:
                    value = cell.booleanCellValue
                    break;
                default:
                    value = new DataFormatter().formatCellValue(cell);
                    break;
            }

            return value
        }

        XSSFRow.metaClass.getAt = {int index ->
            def cell = delegate.getCell(index);
            if(! cell)
            {
                return null;
            }
            def value = new DataFormatter().formatCellValue(cell);

            switch (cell.cellType)
            {
                case XSSFCell.CELL_TYPE_NUMERIC:
                    if (DateUtil.isCellDateFormatted(cell))
                    {
                        value = cell.dateCellValue;
                    }
                    else
                    {
                        value = new DataFormatter().formatCellValue(cell);
                    }
                    break;
                case XSSFCell.CELL_TYPE_BOOLEAN:
                    value = cell.booleanCellValue
                    break;
                default:
                    value = new DataFormatter().formatCellValue(cell);
                    break;
            }

            return value;
        }

        infilename = fileName;
        outfilename = fileName;

        try
        {
            workbook = WorkbookFactory.create(new FileInputStream(infilename));
        }
        catch (FileNotFoundException e)
        {
            workbook = (infilename =~ /(?is:\.xlsx)$/) ?  new XSSFWorkbook() : new HSSFWorkbook();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

    def getSheet(index)
    {
        def requested_sheet;

        if(!index) index = 0;

        if(index instanceof Number)
        {
            requested_sheet = (workbook.getNumberOfSheets >= index) ? workbook.getSheetAt(index) : workbook.createSheet();
        }
        else if (index ==~ /^\d+$/)
        {
            requested_sheet = (workbook.getNumberOfSheets >= Integer.valueOf(index)) ? workbook.getSheetAt(Integer.valueOf(index)) : workbook.createSheet();
        }
        else
        {
            requested_sheet = (workbook.getSheetIndex(index) > -1) ? workbook.getSheet(index) : workbook.createSheet(index);
        }

        return requested_sheet;
    }

    def cell(index)
    {
        if (labels && (index instanceof String))
        {
            index = labels.indexOf(index.toLowerCase());
        }

        if (row[index] == null)
        {
            row.createCell(index);
        }

        return row[index];
    }

    def cell(index, value)
    {
        if (labels.indexOf(index.toLowerCase()) == -1)
        {
            labels.push(index.toLowerCase());

            def frow  = sheet.getRow(0);
            def ncell = frow.createCell(labels.indexOf(index.toLowerCase()));
            ncell.setCellValue(index.toString());
        }

        def cell = (labels && (index instanceof String)) ? row.getCell(labels.indexOf(index.toLowerCase())) : row.getCell(index);

        if (cell == null)
        {
            cell = (index instanceof String) ? row.createCell(labels.indexOf(index.toLowerCase())) : row.createCell(index);
        }

        cell.setCellValue(value);
    }

    def putRow (sheetName, Map values = [:])
    {
        def requested_sheet = getSheet(sheetName);
        if (requested_sheet)
        {
            def lrow;
            if (requested_sheet.getPhysicalNumberOfRows() == 0)
            {
                lrow = requested_sheet.createRow(0);
                def lcounter = 0;
                values.each {entry->
                    def lcell = lrow.createCell(lcounter);
                    lcell.setCellValue(entry.key);
                    lcounter++;
                }
            }
            else
            {
                lrow = requested_sheet.getRow(0);
            }

            def sheetLabels = lrow.collect{it.toString().toLowerCase()}
            def vrow = requested_sheet.createRow(requested_sheet.getLastRowNum() + 1);
            values.each {entry->
                def vcell = vrow.createCell(sheetLabels.indexOf(entry.key.toLowerCase()));
                vcell.setCellValue(entry.value);
            }
        }
    }

    def propertyMissing(String name)
    {
        cell(name);
    }

    def propertyMissing(String name, value)
    {
        cell(name, value);
    }

    def eachLine (Map params = [:], Closure closure)
    {
        /*
         * Parameters:
         * skiprows    : The number of rows to skip before the first line of data and/or labels
         * offset      : The number of rows to skip (after labels) before returning rows
         * max         : The maximum number of rows to iterate
         * sheet       : The name (string) or index (integer) of the worksheet to use
         * labels      : A boolean to treat the first row as a header row (data can be reference by label)
         *
         */
        def skiprows = params.skiprows ?: 0;
        def offset = params.offset ?: 0;
        def max = params.max ?: 9999999;
        sheet = getSheet(params.sheet);
        def rowIterator = sheet.rowIterator();
        def linesRead = 0;

        skiprows.times{ rowIterator.next() }

        if(params.labels)
        {
            labels = rowIterator.next().collect{it.toString().toLowerCase()}
        }

        offset.times{ rowIterator.next() }

        closure.setDelegate(this);

        while(rowIterator.hasNext() && linesRead++ < max)
        {
            row = rowIterator.next();
            closure.call(row);
        }
    }

    def save ()
    {
        if (workbook.getClass().toString().indexOf("XSSF") > -1)
        {
            XSSFFormulaEvaluator.evaluateAllFormulaCells((XSSFWorkbook) workbook);
        }
        else
        {
            HSSFFormulaEvaluator.evaluateAllFormulaCells((HSSFWorkbook) workbook);
        }

        if (outfilename != null)
        {
            try
            {
                FileOutputStream output = new FileOutputStream(outfilename);
                workbook.write(output);
                output.close();

                workbook = null;
                workbook = WorkbookFactory.create(new FileInputStream(outfilename));
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }
        }
    }

    def saveAs (String fileName)
    {
        if (workbook.getClass().toString().indexOf("XSSF") > -1)
        {
            XSSFFormulaEvaluator.evaluateAllFormulaCells((XSSFWorkbook) workbook);
        }
        else
        {
            HSSFFormulaEvaluator.evaluateAllFormulaCells((HSSFWorkbook) workbook);
        }

        try
        {
            FileOutputStream output = new FileOutputStream(fileName);
            workbook.write(output);
            output.close();
            outfilename = fileName;

            workbook = null;
            workbook = WorkbookFactory.create(new FileInputStream(outfilename));
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}

If you see any glaring errors or ways to improve (other than style), I'd love to hear them. Again, Groovy is not a language I have much experience with, and I haven't done anything with Java in several years, so I'm sure there might be some better ways to do things.

Community
  • 1
  • 1
Scott Offen
  • 6,933
  • 3
  • 21
  • 24
  • Just two comments so far: your first assignment to value in XSSFRow is the default case in your switch. And I don't understand why you're printing to System.out in putRow. Other than that: great work! – mabi Apr 09 '13 at 09:24
  • Good catch on the System.out, most likely just some leftover debugging/testing code. I edited that out. – Scott Offen Apr 22 '13 at 17:20