0

I'm trying to fill an Excel sheet using powershell :

0. Declaring Excel object

$Excel = New-Object -ComObject Excel.Application
$Excel.DisplayAlerts = $false
$Excel.ScreenUpdating = $false
$Excel.DisplayStatusBar = $false
$Excel.EnableEvents = $false
$Excel.Visible = $False

1. Reading data from a database table :

$dt1 = New-Object System.Data.Dataset

2. Getting the table :

$dt_table1 = $dt1.Tables[0]

3. Filling Excel file :

for ([Int]$m = 0; $m -lt $dt_table1.Rows.Count; $m++)
{
    for ([Int]$r = 0; $r -lt $dt_table1.Columns.Count; $r++)
    {
        $incolumn = $r + 1;
        $inrow = $inheaderlenght + 2 + $m;
        if($incolumn -gt 2)
        {
            $Workbook.ActiveSheet.Cells.Item($inrow, $incolumn) = [System.Convert]::ToDecimal($dt_table1.Rows[$m].ItemArray[$r])
        }
        else
        {
            $Workbook.ActiveSheet.Cells.Item($inrow, $incolumn) = $dt_table1.Rows[$m].ItemArray[$r].ToString()
        }
    }
}

With a few hundreds of rows the sheet is filling in seconds, the problem is when i got thousands of rows, is very slow, for example to fill 21.500 rows it need 15 min at least.

I'm executing this code in my production server, with 32GB of RAM and an Intel Xeon processor.

I would like to improve the performance, i need to fill an Excel file with 32 sheets and only few sheets have thousands of rows.

UPDATE: I wanted to fill directly an array into the Excel sheet :

$excelArray = New-Object 'object[,]' $dt_table1.Rows.Count, $dt_table1.Columns.Count

$excelArray = ForEach($Row in $dt1.Tables[0].Rows){
            $Record = New-Object PSObject
            ForEach($Col in $dt1.Tables[0].Columns.ColumnName){
                Add-Member -InputObject $Record -NotePropertyName $Col -NotePropertyValue $Row.$Col
            }
            $Record
        }

But now, the next line fails:

$range = $WorkSheet.Range('A1', ([char](64 + $dt_table1.Columns.Count)).ToString() + ($dt_table1.Rows.Count).ToString() )
$range.Value2 = $excelArray
Giuseppe Lolli
  • 167
  • 4
  • 15
  • Excel is slow for this kind of thing, but that sounds too slow. It will be faster from VBA. If it's an SQL query, Excel can do that directly – Charlieface Jan 13 '21 at 17:19
  • I have no example right now and my last tests with that COM-Object are a while ago, but it should be possible to add a complete range of cells at once into Excel instead of doing it cell by cell. Maybe that helps to avoid the overhead. – swbbl Jan 13 '21 at 17:39
  • PS: One important note/question: Which "Windows Management Framework" is installed (PS version)? Should be at least 5.1. Some lower versions had really poor performance when it comes to COM-Objects. – swbbl Jan 13 '21 at 17:54
  • is 5.1.. the idea to spool the entire data table into the sheet is not bad but i cannot find any way to do it.... – Giuseppe Lolli Jan 14 '21 at 08:58
  • This may help: https://stackoverflow.com/questions/60111760/insert-data-from-a-dataset-to-an-excel-table-in-a-worksheet-using-powershell – Alex Jan 14 '21 at 10:50
  • I got the information into an array but i cannot fill the Excel with that array – Giuseppe Lolli Jan 14 '21 at 13:32
  • You don't need Excel to create a real `xlsx` file. And interop is *definitely* slow. Each call is a cross-process call, and this code is making a cross-process for every single cell. – Panagiotis Kanavos Jan 14 '21 at 14:06
  • Try the [Powershell Excel Module](https://devblogs.microsoft.com/scripting/introducing-the-powershell-excel-module-2/) to export data to `xlsx` files withoutt using Excel interop. Each interop call is an expensive cross-process call. The question's code is making a separate call for every single cell, which is extremely slow. – Panagiotis Kanavos Jan 14 '21 at 14:09

2 Answers2

0

Shamelessly using some of f6a4's answer, I think this could work:

# 3. Filling Excel file :

# convert to array of objects
$tableData = $dt_table1 | Select-Object * -ExcludeProperty ItemArray, Table, RowError, RowState, HasErrors

# select the range in the worksheet
$endRange = '{0}{1}' -f ([char](64 + $dt_table1.Columns.Count)), $dt_table1.Rows.Count
$range = $Workbook.ActiveSheet.Range('A1', $endRange)

# copy the array to excel range
$range.Value2 = $tableData
Theo
  • 57,719
  • 8
  • 24
  • 41
  • That's better than the original code but not good. There's no need for Excel to produce an `xlsx` file. It's faster than the original code because it uses only one cross-process call to copy the data, but it still needs Excel to be installed, and *closed* in the end. – Panagiotis Kanavos Jan 14 '21 at 14:15
  • Error HRESULT: 0x800A03EC when executing $range.Value2 = $tableData – Giuseppe Lolli Jan 14 '21 at 14:45
  • @GiuseppeLolli that's a very generic error. Maybe one of the answers [here](https://stackoverflow.com/questions/891394/excel-error-hresult-0x800a03ec-while-trying-to-get-range-with-cells-name) can help you with that. – Theo Jan 14 '21 at 15:14
  • @PanagiotisKanavos I agree I did not add code to open an excel file, saving the result and close/release the COM objects, but the question is about filling a worksheet range using a data table. For brevity, also the OP left that out aswell. – Theo Jan 14 '21 at 15:20
  • That's not the point. An exception handler is required to ensure an error won't leave Excel open. It's easy to end up with multiple ghost Excel instances eating up RAM. And even then, Excel has to be installed *and licensed*. These problems go away if the Excel file is generated directly. Servers don't use Excel Interorp to produce Excel files for all those reasons (and the RAM used by Excel. can't be spared) – Panagiotis Kanavos Jan 14 '21 at 15:22
  • @PanagiotisKanavos Sure, but again.. That is **not** the question here. – Theo Jan 14 '21 at 15:25
0

Ok, now i found solution.

$excelArray | ConvertTo-CSV -NoType -Del "`t" | Select -Skip 1 | Clip
[void]$WorkSheet.columns.Item(1).cells.Item(2).PasteSpecial()

From 15 min to 2 min. This really helps me a lot, i have to produce like 500+ Excel files.

Giuseppe Lolli
  • 167
  • 4
  • 15