8

I am looping through Sheet 1 of an Excel file which contains 3 columns with 100s of rows of data (strings) and comparing each cell in a row against rows of combinations in Sheet 2.

Checking should start using Sheet 1 row by row, seeing if each cell's value in the row matches anywhere what is in Sheet 2, row by row. If a check fails, then further checks on that row should cease and the next row to be checked commences. Cells in Sheet 1 that couldn't be matched should be marked red.

My code below is close to what I need, but throws an error if there are 2 or more cells in a row of Sheet 1 (E.g. Row 1: B2 and B3) which don't match anything in any row of Sheet 2.

Error:

(node:9040) UnhandledPromiseRejectionWarning: Error: Invalid Address: Aundefined
    at Object.validateAddress (C:\node_modules\exceljs\dist\es5\utils\col-cache.js:86:13)
    at new module.exports (C:\node_modules\exceljs\dist\es5\doc\cell.js:29:12)
    at module.exports.getCellEx (C:\node_modules\exceljs\dist\es5\doc\row.js:55:14)
    at module.exports.getCell (C:\node_modules\exceljs\dist\es5\doc\row.js:72:41)
    at C:\so.js:56:61
    at C:\node_modules\exceljs\dist\es5\doc\worksheet.js:370:11
    at Array.forEach (<anonymous>)
    at module.exports.eachRow (C:\node_modules\exceljs\dist\es5\doc\worksheet.js:368:18)
    at C:\so.js:16:19
    at <anonymous>
(node:9040) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:9040) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Example data:

Sheet 1:

| COL A | COL B  | COL C  |
|-------|--------|--------|
| bob   | one    | silver |
| bob   | eleven | blue   |
| bob   | eleven | red    |
| bob   | eleven | red    |
| bob   | one    | red    |
| bob   | eight  | red    |
| bob   | eight  | red    |
| bob   | eight  | red    |
| terry | seven  | yellow |
| terry | seven  | yellow |
| terry | seven  | gold   |

Sheet 2:

| COL A | COL B  | COL C  |
|-------|--------|--------|
| bob   | eleven | blue   |
| bob   | eleven | red    |
| bob   | eight  | red    |
| terry | seven  | yellow |
| terry | seven  | orange |

According to the example data, there should be three cells (B1, B5 and C11) marked red in Sheet 1 of new.xlsx. E.g.

enter image description here

This is a scenarios example PDF of how the checking should take place:

enter image description here

Code:

// Import the library
var Excel = require('exceljs'),
    moment = require('moment'),
    // Define Excel filename
    ExcelFile = 'so.xlsx',
    // Read from the file
    workbook = new Excel.Workbook();

workbook.xlsx.readFile(ExcelFile)
    .then(function()
    {
        // Use workbook
        var dataSheet = workbook.getWorksheet('Sheet 1'),
            masterSheet = workbook.getWorksheet('Sheet 2');

        dataSheet.eachRow({ includeEmpty: false }, function(dataRow, dataRowNumber)
        {
            var dataRowCells =
                {
                    dataCell1: dataRow.getCell('A'),
                    dataCell2: dataRow.getCell('B'),
                    dataCell3: dataRow.getCell('C')
                },
                isdataRowOK = false,
                oneOfBestMasterRowNumber,
                cellNames = ['A','B','C'];

            masterSheet.eachRow({ includeEmpty: false }, function(masterRow, masterRowNumber)
            {
                if(!isdataRowOK)
                {
                    var numberOfGoodCellsInRow = 0;

                    for(var i = 1; i < 4; i++)
                        if(dataRowCells['dataCell' + i].value === masterRow.getCell(cellNames[i-1]).value)
                            numberOfGoodCellsInRow++;

                    if(numberOfGoodCellsInRow == 2)
                        oneOfBestMasterRowNumber = masterRowNumber;

                    if(numberOfGoodCellsInRow == 3)
                        isdataRowOK = true
                }
            });


            if(!isdataRowOK)
            {
                var masterRowForCheck = masterSheet.getRow(oneOfBestMasterRowNumber);

                for(var i = 1; i < 4; i++)
                {
                    var dataCell = dataRowCells['dataCell' + i];
                    if(dataCell.value !== masterRowForCheck.getCell(cellNames[i-1]).value)
                    {
                        // Mark this failed cell as color red
                        dataCell.style = Object.create(dataCell.style); // Shallow-clone the style, break references
                        dataCell.fill = {type: 'pattern', pattern:'solid', fgColor:{argb:'FA8072'}}; // Set background
                    }
                }

            }
        });

        return workbook.xlsx.writeFile('new.xlsx');
    });
dom_ahdigital
  • 1,651
  • 18
  • 37
  • Why you not handle rejection in promise? Something like `workbook.xlsx.readFile(ExcelFile) .then(function(){....}) .catch(function(e){console.log(e.stack);})`? – Vasyl Moskalov Dec 17 '18 at 06:40

1 Answers1

6

The source for this error was the second call from line:

if(!isdataRowOK)

On this place the variable oneOfBestMasterRowNumber should be not undefined if 3 cells in row are bad because after it if getRow('undefined') then we have an error. And because of it we check it now before like follows and write for it the value 1 (1. Row number) because it does not matter in this case.

if(oneOfBestMasterRowNumber == void 0) //void 0 - undefined
    oneOfBestMasterRowNumber = 1;

And one place more we should change: instead of code lines:

if(numberOfGoodCellsInRow == 2)
    oneOfBestMasterRowNumber = masterRowNumber;

We should write the lines:

if(numberOfGoodCellsInRow > numberOfGoodCellsInBestRow)
{
    numberOfGoodCellsInBestRow = numberOfGoodCellsInRow;
    oneOfBestMasterRowNumber = masterRowNumber;
}

Because not only one cell could be bad in the row. We have also the new variable numberOfGoodCellsInBestRow also now.

I think it is better if two or three cells in the row are marked if they are bad, but if you want only one cell in the row then you can uncomment break; after cell colorizing.

See also my comments in the code below:

// Import the library
var Excel = require('exceljs'),
    moment = require('moment'),
    // Define Excel filename
    ExcelFile = 'so.xlsx',
    // Read from the file
    workbook = new Excel.Workbook();

workbook.xlsx.readFile(ExcelFile)
    .then(function()
    {
        // Use workbook
        var dataSheet = workbook.getWorksheet('Sheet 1'),
            masterSheet = workbook.getWorksheet('Sheet 2');

        dataSheet.eachRow({ includeEmpty: false }, function(dataRow, dataRowNumber)
        {
            var dataRowCells =
                {
                    dataCell1: dataRow.getCell('A'),
                    dataCell2: dataRow.getCell('B'),
                    dataCell3: dataRow.getCell('C')
                },
                isdataRowOK = false,
                oneOfBestMasterRowNumber,
                cellNames = ['A','B','C'],
                numberOfGoodCellsInBestRow = 0;

            masterSheet.eachRow({ includeEmpty: false }, function(masterRow, masterRowNumber)
            {
                if(!isdataRowOK)
                {
                    var numberOfGoodCellsInRow = 0;

                    for(var i = 1; i < 4; i++)
                        if(dataRowCells['dataCell' + i].value === masterRow.getCell(cellNames[i-1]).value)
                            numberOfGoodCellsInRow++;

                    //here we detect if 1 or 2 cells are bad
                    if(numberOfGoodCellsInRow > numberOfGoodCellsInBestRow)
                    {
                        numberOfGoodCellsInBestRow = numberOfGoodCellsInRow;
                        oneOfBestMasterRowNumber = masterRowNumber;
                    }

                    if(numberOfGoodCellsInRow == 3)
                        isdataRowOK = true
                }
            });

            //here was error source: oneOfBestMasterRowNumber
            //should be not undefined if 3 cells in row are bad
            if(oneOfBestMasterRowNumber == void 0)//void 0 - undefined
                oneOfBestMasterRowNumber = 1;

            if(!isdataRowOK)
            {
                //here was error place: if getRow('undefined') then was error
                var masterRowForCheck = masterSheet.getRow(oneOfBestMasterRowNumber);

                for(var i = 1; i < 4; i++)
                {
                    var dataCell = dataRowCells['dataCell' + i];
                    if(dataCell.value !== masterRowForCheck.getCell(cellNames[i-1]).value)
                    {
                        // Mark this failed cell as color red
                        dataCell.style = Object.create(dataCell.style); // Shallow-clone the style, break references
                        dataCell.fill = {type: 'pattern', pattern:'solid', fgColor:{argb:'FA8072'}}; // Set background
                        //break; uncomment this if you want only one cell as color red marked in the row
                    }
                }

            }
        });

        return workbook.xlsx.writeFile('new.xlsx');
    });
}

Example visualisation

I have translated the Node.js code into client JavaScript code to demonstrate it with checking from two different datasets (imitations of Excel tables) in the snippet below :

function checkDataset(obj)
{
    var radios = obj.parentNode.elements['dataset'],
        dataSetNumber,
        i = radios.length;

    for(; i--;)
        if(radios[i].checked)
        {
            dataSetNumber = i;
            break
        }

    var dataSheet = xlsFile['data' + dataSetNumber],
        masterSheet = xlsFile['master' + dataSetNumber];

    dataSheet.forEach(function(dataRow, dataRowIndex)
    {
        var dataRowCells =
            {
                dataCell1: dataRow.A,
                dataCell2: dataRow.B,
                dataCell3: dataRow.C
            },
            isdataRowOK = false,
            oneOfBestMasterRowIndex,
            cellNames = ['A','B','C'],
            numberOfGoodCellsInBestRow = 0;

        masterSheet.forEach(function(masterRow, masterRowIndex)
        {
            if(!isdataRowOK)
            {
                var numberOfGoodCellsInRow = 0;

                for(var i = 1; i < 4; i++)
                    if(dataRowCells['dataCell' + i].value === masterRow[cellNames[i-1]].value)
                        numberOfGoodCellsInRow++;

                //here we detect if 1 or 2 cells are bad
                if(numberOfGoodCellsInRow > numberOfGoodCellsInBestRow)
                {
                    numberOfGoodCellsInBestRow = numberOfGoodCellsInRow;
                    oneOfBestMasterRowIndex = masterRowIndex;
                }

                if(numberOfGoodCellsInRow == 3)
                    isdataRowOK = true
            }
        });

        //oneOfBestMasterRowIndex should be not undefined if 3 cells in row are bad
        if(oneOfBestMasterRowIndex == void 0)//void 0 - undefined
            oneOfBestMasterRowIndex = 0;

        if(!isdataRowOK)
        {
            var masterRowForCheck = masterSheet[oneOfBestMasterRowIndex];

            for(var i = 1; i < 4; i++)
            {
                var dataCell = dataRowCells['dataCell' + i];
                if(dataCell.value !== masterRowForCheck[cellNames[i-1]].value)
                {
                    // Mark this failed cell as color red
                    dataCell.bgColor = 'red';
                    //break; uncomment this if you want only one cell as color red marked in the row
                }
            }

        }
    });

    var table = '<table border="1"><tr style="background:#00a;color:#fff">'
                + '<th>A</th><th>B</th><th>C</th></tr>';

    dataSheet.forEach(function(dataRow, dataRowIndex)
    {
        table += '<tr>';
        for(var i in dataRow)
            table += '<td bgcolor="' + (dataRow[i].bgColor ? 'red' : '')
                        + '">' + dataRow[i].value + '</td>';

        table += '</tr>';
    });
    document.write(table + '</table>');
}

function cells(strValues)
{
    var v = strValues.split('\t');
    return{A: {value: v[0]}, B: {value: v[1]}, C: {value: v[2]}}
}

var xlsFile =
{
    //Dataset 1:
    data0:
    [
        cells('bob  sacsac  sxcsc'),
        cells('sacfbrb  eleven  blue'),
        cells('ascasc   one red'),
        cells('tyjytn   one red'),
        cells('ascsac   one red'),
        cells('terry    elf yellow'),
        cells('terry    seven   elf'),
        cells('terry    elf elf'),
        cells('terry    elf elf'),
        cells('terry    seven   yellow'),
        cells('terry    elf elf'),
        cells('terry    seven   orange'),
        cells('terry    seven   yellow'),
        cells('santa    mary    jane'),
        cells('bob  zero    mauve'),
        cells('bob  one silver'),
        cells('bob  eleven  blue'),
        cells('bob  eleven  red'),
        cells('bob  eleven  red'),
        cells('bob  one red'),
        cells('bob  eight   red'),
        cells('bob  eight   red'),
        cells('bob  eight   red'),
        cells('terry    seven   yellow'),
        cells('terry    seven   yellow'),
        cells('terry    seven   gold')
    ],
    master0:
    [
        cells('bob  eleven  blue'),
        cells('bob  eleven  red'),
        cells('bob  eight   red'),
        cells('terry    seven   yellow'),
        cells('bob  seven   yellow'),
        cells('terry    seven   orange'),
        cells('tiger    one red')
    ],

    //Dataset 2:
    data1:
    [
        cells('bob  one blue'),
        cells('bob  eleven  blue'),
        cells('bob  eleven  red'),
        cells('bob  eleven  red'),
        cells('bob  one red'),
        cells('bob  eight   red'),
        cells('bob  eight   red'),
        cells('bob  eight   red'),
        cells('terry    seven   yellow'),
        cells('terry    seven   yellow'),
        cells('terry    seven   gold')
    ],
    master1:
    [
        cells('bob  eleven  blue'),
        cells('bob  eleven  red'),
        cells('bob  eight   red'),
        cells('terry    seven   yellow'),
        cells('terry    seven   orange')
    ]
};
<form method="post" action="#">
    <p><b>Which dataset do you want check?</b></p>
    <label><input type="radio" name="dataset">Dataset 1</label><br>
    <label><input type="radio" name="dataset" checked>Dataset 2</label><br><br>
    <input type="button" value="Check it" onclick="checkDataset(this)">
</form>
VLAZ
  • 26,331
  • 9
  • 49
  • 67
Bharata
  • 13,509
  • 6
  • 36
  • 50
  • @dom_ahdigital, would you like to upload for me the 'so.xlsx' with all new cases please. But **not** your product version. – Bharata Dec 17 '18 at 15:41
  • This is the one I've been using to test some different scenarios: https://www.dropbox.com/s/3hbewhmca33gaqj/so.xlsx?dl=0 – dom_ahdigital Dec 17 '18 at 16:30