I have a project to compare report outputs between one system and another. I thought using Excel as the output format would enable me to compare the outputs easily. The layout of the reports doesn't lend itself to using CSV files.
Problems:
- Although the data in the outputs may be the same, it may be sorted differently.
- The report could be formatted with numerous data ranges.
I know there are several tools to perform Excel file comparisons, but none of them appear to ignore the sort of the data. They appear to be simple, cell-by-cell comparisons.
So, before I compare all of the cells between the two worksheets, I need to sort the data within each region. Trying to identify all of the regions on a worksheet I ran into a problem. The CurrentRegion
property appears to produce incorrect results.
(Too much back story?)
Using Microsoft Excel 2016 and Microsoft Visual Basic for Applications 7.1.
I'm using code like this:
Debug.Print Range("A1").CurrentRegion.Address
In Microsoft's documentation, the CurrentRegion property is defined as
Returns a Range object that represents the current region. The current region is a range bounded by any combination of blank rows and blank columns. Read-only.
The documentation doesn't specify, but I assume that the CurrentRegion of a cell that is in a blank row or blank column is just that cell.
Given the sample data below, UsedRange correctly returns $A$1:$E$13.
Here is a matrix showing the values I expect and the values I get from CurrentRegion:
╔══════╦════════════╦════════════╗
║ Cell ║ Expected ║ Actual ║
╠══════╬════════════╬════════════╣
║ A1 ║ $A$1 ║ $A$1:$B$1 ║
║ B1 ║ $B$1 ║ $B$1 ║
║ B2 ║ $B$2 ║ $A$1:$C$4 ║
║ B3 ║ $A$3:$C$4 ║ $A$3:$C$4 ║
║ B6 ║ $B$6 ║ $B$6 ║
║ B7 ║ $B$7 ║ $B$7:$E$13 ║
║ C10 ║ $C$7:$E$13 ║ $C$7:$E$13 ║
╚══════╩════════════╩════════════╝
What am I doing wrong? Is there some configuration that I have wrong? Some setting I missed? Or did Microsoft fail to test boundary conditions?
Is there a better way to identify each region on a worksheet?
Is there a better way to compare these files?
╔════╦═══╦════╦══════╦══════╦═════╦═══╗
║ ║ A ║ B ║ C ║ D ║ E ║ F ║
╠════╬═══╬════╬══════╬══════╬═════╬═══╣
║ 1 ║ ║ 1 ║ ║ ║ ║ ║
║ 2 ║ ║ ║ ║ ║ ║ ║
║ 3 ║ 4 ║ 10 ║ ║ ║ ║ ║
║ 4 ║ ║ ║ 2 ║ ║ ║ ║
║ 5 ║ ║ ║ ║ ║ ║ ║
║ 6 ║ ║ ║ ║ ║ ║ ║
║ 7 ║ ║ ║ ║ asdf ║ ║ ║
║ 8 ║ ║ ║ asdf ║ rtuy ║ ║ ║
║ 9 ║ ║ ║ asdf ║ vbnm ║ ║ ║
║ 10 ║ ║ ║ vbnm ║ ║ ║ ║
║ 11 ║ ║ ║ vbnm ║ ║ ║ ║
║ 12 ║ ║ ║ fgjh ║ rtyu ║ jkl ║ ║
║ 13 ║ ║ ║ fghj ║ ║ ║ ║
║ 14 ║ ║ ║ ║ ║ ║ ║
╚════╩═══╩════╩══════╩══════╩═════╩═══╝
To clarify based on Eric Murpy's answer
The layout can be anything. But for any given report, it will match between the source and target systems.
I considered sorting the entire UsedRange. But a report may have two tables next to each other. I would need to sort the two tables independently of each other.
No cell highlighting. I'll be comparing about 20,000 reports, so the final solution will be VBScript, JavaScript, or PowerShell that either uses or reproduces the functionality I'm trying to create in VBA. The output will be text files.
The reports in the different reporting environments will be generating SQL using different SQL generation routines. They should produce the same results, but there's no guarantee the data will be returned from the database server in the same order. This wouldn't be a problem if the output was deliberately sorted in the reports, but that's not a feature all report developers care about.
The specific task I am having trouble with is accurately identifying all of the data regions in each worksheet so I can sort them. (This will be considerably faster than manually editing each report to ensure the sorting is correct.) Then the cell-by-cell comparison between the two spreadsheets will work.