-1

I have an array which stores it's values in a sorted list. I have been using this sorted list to organise data, by date in several other spreadsheets.

My source data is a series of 12 worksheets in one workbook. Each worksheet reflecting a single calendar month. Number of transactions/runs is dynamic--averages 60 or so a month so I set a limit to my loop of 200 as this should be more than enough to cover any growth in business.

My current set of data is such that I have several repeat deliveries (different cargo/weights et al. but the same delivery location). I want to consolidate those "repeat"/similar rows into a single entry in the list, sum the number of pieces delivered, weight and delivery cost, and to increment a counter to show the number of repeated deliveries to that respective site.

Example: January, 2016
Delivered from:    Delivered to:    No. Pieces:    Weight:    Cost:
Site A             Site B           10             100        $120.00
Site A             Site C           5              20         $80.00
Site B             Site C           2              30         $45.00
Site A             Site C           20             460        $375.00

Summary:
Delivered to:    No. of Deliveries:    No. Pieces:    Weight:    Cost:
Site B           1                     10             100        $120.00
Site C           3                     27             510        $500.00

I can think of ways to do this by dumping data to a "scrap" worksheet, however, I want a VBA solution which is "internal" so that no such "scratch pad" is required.

The number of deliveries, in total, is dynamic. The number of repeat deliveries, for any given location, is also dynamic.

I am finding it very difficult to compose an efficient way to consolidate the information in my list with the above parameters as I am still very new to VBA/Excel.

Any suggestions are appreciated, in particular if you have example code--I know what I want, I'm just not sure how to implement it in VBA.

A sample of my array loading and transfer to the list is shown below (with variable definitions et al. omitted).

    Set List = CreateObject("System.Collections.SortedList")

    'Grab Monthly Data by Route
    For Each ws In Worksheets
        If ws.Name <> "Summary" Then

            Call DeleteHidden 'Delete Hidden Rows/Columns in the active worksheet if any

     With ws
            'loop through the sheet to 207 (~3x greatest number of deliveries)
            For RowCount = 7 To 207
                'Check for dates for each row (Month/Day/Year)
                d = DateValue(.Cells(RowCount, 1))

                If List.Containskey(d) Then
                    arTemp = List(d)
                Else
                    ReDim arTemp(12)
                End If

                'Monthly Totals
                arTemp(0) = arTemp(0) + .Cells(RowCount, 1) 'Grab Entry Date/Time
                arTemp(1) = arTemp(1) + .Cells(RowCount, 2) 'Grab Delivery Date/Time
                arTemp(2) = arTemp(2) + .Cells(RowCount, 3) 'Grab PU Location
                arTemp(3) = arTemp(3) + .Cells(RowCount, 4) 'Grab PU Street
                arTemp(4) = arTemp(4) + .Cells(RowCount, 5) 'Grab PU City/Province/PC
                arTemp(5) = arTemp(5) + .Cells(RowCount, 6) 'Grab Del Location
                arTemp(6) = arTemp(6) + .Cells(RowCount, 7) 'Grab Del Street
                arTemp(7) = arTemp(7) + .Cells(RowCount, 8) 'Grab Del City/Province/PC
                arTemp(8) = arTemp(8) + .Cells(RowCount, 9) 'Grab No. Pieces
                arTemp(9) = arTemp(9) + .Cells(RowCount, 10) 'Grab Cargo Weight (LBS)
                arTemp(10) = arTemp(10) + .Cells(RowCount, 11) 'Grab Cost 
                'potential add point of a sort and consolidate function if working with the array prior to data being added to the list (but then such would run for each record of each worksheet---seems too inefficient)
                arTemp(12) = arTemp(12) + 1
                List(d) = arTemp
    Next RowCount
            Call QuickSort(arTemp, 0, RowCount - 1) 'Sort the Monthly Array at the end of the Month (can manipulate the array but the list is already loaded..how to manipulate/consolidate the list???)
        End With
    End If
Next
WarOrdos
  • 55
  • 9
  • What is the purpose of the QuickSort? You are only sorting the value of the last key. The SortedList automatically sorts the keys. –  Nov 01 '16 at 03:34
  • Are you just trying to create the summary? If so, then why are there more fields in your array then in summary? Why are you concatenating address fields `arTemp(7) = arTemp(7) + .Cells(RowCount, 8) 'Grab Del City/Province/PC` –  Nov 01 '16 at 03:56
  • Do you want the summary for each worksheet\month or for the entire year\12 worksheets? – EEM Nov 01 '16 at 04:07
  • The code above was a sample of what I was already doing. 12 worksheets 1 month per worksheet. The arTemp's listed reflect the layout of the worksheet columns (each entry-row) has those fields. The quicksort was more of a place holder, though I do have a working quicksort for other applications. it seemed the logical place to have the "compare/consolidate" section of code. (I tend to think in array terms and you'd sort an array prior to consolidating any entries in it etc.) I want a summary of each worksheet (summary for each month) sorted by date, but with like destinations consolidated. – WarOrdos Nov 01 '16 at 09:46
  • My code consolidates the whole year. I'll try to modify it, sometime today. –  Nov 01 '16 at 10:05
  • I plan to expand to have both annual and monthly consolidation. However, at the moment monthly is what was more "necessary" and I figured I could "consolidate" the monthly figures, which are output to the summary page at that point (though doing both on the fly would be nice and I'll see about coding that). – WarOrdos Nov 02 '16 at 01:02

3 Answers3

3

Using ADO, it is possible to treat an Excel workbook as a database, and issue SQL statements against it.

(I had trouble with periods in the field name, so I changed No. Pieces to Number of Pieces in the original data. Thanks @ThomasInzina.)

SELECT [Delivered to:], 
    COUNT(*) AS NumberOfDeliveries, 
    SUM([Number of Pieces:]) AS NumberOfPieces,
    SUM([Weight:]) AS SumOfWeight,
    SUM([Cost:]) AS SumOfCost
FROM [January, 2016$]
GROUP BY [Delivered to:]

The first step would be to get a list of worksheet names, using the ADO connection.

Then you can iterate over the names and issue the SQL statement. Data comes back as a Recordset object, which can easily be pasted into an Excel worksheet using the CopyRecordset method.

If the output would be to a different workbook, then it would be possible to keep the output workbook open during the whole of the For Each, continuously create new worksheets for each month, and call CopyFromRecordset at each iteration of the For Each. However, when accessing the same workbook via Automation and the ADO connection simultaneously, CopyFromRecordset seemed to do nothing.

Therefore, we're using disconnected recordsets for each worksheet — that store all the data in memory even after the collection is closed; and holding references to them using a Scripting.Dictionary, where each key is the final worksheet name, and the value is the disconnected recordset.

This means that all the final data is stored in memory, which could conceivably be an issue. A possible workaround would be to create a new output workbook to hold the pasted recordset data, and when all the iterations are finished and the connection is closed, to paste the worksheets from the output workbook into the original workbook and delete the output workbook. However, you've indicated in the question that you don't want to do this.

Add references (Tools -> References ...) to Microsoft ActiveX Data Objects (choose the latest version; it's usually 6.1), and Microsoft Scripting Runtime.

Dim pathToWorkbook As String
pathToWorkbook = "C:\path\to\workbook.xlsx"

Dim conn As New ADODB.Connection
Dim schema As ADODB.Recordset
Dim sheetname As Variant
Dim sql As String
Dim rs As ADODB.Recordset
Dim dict As New Scripting.Dictionary

With conn
    .Provider = "Microsoft.ACE.OLEDB.12.0"
    .ConnectionString = "Data Source=""" & pathToWorkbook & """;" & _
        "Extended Properties=""Excel 12.0;HDR=Yes"""
    .Open

    Set schema = .OpenSchema(adSchemaTables)

    For Each sheetname In schema.GetRows(, , "TABLE_NAME") 'returns a 2D array of one column
        If Not sheetname Like "*(Summary)*" Then
            sql = _
                "SELECT [Delivered to:], " & _
                    "COUNT(*) AS NumberOfDeliveries, " & _
                    "SUM([Number Of Pieces:]) AS SumNumberOfPieces, " & _
                    "SUM([Weight:]) AS SumOfWeight, " & _
                    "SUM([Cost:]) AS SumOfCost " & _
                "FROM [" & sheetname & "] " & _
                "GROUP BY [Delivered to:]"

            Set rs = New ADODB.Recordset
            rs.CursorLocation = adUseClient 'This defines a disconnected recordset
            rs.Open sql, conn, adOpenStatic, adLockBatchOptimistic 'Disconnected recordsets require these options
            Set rs.ActiveConnection = Nothing 'Recordset disconnected

            sheetname = Mid(sheetname, 2, Len(sheetname) - 3)
            dict.Add sheetname & " (Summary)", rs
        End If
    Next
    .Close
End With

Dim xlApp As New Excel.Application
xlApp.Visible = True
xlApp.UserControl = True
Dim wkbk As Excel.Workbook
Dim wks As Excel.Worksheet
Dim key As Variant
Set wkbk = xlApp.Workbooks.Open(pathToWorkbook)
For Each key In dict.Keys
    Set wks = wkbk.Sheets.Add
    wks.Name = key
    wks.Range("A1").CopyFromRecordset dict(key)
Next

Links:

MSDN:

Other:

Graham
  • 7,431
  • 18
  • 59
  • 84
Zev Spitz
  • 13,950
  • 6
  • 64
  • 136
  • @ Zev: Thank-you for your suggestion. I've started taking a look at the code put up by Thomas, but will look through your submission as well (it's been a 14hr day so that will have to wait until tomorrow) – WarOrdos Nov 02 '16 at 00:58
  • @WarOrdos I've rewritten this answer using tested code; there were a number of bugs as pointed out by [Thomas Inzina](http://stackoverflow.com/users/6432984/thomas-inzina). I've also added a number of links. – Zev Spitz Nov 03 '16 at 19:02
  • @ZevSpitz Super Answer!! –  Nov 05 '16 at 03:39
1

I added a month column in the summary.

Sub Summary()
    Dim ws As Worksheet
    Dim iMonth As Integer, x As Long, x1 As Long
    Dim Data, key
    Dim list(1 To 12) As Object

    For x = 1 To 12
        Set list(x) = CreateObject("System.Collections.SortedList")
    Next

    For Each ws In Worksheets
        If ws.Name <> "Summary" Then
           Call DeleteHidden    'Delete Hidden Rows/Columns in the active worksheet if any
            With ws

                For x = 1 To 207
                    If IsDate(.Cells(x, 1)) Then
                        iMonth = Month(.Cells(x, 1))
                        key = .Cells(x, 6)    'Grab Del Location

                        If list(iMonth).ContainsKey(key) Then
                            Data = list(iMonth)(key)
                        Else
                            ReDim Data(5)
                            Data(0) = iMonth
                            Data(1) = .Cells(x, 6)    'Grab Del Location
                        End If

                        Data(2) = Data(2) + 1
                        Data(3) = Data(3) + .Cells(x, 9)    'Grab No. Pieces
                        Data(4) = Data(4) + .Cells(x, 10)    'Grab Cargo Weight (LBS)
                        Data(5) = Data(5) + .Cells(x, 11)    'Grab Cost

                        list(iMonth)(key) = Data
                    End If
                Next
            End With
        End If
    Next

    With Worksheets("Summary")
        For x = 1 To 12
            For x1 = 0 To list(x).Count - 1
                .Range("A" & .Rows.Count).End(xlUp).Offset(1).Resize(1, 6).Value = list(x).GetByIndex(x1)
            Next
        Next
    End With
End Sub
  • I've been called into work so I can't properly review your code suggestion at this time. I will look it over this evening in detail. However, where you have Trim(key)....where is "trim" itself defined, or is it a built in function of vba? The line: key = .cells(x,6) sets the value of key with the del location. if Trim(Key) <> vbNullString (checks to see the field/cell is not blank) if list.containskey(key) (checks to see if the list has that key at that point) if the list has key then load up Data, else grab the next location and store info to data then to the list. What is Trim? – WarOrdos Nov 01 '16 at 09:53
  • Thank-you for your code. I corrected a typo @ Data(3)=Data(4) to be Data(4)=Data(4)... – WarOrdos Nov 02 '16 at 00:28
  • I have another issue with implementing your code. Your code works fine for parsing through the first worksheet. Consolidation and ordering are fine as are the tallies. However, Month never increments--it always stays as 1. It does not move to subsequent worksheets but rather loops through the first worksheet 12 times. I will look at this further (I just finished a 14hr shift), but if you have suggestions in the in term they would be appreciated. – WarOrdos Nov 02 '16 at 00:57
  • The month is derived from the **Entry Date/Time** `iMonth = Month(.Cells(x, 1))`. If the **Entry Date/Time** is not consistent with he way you want your data groped; you may want to replace the array of dictionaries with a dictionary of dictionaries. You could use the worksheet names as keys for master dictionary. –  Nov 02 '16 at 05:47
  • I've looked through the worksheets and the date format is consistent. I'll look in greater detail this afternoon. – WarOrdos Nov 02 '16 at 11:19
  • I found the issue and it was not related to your code. Dates were fine and consistent, however, there was a hidden column on some of the sheets, which I overlooked when I purged the first set of worksheets, and thus the dates were offset. I'm still testing the other SQL suggestions but I have accepted your answer as your implementation makes the most sense to me "out of the box". Thank-you for your input. – WarOrdos Nov 02 '16 at 21:15
  • @WarOrdos _your implementation makes the most sense to me "out of the box"_ -- This is an **imperative** solution -- you're telling the computer what to do at each step of the way. This is harder to read -- you have to understand all the parts of the code in order to understand the desired end result -- and correspondingly harder to modify. SQL is **declarative** -- you are describing the shape of the data, and the underlying engine will figure out how to get that data shape. Understanding the end-shape is simpler, as is changing that end-shape. – Zev Spitz Nov 02 '16 at 22:23
  • @WarOrdos See [here](http://stackoverflow.com/questions/1784664/what-is-the-difference-between-declarative-and-imperative-programming) for a similar discussion regarding LINQ in C# and other .NET languages. – Zev Spitz Nov 02 '16 at 22:24
  • @ Zev: I very much appreciate your differentiation of the two formats (imperative coding vs. the declarative of SQL). I will be sure to read the provided link in detail. (I have also up voted the SQL submission). – WarOrdos Nov 02 '16 at 23:20
  • @ZevSpitz I up-voted your answer before I posted my own. I like your approach but it lacks implementation. –  Nov 02 '16 at 23:38
  • What implementation is lacking? It appears to be a complete solution. The only issue is that it's appending to multiple sheets and I'm not sure that is what the OP wants. – Zev Spitz Nov 02 '16 at 23:40
  • "It appears to be a complete solution. The only issue is that it's appending to multiple sheets "? It is appending to multiple sheets each of which will throw an error when queried, `[No. Pieces]` will throw an error (the period isn't allowed), `.CopyRecordset` should be `.CopyFromRecordset` and `CopyRecordset(rs)` will throw an error because of the parenthesis. –  Nov 02 '16 at 23:55
  • @Thomas: How does one use multiple keys, or "dictionary of dictionaries"? What I've gone in and tried to do, at this point, is have two keys running, the delivery name and the delivery address (having realised that the people who setup the source data used the same name for multiple addresses (ie. FedEx @ 123 nowhere, FedEx @ 321 somewhere street etc.). So I've since setup the code to have as the primary key the delivery address. (works) This in turn has the sort being by the address (with respect to output). I would like to include the delivery name and thus sort by both name and address. – WarOrdos Nov 03 '16 at 00:53
  • @Thomas: Continued: I have been getting type mismatch as a compile error whenever I try to use more than one key at a time. the key definition works fine etc., but trying to have both keys active does not: I get a type mismatch on this line: list(iMonth)(key)(key2) = Data – WarOrdos Nov 03 '16 at 01:03
  • You are getting the error because one of the objects in the chain isn't instantiated. When creating a dictionary of dictionaries you need to instantiate a temp dictionary and add that to the parent dictionary. You may consider using a composite key instead. For example `key = DeliveryName & "|" & DeliveryAddress : list(iMonth)(key) = Data` should return the same results as `list(iMonth)(DeliveryName)(DeliveryAddress) = Data` without the extra layer of complexity. Note: I use the pipe as a delimiter between DeliveryName and DeliveryAddress to ensure that there are no naming conflicts. –  Nov 03 '16 at 01:27
  • @ThomasInzina I think you meant to say _the implementation is lacking_ -- the implementation has problems -- instead of _it lacks implementation_ -- it has no implementation at all. In any case, I've fixed and tested the code. – Zev Spitz Nov 03 '16 at 19:00
1

Here is shorter lazier version that will aggregate the example data into a 2D array, but it assumes that A6:E6 has the same header names as in your example:

Dim arr(), rs As Object: Set rs = CreateObject("ADODB.Recordset")

rs.Open "Select [Delivered to:], Count(*), Sum([No# Pieces:]), " & _
    "Sum([Weight:]), Format(Sum([Cost:]),'$0.00') " & _
    "From ( SELECT * From [January$A6:E207] Union All " & _
    "       SELECT * From [February$A6:E207] ) " & _
    "Where [Delivered to:] > ''  Group By [Delivered to:]", _
    "Provider=MSDASQL;DSN=Excel Files;DBQ=" & ThisWorkbook.FullName

If Not rs.EOF Then arr = rs.GetRows ': For Each i In arr: Debug.Print i & " ";: Next
rs.Close: Set rs = Nothing

If there are no header cells, this alternative version needs the ACE Provider to be installed (comes with Access 2007 and above, or can be downloaded and installed separately)

rs.Open "Select F2, Count(*), Sum(F3), Sum(F4), Format(Sum(F5),'Currency') " & _
    "From ( SELECT * From [January$A6:E207] Union All " & _
    "       SELECT * From [February$A6:E207]          )  Where F2 > ''  Group By F2", _
    "Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties='Excel 12.0;HDR=No';Data Source=" & ThisWorkbook.FullName ' ODBC Provider in case no ACE Provider
Slai
  • 22,144
  • 5
  • 45
  • 53
  • I will review your suggestion as well, after some rest. I've never treated Excel as a database so the concept is new to me and will require some additional reading. – WarOrdos Nov 02 '16 at 00:59