2

Wondering if it's possible in Excel M365 to sum part of a multidimensional dynamic array, with criteria using a non-looping solution. I am looking to preserve rows of the 'DataTable' dynamic array, and only sum columns for category A and B.

Thanks in advance!

Input:

enter image description here

Desired solution:

enter image description here

I've done this in VBA using a looping solution, but would prefer to build logic into Excel dynamic arrays.

Blendr
  • 33
  • 4

2 Answers2

2
=HSTACK(BYROW(F8#,LAMBDA(x_,SUMIFS(x_,F4#,"A"))),BYROW(F8#,LAMBDA(x_,SUMIFS(x_,F4#,"B"))))

A, B - row dim preserved


not sure what you mean by 'non-looping' in terms of an Excel function - notwithstanding this does the trick with the 365 Office compatible version per your Q.

PS - assumes col headers F4:H4 and matrix F8:H10 are (in keeping with the 'dynamic' notion) series and also assumes that there are no obstructive adjacent cells that might result in a "#Spill!' error).

For another recent solution of mine that applies byrow to express the filter function as a series see here.

JB-007
  • 2,156
  • 1
  • 6
  • 22
1

The following approach works for an arbitrary number of columns and letters, that could be repeated. It spills the entire result, no need to drag the formula down:

=LET(ltrs,B1:D1, in,B4:D6, ux,UNIQUE(ltrs,1),
 out, DROP(REDUCE("",ux,LAMBDA(ac,a, HSTACK(ac,
 MMULT(in,TOCOL(N(ltrs=a)))))),,1),VSTACK(ux,out))

It uses REDUCE/HSTACK pattern(1). It iterates over each unique letters (ux) and concatenates the result of MMULT horizontally via HSTACK, which calculates the sum of the columns that corresponds to a given letter (a).

Here is the output: output

Other alternatives (more verbose):

Using MAKEARRAY: Usually MAKEARRAY is more efficient than the combination of REDUCE/HSTACK, but I don't think for this case, since the second input argument of FILTER is the same for every row of the same column. I put it here for illustrative purposes, since the first formula is more concise and probably more efficient for a large dataset with a high frequency of repeated letters.

=LET(ltrs,B1:D1, in,B4:D6, ux,UNIQUE(ltrs,1), seq,SEQUENCE(,COLUMNS(ltrs)),
 out, MAKEARRAY(ROWS(in),COLUMNS(ux),
  LAMBDA(r,c,SUM(INDEX(in,r,FILTER(seq, 1-ISNA(XMATCH(ltrs,INDEX(ltrs,c)))))))),
 VSTACK(ux,out))

Using BYROW, CHOOSECOLS and REDUCE/HSTACK pattern(1): It could be more efficient than the first approach, since the sum only considers a subset of the columns that match the letter. The first approach does the multiplication for the entire matrix on each iteration.

=LET(ltrs,B1:D1, in,B4:D6, ux,UNIQUE(ltrs,1), seq,SEQUENCE(,COLUMNS(ltrs)),
 out, DROP(REDUCE("",ux,LAMBDA(ac,a, HSTACK(ac,
  BYROW(CHOOSECOLS(in,FILTER(seq,1-ISNA(XMATCH(ltrs,a)))),
   LAMBDA(x,SUM(x)))))),,1),VSTACK(ux, out))

Here some performance results that confirm what we suspect. For 1000 rows and 100 columns:

MMULT MAKEARRAY BYROW
90ms 890ms 50ms

Testing for larger number of rows the BYROW solution trumps over the other two. If the letters are highly repeated, i.e. the output has a fewer number of columns, then MMULT performs slightly better.

(1): Check my answer to the following question: how to transform a table in Excel from vertical to horizontal but with different length

David Leal
  • 6,373
  • 4
  • 29
  • 56
  • 1
    This is exactly the solution I was after. The analysis of performance is also very helpful! After some initial testing with my larger raw dataset, have also found approach 1 to be slightly faster and more readable. Thanks again! – Blendr May 22 '23 at 13:38
  • Thanks @Blendr good to know. The performance also depends on the specific nature of the data sample. Depending on that you will have more or less columns in the output. – David Leal May 22 '23 at 16:20
  • @DavidLeal great variety of several soln - I only figured the byrow (with these timedifferences - would you expect to see significant / material lag when implementing over 50k rows ? beyond that, you need to have decent RAM [assuming a decent number of cols] to avoid the 'spinning circle of white death' Excel page ☺ – JB-007 May 27 '23 at 17:00
  • @JB-007 all depends on the number of columns rather than the number of rows, I would say. I tested `10K` rows and `100` columns both solutions: `MMULT` and `BYROW` in `300`ms. For `20K` rows, `MMULT` around `500`ms and `BYROW` `700`ms. So I would say for `60K` rows it would be less than 3 seconds. – David Leal May 27 '23 at 23:42