1

I need to find a performant and smart way to redesign the formula or the tables on which it depends (COSTO_DUMMY and GG_TARGET). Can you help me, please? I can add new support tables if needed

COSTO_DUMMY

key cost
AP01100GENNAIO 33.9492
AP01100FEBBRAIO 32.108
AP01100MARZO 27.889
AP01100APRILE 34.7004
AP01100MAGGIO 29.2037
AP01100GIUGNO 33.3176
AP01100LUGLIO 31.6459
AP01100AGOSTO 49.5292
AP01100SETTEMBRE 29.51
AP01100OTTOBRE 31.129
AP01100NOVEMBRE 30.776
AP01100DICEMBRE 34.7

GG_TARGET

key days
0050001643GENNAIO 16.2
0050001643FEBBRAIO 18.4
0050001643MARZO 21.5
0050001643APRILE 16.7
0050001643MAGGIO 20.4
0050001643GIUGNO 17.5
0050001643LUGLIO 18.4
0050001643AGOSTO 7.5
0050001643SETTEMBRE 20.4
0050001643OTTOBRE 19
0050001643NOVEMBRE 19.5
0050001643DICEMBRE 15.8

FACT_TABLE (the structure of this table cannot be modified); I have two column headers in a fixed position: the first one with AP% codes, the second one with the months

AP01100 AP01100 AP01100 AP01100 AP01100 AP01100 AP01100 AP01100 AP01100 AP01100 AP01100 AP01100
GENNAIO FEBBRAIO MARZO APRILE MAGGIO GIUGNO LUGLIO AGOSTO SETTEMBRE OTTOBRE NOVEMBRE DICEMBRE
0050001643 1 4

I need to do this formula for each row of the fact table (to be added next to "DICEMBRE"):
(1 * jan_cost * jan_days + 1 * feb_cost * feb_days + ... + 1 * dec_cost * dec_days) +
(4 * feb_cost * feb_days + 4 * mar_cost * mar_days + ... + 4 * dec_cost * dec_days) +
(0 * mar_cost * mar_days + 0 * apr_cost * apr_days + ... + 0 * dec_cost * dec_days) +
.....
(0 * dec_cost * dec_days)

The meaning is: I want to recruit 1 person from january to december and 4 people from february to december (that's why the "4" from fact table ignores january information)

I started writing the following formula (only for the first person to recruit on january and it's missing a vlookup for gg_Target, anyway I think it's clear the way I am thinking about it... the wrong way) but I really hope there is something smarter:

=vlookup(concat(offset(G27;25-row();0);offset(G27;26-row();0));COSTO_DUMMY;2;FALSE)*G27 +
vlookup(concat(offset(G27;25-RIF.RIGA();1);offset(G27;26-row();1));COSTO_DUMMY;2;FALSE)*G27 ....

In this formula I have the "1" of the fact_table on G27 cell (hence offset(G27;25-row();0) gives me "AP01100" while offset(G27;26-row();0)) gives me "GENNAIO")
Please, note that I want a constant value, so I am not expecting to use ctrl+shift+enter. I have this office version excel version

Franco
  • 87
  • 9
  • How do you do the lookup in the COSTO DUMMY table, you don't have the value `0050001643`? Is it going to by `AP*` value and the month? Thanks! – David Leal Mar 07 '23 at 21:53
  • Hi David, I answered this question under your answer. Thank you! – Franco Mar 08 '23 at 10:00

1 Answers1

1

Assuming no Excel version constraints as per the tags listed in the question. The following is an array formula, that spills the entire result and considers more than one target key, even in the example there is only one value.

Assumptons and comments:

  1. The input data in both input tables are organized in a similar manner (otherwise we would need and additional lookup), i.e, the costs and days for a given key are organized following the same logic. The input data is organized chronologically in ascending order, but it works for descending too.
  2. The rows don't need to match, only the order within the same key for the dates should be the same for both tables.
  3. The information for a given key doesn't need to come one after another.
  4. For any keys there is data for the 12 months.
  5. The AP codes don't have to be the same (even in the sample data are all the same).

All previous assumption are based on the sample input data from the question. Here is the formula that covers a very general case, i.e. more than one target code, different AP codes:

=LET(setA, A2:B13, setB, C2:D13, fcts, F2:R4, loc, "[$-0410]mmmm",
 months, EDATE(1,SEQUENCE(1,12,0)), tMonths, UPPER(TEXT(months, loc)),
 lks, DROP(fcts,2), aps, TAKE(DROP(fcts,,1),1), idx, SEQUENCE(COLUMNS(months)),
 gMonth, LAMBDA(x, XLOOKUP(x, tMonths,months)), SPLIT,LAMBDA(x,LET(
  a,TEXTBEFORE(x,tMonths),b,SUBSTITUTE(x,a,""),HSTACK(a,gMonth(b)))),
 kA, SPLIT(TAKE(setA,,1)), cA, DROP(setA,,1), kB, SPLIT(TAKE(setB,,1)),
 cB, DROP(setB,,1), CALC, LAMBDA(k,ap,m, LET(
  fa, FILTER(cA, (INDEX(kA,,1)=ap) * (INDEX(kA,,2)>=m)),
  fb, FILTER(cB, (INDEX(kB,,1)=k) * (INDEX(kB,,2)>=m)), SUM(fa*fb))),
 BYROW(lks, LAMBDA(lk, LET(k, TAKE(lk,,1), mults, DROP(lk,,1),
  REDUCE(0, idx, LAMBDA(ac,i, LET(ap, INDEX(aps,,i), m, INDEX(months,,i),
   ac + INDEX(mults,,i) * CALC(k,ap,m))))))))

It is a large formula, because it requires several intermediate calculations. It depends on the input ranges for only three names: setA, setB and fcts. The rest of the names, are obtained from them, playing with DROP, TAKE, INDEX, etc, functions.

To do the calculation we filter for months greater or equal than. We convert the input months in text format into Excel dates. The months in date format (months) are generated using EDATE function combined with SEQUENCE. The month is represented as the first date of the month in date format (we use as a reference year 1900). We obtain the corresponding months in text format (tMonths) via TEXT function (no need to take it from the FACT table), using Italian as specific locale (locname). I took the idea from here: Specifying a Language for the TEXT Function.

Now we use the following helper functions we created as user LAMBDA functions:

  1. gMonth(x): Given an input date x in text format returns the corresponding month in date format. Created to make the formula easier to read and for debugging purposes.
  2. SPLIT(x): Given a key x as input. It generates an array of two columns separating the key from the date. The second column returns the corresponding text date in date format.
  3. CALC(k,ap,m): Given a key (k), an AP code (ap) and a month (m in date format), does the multiplication for cost and days only for months greater or equal than m.

Now we have all we need. We iterate over all rows (lks) from from the FACT table. Each row (lk) contains the key code (k) and the multipliers (mults). We extract this information via DROP and TAKE functions. Now we need to do the total sum for each row. We use REDUCE for that to iterate over the index positions (idx). For each index (i), we obtain the AP code (ap) and the month (m), so we can invoke the CALC function on each iteration and accumulate the result (ac):

ac + INDEX(mults,,i) * CALC(k,ap,m)

Here is the output in S3 cell: output

As per OP additional information provided, TAKE and user LAMBDA function are not available for his Excel version. Here a solution that doesn't use such functions:

=LET(setA, A2:B13, setB, C2:D13, fcts, F2:R4, loc, "[$-0410]mmmm",
 months, EDATE(1,SEQUENCE(1,12,0)), tMonths, UPPER(TEXT(months, loc)),
 lks, DROP(fcts,2), aps, INDEX(DROP(fcts,,1),1,0),idx, SEQUENCE(COLUMNS(months)),
 cA, DROP(setA,,1), cB, DROP(setB,,1),
 kA, LET(a,TEXTBEFORE(INDEX(setA,,1),tMonths),b,
  SUBSTITUTE(INDEX(setA,,1),a,""),HSTACK(a,XLOOKUP(b, tMonths,months))),
 kB, LET(a,TEXTBEFORE(INDEX(setB,,1),tMonths),b,
  SUBSTITUTE(INDEX(setB,,1),a,""),HSTACK(a,XLOOKUP(b, tMonths,months))),
 BYROW(lks, LAMBDA(lk, LET(k, INDEX(lk,,1), mults, DROP(lk,,1),
  REDUCE(0, idx, LAMBDA(ac,i, LET(ap, INDEX(aps,,i), m, INDEX(months,,i),
   fa, FILTER(cA, (INDEX(kA,,1)=ap) * (INDEX(kA,,2)>=m)),
   fb, FILTER(cB, (INDEX(kB,,1)=k) * (INDEX(kB,,2)>=m)),
   ac + INDEX(mults,,i) * SUM(fa*fb))))))))
David Leal
  • 6,373
  • 4
  • 29
  • 56
  • 1
    Hi David, I really appreciate the effort, and I hope it is going to work as it is, even because I have never seen a formula like that. Anyway, the result i want to obtain should be linked to the row (in your screen in the cell S4, not S3) since 0050001643 is variable (i.e. I could have another row with, for example, 0050001642 and consequently other 12 rows in gg_target tables and another total_cost for the people to me recruited in that cost center: 00500% codes are cost center). Did your answer take in account that? Anyway thank you very much – Franco Mar 08 '23 at 09:31
  • The costo_dummy table is the same for every cost center; and yes I can have multiple AP% codes (the fact table actually has more columns: I have also AP01200 and AP01300, so the number of columns is 12*3), but for simplicity I just wrote one code). Thank you again – Franco Mar 08 '23 at 09:57
  • Note2: Probably I wasn't clear, thanks to my bad english, anyway the Note at the end of my question should have been written in this way "Please, note that I want a constant value **for each row of the fact table**, so I am not expecting to use ctrl+shift+enter – Franco Mar 08 '23 at 10:04
  • EDIT: i do have office365 but the TAKE function is missing. – Franco Mar 08 '23 at 10:31
  • 1
    I am going to put green mark even if I cannot test the formula. Anyway I think it's a good starting point for my next studies: let, take, drop, lambda function.... Really really thank you – Franco Mar 08 '23 at 10:46
  • @Franco I will take a look to your comments. If you don’t have `TAKE` you can use `INDEX` instead. – David Leal Mar 08 '23 at 12:50
  • 1
    @Franco, yes it is an array formula, you don't need to expand it down, it works also if you add more rows in the FACT table. Yes, I would say it works if you have more than 12 columns because it iterates over all columns, then finds the information for a given AP and mont. Is `TAKE` the only function you don't have available? If that is the case, it is a quick fix, replacing it with `INDEX`. I would need to know which additional functions you don't have and maybe we can make it works. – David Leal Mar 08 '23 at 14:13
  • 1
    Hi David, you are very kind. It seems I don't have also the CALC function. Actually, i have the italian version so it could be that I have it but I don't know the translation: anyway, after a google search, I didn't find anything (I think it is the correspondent of EVAL/EXEC in some other programming language but even searching in google something like "eval/exec excel italiano" I can't get any clue). Again, thank you! – Franco Mar 09 '23 at 09:48
  • 1
    @Franco such functions are not excel out of the box functions, they were defined by me, I explained in my answer. It helps readability of the formula, and also to repeat the same formulation more than once, for example `SPLIT`. Another alternative is to define them in the Name Manager and then use them. Please try to verify which function you don't have, just create a sample file and test with a simple toy sample, you can take it from Microsoft documentation for that function, so we can list the function you don't have, then I will try to help you to adapt the formula. – David Leal Mar 09 '23 at 13:55
  • 1
    Pardon, I forgot that. Thank you. I'll let you know! – Franco Mar 09 '23 at 13:58
  • 1
    Check at least: `SEQUENCE`, `DROP`, `FILTER`, `BYROW`. We know that you don't have `TAKE`. We can find some workaround for some of them. Please provide this information. By the way you need to educate yourself on how excel works. What you asked in your question requires some advance knowledge and it takes time. I don't want to work in a formula, without knowing the functions you are available, unfortunately Microsoft hasn't documented the functions available for your version. – David Leal Mar 09 '23 at 13:59
  • 1
    Yes, I have all that functions – Franco Mar 09 '23 at 14:03
  • Ok, then the only problem is `TAKE` and the user `LAMBDA` function I created, let me work on it. – David Leal Mar 09 '23 at 14:03
  • @Franco I added at the end of my answer a solution that doesn't use user LAMBDA function and TAKE. Please check. Thanks – David Leal Mar 09 '23 at 14:25