Basically, this problem is similar to calculating the project's total duration (in days). Where in this case each campaign's data can be assimilated as project tasks with start and end dates. The only difference here is that gaps don't contribute to project duration.
Let's illustrate it with the following example, since dates are represented in Excel as a positive integer, to simplify the calculation we can use just numbers as follow (Sample Data 1):

From the graph is easy to identify that the total duration will be 13
days. Notice that the gap in columns K
and P
doesn't count for the total duration.
Based on this graphical representation we can build a logic for calculating the durations as we progress from the top of the Gantt chart to the bottom.
The logic to calculate the duration is based on identifying overlapping duration for each task. The condition to identify two intervals A
,B
overlap is the following:
AND(startA <= endB, endA >= startB)
we are going to iterate over all tasks and on each step update the new start and end dates and update the duration. Let's say we are going to calculate the new start and end dates for two consecutive rows represented by A
and B
intervals. The combined interval will be:
[MIN(startA, startB), MAX(endA, endB)]
If the previous duration is represented by the variable duration
, then the updated duration will be:
IF(endB > endA, duration + (endB - endA), duration)))
i.e, if the next interval (B
) ends after A
, then we need to update the duration with the difference (endB - endA
).
If the intervals don't overlap, then the duration will be updated as follow:
duration = duration + endB - startB + 1
and the next start and end date will be the startB
, endB
for the next calculation.
The above process can now be put in Excel terms as follow:
=LET(set, $A$2:$C$12, campaing, T2, campaings, INDEX(set,,1),
subset, SORT(FILTER(set, campaings=campaing),2),starts, INDEX(subset,,2),
ends, INDEX(subset,,3),intervals, starts&","&ends,
SPLIT, LAMBDA(x, 1*TEXTSPLIT(x,",")), JOIN, LAMBDA(y, TEXTJOIN(",",,y)),
calc, REDUCE("0,0,0", intervals, LAMBDA(acc,i, LET(accValues, SPLIT(acc),
startA, INDEX(accValues,1,1), endA, INDEX(accValues,1,2),
duration, INDEX(accValues,1,3), values, SPLIT(i), startB, INDEX(values,1,1),
endB, INDEX(values,1,2),
IF(duration=0,JOIN(HSTACK(startB, endB, endB-startB+1)),
IF(AND(startA <= endB, endA >= startB),
JOIN(HSTACK(MIN(startA, startB), MAX(endA, endB),
IF(endB > endA, duration + (endB-endA), duration))),
JOIN(HSTACK(startB, endB, duration + endB-startB+1))
))))), INDEX(SPLIT(calc),1,3)
)
and here is the output:

Back to the original input data from the question for both campaigns, the output will be:

Formula Explanation
Now that we know the logic to calculate the total duration, let's explain the excel implementation.
We use LET
function to define the required variables and to avoid repeating the same calculation.
In order to be able to build the logic explained before, we need to sort the input data based on the start date.
For the calculation, we need to use an Excel function that allows using a cumulative calculation based on the previous iteration. We can use it for that REDUCE
or SCAN
. We use REDUCE
because at the end we need a scalar number with the total, not an array, but SCAN
is a good option for debugging purposes. In order to do that you can replace REPLACE
with SCAN
and the following line at the end: INDEX(SPLIT(calc),,3)
with: calc
.
For Sample Data 1 this will be SCAN
output, so you can check in each iteration the CVS values stored in the accumulator (start, end, duration)
1,2,2
1,4,4
1,4,4
5,6,6
7,7,7
9,10,9
11,11,10
11,11,10
12,12,11
14,14,12
14,15,13
Because on every iteration, we need to track three values: start date, end date, and duration, but REDUCE
function iterates over a single element of the input array. We are going to build a record in string format via Comma-separated-value format (CSV) and then extract the information back in every iteration. In order to do that we define the following two user LAMBDA
functions:
SPLIT, LAMBDA(x, 1*TEXTSPLIT(x,","))
JOIN, LAMBDA(y, TEXTJOIN(",",,y))
Notice because we are dealing with numbers we ensure the SPLIT
function returns numbers, not texts as it is the output of TEXTSPLIT
. We do the conversion by multiplying the output of TEXTSPLIT
by 1
.
The accumulator (acc
) was initialized as all values equal to zero (0,0,0
), so for the first iteration we just assign the information from the first row (start, end, and duration), then we iterate following the logic explained before and use variables to extract back the information stored in CSV format.
Array Version
Here is the formula for the array version, that returns the unique days for all the campaigns:
=LET(set, $W$2:$Y$21, campaings, INDEX(set,,1), campaingsUx, UNIQUE(campaings),
SPLIT, LAMBDA(x, 1*TEXTSPLIT(x,",")), JOIN, LAMBDA(y, TEXTJOIN(",",,y)),
byRowResult, BYROW(campaingsUx, LAMBDA(campaing, LET(subset,
SORT(FILTER(set, campaings=campaing),2),starts, INDEX(subset,,2),
ends, INDEX(subset,,3),intervals, starts&","&ends,
calc, REDUCE("0,0,0", intervals, LAMBDA(acc,i, LET(accValues, SPLIT(acc),
startA, INDEX(accValues,1,1), endA, INDEX(accValues,1,2),
duration, INDEX(accValues,1,3), values, SPLIT(i),
startB, INDEX(values,1,1), endB, INDEX(values,1,2),
IF(duration=0,JOIN(HSTACK(startB, endB, endB-startB+1)),
IF(AND(startA <= endB, endA >= startB),
JOIN(HSTACK(MIN(startA, startB), MAX(endA, endB),
IF(endB > endA, duration + (endB-endA), duration))),
JOIN(HSTACK(startB, endB, duration + endB-startB+1))))
))), INDEX(SPLIT(calc),1,3)
))), HSTACK(campaingsUx, byRowResult)
)
And the output will be:
campaign_a 177
campaign_b 177