0

Example:

This is my data:

enter image description here

I have three formulas (at D1,E1,F1) which gives the exact same output.

D1 =ARRAYFORMULA(unique(FILTER($A:$A&$B:$B,$A:$A<>"",$A:$A<>0)))

E1 =ARRAYFORMULA(unique(FILTER($A:$A&$B:$B,($A:$A<>"")*($A:$A<>0))))

F1 =ARRAYFORMULA(unique(QUERY({$A:$A&$B:$B,$A:$A},"SELECT Col1 WHERE Col2<>0 AND Col2 IS NOT NULL",1)))

How do I compare the efficiency of these three functions?

Google Sheet (Please click this link to edit the Google Sheet)
Google Sheet (Please click this link to make a copy of the Google Sheet)

Also, I already have given a try to this similar question answer but didn't exactly get a clear answer.
Meaning, I ran each formula for ten times on the same data each output was not only different but had a lot of variations like:

D1 formula gave: 587,618,727,576,456,306,569,314,688,308
E1 formula gave: 574,319,276,553,560,299,553,554,326,552
F1 formula gave: 434,304,291,317,371,301,762,558,268,348

If you see the bold numbers, you'll see there is a difference of almost double time. How could same formula with same data set give such variations?

I ran 10 times each function thrice and all three times average had different values, Query formula average even after variations was slightly less than other two formulas all three times maybe I could conclude Query formula is faster but still it's not a 100% clear conclusion.

Like this, I literally have 100s of more complex formulas in my sheet & every time I come up with a new one I always a substitute one too, but there is no way I could compare any. It's been a really long time that I am trying to find a way for comparison but couldn't find any clear solution yet.

Please help me with a way which provides a clear comparison of the efficiency of Google Sheet formulas.

Aashit Garodia
  • 330
  • 1
  • 11
  • 1
    First, about `I already have given a try to this similar question answer but didn't exactly get a clear answer.`, I deeply apologize for this here. About your question, I have to apologize for my poor English skill. Unfortunately, I cannot understand your question and also I cannot understand `a clear comparison of the efficiency of Google Sheet formulas.`. Can I ask you about the detail of your question? – Tanaike Mar 09 '23 at 08:54
  • @Tanaike I am so glad that you came across my question & please there's no need to apologize, I am sure you would have a solution to my question! – Aashit Garodia Mar 09 '23 at 09:01
  • Thank you for replying. I would like to support you. Because I think that a new approach might be able to be obtained from your question. So, first, I would like to correctly understand your question. – Tanaike Mar 09 '23 at 09:06
  • `a clear comparison of the efficiency of Google Sheet formulas` means nothing but which function is faster as compared to other functions & yes, please be free to ask me any detail of my question. (My English is also not so good, it's not my first language) – Aashit Garodia Mar 09 '23 at 09:08
  • 1
    May be try increasing your data set size a lot and then see if you consistently get better run times with any one formula compared to others. If not, then I guess all your formulas have similar efficiency and any one can be used if it serves the purpose. – ab.it.gcp Mar 09 '23 at 11:37
  • What about the first answer there by kishkin? – TheMaster Mar 09 '23 at 13:32
  • 1
    I'm proposing closing this question in favor of the duplicate linked, so that all benchmarks answers are posted in one question. Any new ideas or answers maybe posted there. This question lacks 1. Clear data from all answers there and 2. As @ab.it.gcp says, the sample size might be too small to provide any valuable information and explain the differences noted. If any one has any objections, let me know. – TheMaster Mar 09 '23 at 13:38
  • Related: https://stackoverflow.com/a/74102047/ – TheMaster Mar 09 '23 at 13:44
  • @TheMaster Hi, I request you to please not close my question because the answer is still not clear to me, and it took a lot of efforts & time for me to come up to this question. I had already mentioned in my question that I had tried this similar question answer, which I understood, but it didn't help. – Aashit Garodia Mar 10 '23 at 09:34
  • Do you mean I should try other answers like the `=BENCHMARKN(LAMBDA(funcToCompare()), 10)` by kishkin or the `The Heterogram & TheMaster benchmark` methods mentioned in this sheet: https://docs.google.com/spreadsheets/d/1LLV1BvhOEimHt8V-Qa_8sTwf0-2hLPkhvWpspOTLGG0/edit#gid=631945707 ? – Aashit Garodia Mar 10 '23 at 09:34
  • @AashitGarodia Your question is still visible to potential viewers. They can add a answer to the linked duplicate(I added one). 1. Yes, you need to try all the answers there. 2. Increase the sample size of the data to say, 100000 rows and compute the results from Tanaike's answer. 3. To reopen this question(I don't recommend it!), this question needs to be different than the original - say, you can ask why are all the results so similar? (The answer would probably be Sample size) If you can edit your question to prove to the community it's not a duplicate, it may be reopened by the community. – TheMaster Mar 10 '23 at 22:52
  • @Tanaike Did you find anything? – Aashit Garodia Mar 16 '23 at 19:38
  • @TheMaster I don't understand either of the answers Heterograph Benchmark or kishkin or TheMaster, But still I tried replicating it. I am getting error on TheMaster & kishkin's formulas. Please have a look: https://docs.google.com/spreadsheets/d/150DQAnH25lK9rsh75XDa9qavtooofoAt9UNbkpmOuvM/edit?usp=sharing . I think I have understood TheMaster solution: `=IF(AND(E1, INDIRECT("D"&F1)<>""),NOW())` In this we assume that first the AND func is completely processed, meaning INDIRECT func is processed no matter how much time it takes & only after its completion, NOW() is processed(). Am I right? – Aashit Garodia Mar 16 '23 at 22:48
  • @TheMaster Question (1): In your formula `INDIRECT("D"&F1)<>""),NOW())` we check the last row why? The result comes to all cells at the same time right? In your sheet https://docs.google.com/spreadsheets/d/1LLV1BvhOEimHt8V-Qa_8sTwf0-2hLPkhvWpspOTLGG0/edit#gid=631945707 Your formula `=IF(AND(B1, INDIRECT("B"&A2)<>""),NOW())` always give same results even when we check the first row itself i.e. `=IF(AND(B1, INDIRECT("B"&4)<>""),NOW())` . Please tell me if I am wrong ? – Aashit Garodia Mar 16 '23 at 23:10
  • @TheMaster Question (2): In Heterogram though I do not understand the whole formula at all at the moment but still what is the use of `0* $A1` if $A1 is multiplying with 0 only, I mean the multiplication is always going to be 0 right? So why don't we just write 0 instead of `0* $A1` ? Also, can you please tell me how can I understand Heterograph Benchmark or kishkin Formulas ? Because I really have no clue about this Lamda or Reduce functions, Please Help! – Aashit Garodia Mar 16 '23 at 23:19
  • @AashitGarodia I assumed the last row is filled last. But, if all of it is filled simultaneously, then we can also use the first row. Regarding Heterogram, I'm not the author. You'll have to ask doubleunary or ask a new question, if interested. If you're having trouble implementing kishkin's formula, you can ask a new question as well. But, like i said, Tanaike's script should give consistent results, if you increase sample size to say, 10000 or 100000. Try increasing sample size first. – TheMaster Mar 17 '23 at 07:16
  • @TheMaster I am not able to implement your formula too, please check the sheet: (TheMaster Benchmark Tab) https://docs.google.com/spreadsheets/d/150DQAnH25lK9rsh75XDa9qavtooofoAt9UNbkpmOuvM/edit?usp=drivesdk . Also I had already increased the sample size please check my sheet, I am not able to use script coz I'm traveling so can't check Tanaike's answer at the moment. Please help me with your formula at least! – Aashit Garodia Mar 17 '23 at 19:37
  • Your formula entry in F4 is incorrect. It is `="=ARRAYFORMULA(unique(FILTER(A:A&B:B,(A:A<>"")*(A:A<>0))))"` It should be something like `'=ARRAYFORMULA(unique(FILTER(A:A&B:B,(A:A<>"")*(A:A<>0))))`. BTW, you can ask a new question on how to implement this(- with clear details of what you need to do and what you did), if you still couldn't do it. That would be a different question and a valid one. So, what if you're traveling? Finish your travel and try Tanaike's script?? – TheMaster Mar 18 '23 at 04:22
  • I tried this: `'=ARRAYFORMULA(unique(FILTER(A:A&B:B,(A:A<>"")*(A:A<>0))))` this also doesn't work, Can you please at least just help me with implementing just your formula, I have literally did the same thing? Yes'll finish my travel tomorrow and try Tanaike's script! – Aashit Garodia Mar 18 '23 at 19:10

0 Answers0