17

A friend and I had a long discussion about a single function. The function itself does not make much practical sense, but in my view it is incorrect.

The function is the following:

//get tomorrows date
int getTomorrowsDate(){
   sleep(1*60*60*24);
   return getCurrentDate();
}

If I execute the function and wait for the result, the result is already wrong by the time it arrives, because what was tomorrow has by then become today.

After discussing this for a long time, my friend holds the position that a function is correct if its result is correct at the time it is called, while I take the opposite view: a function is correct if its result is correct at the time it returns. Could someone please explain why my view is wrong? I do not understand it.

klutt
snapo
  • To me it sounds like the following thought experiment: Your friend has written "Today is " on a piece of paper, and he is claiming - "I have written a true statement!". And he is right. You look at this paper the next day and you say "No, you have written a false statement!" - and you are right too. Or you are both wrong, as the truth of the statement depends on hidden premises. – Eugene Sh. Jul 16 '20 at 18:58
  • This is an example of the principle that _external dependencies_ (such as the system clock) should be treated as parameters. "Tomorrow" isn't a meaningful concept for a timeless mathematical function; instead, "input-date plus one day" is more expressive. (Obviously, at some edge point you'll do something like `clock.now()`, but at that point you are explicitly fixing the reference.) – chrylis -cautiouslyoptimistic- Jul 17 '20 at 04:11
  • I don't think I fully understand the problem. From my point of view the function is simply incorrectly named/described. It does something else than expected. Also I don't know what execution and result times are, and what's the difference between them. – freakish Jul 17 '20 at 06:42
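
The comment above about treating external dependencies as parameters could be sketched in C roughly as follows. This is only an illustration: `next_day` is a hypothetical helper, not something from the question, and it relies on the standard `mktime` normalizing an out-of-range `tm_mday`.

```c
#include <time.h>

/* Hypothetical helper: returns the calendar day after `day`.
   The reference date is a parameter; the function never reads the clock,
   so its result does not depend on when it is called or how long it runs. */
struct tm next_day(struct tm day) {
    day.tm_mday += 1;   /* may run past the end of the month; mktime normalizes it */
    day.tm_isdst = -1;  /* let mktime work out daylight saving time */
    mktime(&day);
    return day;
}
```

The clock would then be read exactly once at the edge of the program, for example with `time(NULL)` and `localtime`, and that fixed reference would be passed in.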

4 Answers

18

As David Schwartz says, status reporting operations such as getting disk free space, getting file size, checking if a file exists, etc., are fundamentally unreliable. A good way to think about them is that they return good faith estimates of their measurements, but there's the caveat that callers must not rely upon the values being correct as they could change at any time before, after, or during the measurement.

There's also the matter of requirements, both stated and unstated. Pretty much every function you write will have an unstated performance requirement that it not take 24 hours to execute. Certainly that's true for getting a date. What does it matter what you call the result of this function when the function is completely unusable? It's a purely academic distinction. In practical terms, the function is broken, and no result is ever correct.

To go further, I'm not even sure it's an academic distinction. You're asking if the "execution time" or the "result time" is correct. To be "correct" implies that the distinction will lead to either a right or wrong action and you need to know which it will be. What would those actions be?

If I'm right then I would do X, but if my friend is right I'd do Y.

What are actions X and Y? What would you do differently based on how this argument is resolved? For example, if this were a real issue in a ticketing system, would you end up closing the issue based on the outcome of your argument—neither of you fixing the 24-hour sleep? I hope not!

John Kugelman
  • I am saying this function is usable, and the only question here is about its naming. Or I am completely missing its point. Name it "Wait24HrsAndReturnTheDate" and the question is gone. – Eugene Sh. Jul 16 '20 at 18:34
  • Thank you very much for an explanation I understand (or at least I think I understand it). @EugeneSh. That would be like changing the code; the function's name has to stay as it is ;=) – snapo Jul 16 '20 at 18:38
  • @snapo My point is that your question seems to be about ambiguity of the function name - as it might not be reflecting its real functionality. Since it is your question, you might know better what it is about though. – Eugene Sh. Jul 16 '20 at 18:40
  • "*Pretty much every function you write will have an unstated performance requirement that it not take 24 hours to execute.*" - Is this true for `main()` too? Because when I leave my computer turned on and don't exit anything, the application will still be working when I come back in 24 hours. – RobertS supports Monica Cellio Jul 16 '20 at 18:41
  • Damn, I'd hoped adding "pretty much" would prevent that particular nit from being picked. – John Kugelman Jul 16 '20 at 18:51
  • Can't help but think about the [sleep sort](https://stackoverflow.com/questions/6474318/what-is-the-time-complexity-of-the-sleep-sort) – Eugene Sh. Jul 16 '20 at 19:03
  • I don't agree that it helps to consider status reporting ops as good faith estimates. They are *exactly* correct (not estimates) at the time the measurement is made. It is the fact that the status can continually vary that is important. – Steve Kidd Jul 21 '20 at 21:55
  • @SteveKidd What if measuring isn't atomic? For example, calculating disk usage by adding up the sizes of files in a directory. – John Kugelman Jul 21 '20 at 23:56
  • @JohnKugelman I assume that, at kernel level, atomic measurements/actions are needed. E.g. remaining disk space can be reserved for a file or it can't. It must not be possible for two processes to allocate the same space. E.g. multiple processes writing to a single file would need atomic updates to file size. That's why I expect stats to be exact at the point they are made in the kernel, because I expect them to need to be atomic. Of course, amalgamated data (perhaps listing the files in a specified directory) meet your good faith estimate statement so I understand and accept your point. – Steve Kidd Jul 24 '20 at 18:05
8

A status-reporting function's result is correct if it was valid at some point between when the function was called and when it returned. This is the case for all status-reporting functions such as getting disk free space, getting file size, and so on.

At some point it has to get the status, and it can't do anything about the status changing before it gets it or after it gets it. So this has to be the rule or it would be impossible to write correct functions.

People often get this wrong. For example, they check for free space on a disk and then assume a subsequent write won't fail due to insufficient space. Or they call select and then assume a subsequent operation won't block. These are all mistakes.
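
As a rough C illustration of the alternative to check-then-act: rather than asking for free space first, attempt the write and handle the failure. `write_all` is a hypothetical helper name; the sketch assumes a POSIX system and uses only `write` and `errno`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: writes all `len` bytes of `buf` to `fd`.
   Returns 0 on success, or -1 with errno set by write() (for example
   ENOSPC when the disk is full). No free-space check is made beforehand,
   because such a check could be stale by the time the write happens. */
int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted by a signal: retry */
            }
            return -1;      /* real failure: the caller inspects errno */
        }
        p += n;             /* partial write: advance and keep going */
        len -= (size_t)n;
    }
    return 0;
}
```

The caller treats the return value, not an earlier status query, as the source of truth about whether the data was written.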

David Schwartz
  • If just before getCurrentDate() returns (i.e. it is on the "ret" instruction), the process is swapped out and does not resume for 24 hours, did it ever produce the correct result? – mevets Jul 16 '20 at 18:02
  • @mevets Yes, because that was the current date at some point in-between when the function was called and when it returns. – David Schwartz Jul 16 '20 at 18:03
  • With that said, the approach of the function presented in the question is terrible, even if it were corrected for the factor-of-1000 error, to the point of making the function useless. That is also what gives rise to the dispute the OP describes. – John Bollinger Jul 16 '20 at 18:04
  • The reason I personally don't agree is that, from the code's perspective, the sleep blocks until getCurrentDate() is called, and the function therefore delivers the wrong date. Is this assumption wrong even though it delivers the wrong date? – snapo Jul 16 '20 at 18:07
  • @snapo Suppose it was instead implemented as `int tomorrow = getCurrentDate() + 1 day; sleep(1 day); return tomorrow;`. Now some internal variable inside the function contains the "right" answer at some point in time, but the visible result from outside the function is exactly the same, so this implementation is equally (in)correct as the original one. – Thomas Jul 16 '20 at 18:12
  • @Thomas, for this case you are right, but the code I execute has a getCurrentDate() call that is never triggered on today's date. So it would be wrong until it executes the next day, and because it is already the next day by then, the result would also be wrong. I am still trying to get my head around David Schwartz's answer. I completely understand that requesting disk space never gives an accurate result, and likewise a requested time is never accurate. But being wrong right from the start is maybe too strange for my brain... – snapo Jul 16 '20 at 18:16
  • @DavidSchwartz As OP already pointed out in the previous comment, it would really be better if you focused more on the given example instead of being so generic here. Just my personal thought. – RobertS supports Monica Cellio Jul 16 '20 at 18:18
  • @RobertSsupportsMonicaCellio The example is, at least to me, much less interesting than the general principle. The application of the principle to the example seems quite trivial to me. I find it very hard to believe that anybody particularly cares about this particular example since it's so unrealistic. I have to think the principles that it illuminates are what matters to any reasonable person. – David Schwartz Jul 16 '20 at 20:33
4

Consider this pseudo code:

fun getTomorrowsDate()
   sleep(getRandomValue())
   today = getCurrentDate() 
   sleep(getRandomValue())
   ret = today + 1
   sleep(getRandomValue())
   return ret

This is not very far from what is actually happening during EVERY function call. The operating system might interrupt at any time, so in a sense, those sleep calls actually do exist.

So unless you have taken very careful steps to make your function atomic, the only difference between the pseudo code above and your code example is this: a delay that always has a non-zero probability of happening has, in your version, been given a probability of 100%.

David and John gave good answers, so I will not elaborate more. I just wanted to add this example.

klutt
0

In my opinion, this function has the semantics of getCurrentDate and could keep that name, because getCurrentDate is expected to return some date that is guaranteed to occur between the moment of the call and the moment of the return. (Think of what happens if you call the standard function ten nanoseconds before midnight, so that the date changes during the call.)


By the way, I don't even know if the existing implementations enforce the above "betweenness" requirement. (For example, the rule could break if the function attempted to compensate for its own latency.)
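
One way to probe the "betweenness" requirement, sketched in C under the assumption that the status being reported is a `time_t` and that `time(NULL)` is an acceptable reference clock. All names here (`reported_within_call`, `check_betweenness`, `wall_clock`) are made up for the illustration.

```c
#include <stdbool.h>
#include <time.h>

/* True if `reported` lies within [before, after]. */
bool reported_within_call(time_t before, time_t reported, time_t after) {
    return difftime(reported, before) >= 0.0 &&
           difftime(after, reported) >= 0.0;
}

/* Type of a status-reporting function that returns a timestamp. */
typedef time_t (*clock_fn)(void);

/* Samples the reference clock before and after calling `get_status`
   and checks that the reported value falls between the two samples. */
bool check_betweenness(clock_fn get_status) {
    time_t before = time(NULL);
    time_t reported = get_status();
    time_t after = time(NULL);
    return reported_within_call(before, reported, after);
}

/* Example status function: simply reads the wall clock. */
time_t wall_clock(void) {
    return time(NULL);
}
```

A function that slept for a day and then read the clock would still pass this check, which is the point made above: its result was the current date at some moment between the call and the return.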