2

I'm trying to find out if the following code piece is good or bad practice. It's about a html query string that should be parsed by my API. It's very convenient to use recursion to trim an arbitrary amount of '?' off the query string.

However, I'm wondering if this could potentially result in a stack overflow due to the uncontrollable recursion depth. My hope is that such cases are guaranteed to be tail-optimized but I'm not sure about it. Is there such guarantee?

Demo

#include <string_view>
#include <cstdio>

static auto digest_query(std::string_view query) -> void
{
    if (query.front() == '?') {
        // printf("%.*s\n", (int)query.size(), query.data());
        return digest_query(query.substr(1));
    }
    // Do other stuff...
}

int main()
{
    digest_query("???????key=value");
}
463035818_is_not_an_ai
  • 109,796
  • 11
  • 89
  • 185
glades
  • 3,778
  • 1
  • 12
  • 34
  • Looks like a better place to use a conventional loop to me. `size_t end = 0; for ( ; query[end] == '?'; end++) {} query = query(end); //Do other stuff` – user4581301 Feb 10 '23 at 08:07
  • No, for this use case, I would not recommend recursion. If it is a school project, then maybe. But never in any productive code. This is unfortunately neither secure nor safe. Also, not efficient for such a simple task. – A M Feb 10 '23 at 08:08
  • 2
    any code that is could be tail-recursion optimised (note that there is no guarantee this'll happen) can be replaced with a loop. https://stackoverflow.com/questions/34125/which-if-any-c-compilers-do-tail-recursion-optimization – Alan Birtles Feb 10 '23 at 08:09
  • 3
    There are no such guarantees in C++. It's also a bit tricky to ensure that function calls are in tail position in C++ due to the presence of destructors. – molbdnilo Feb 10 '23 at 08:17
  • If some code is tail-recursive AND it has a reachable end condition that reliably keeps the recursion finite AND if the compiler takes full advantage of that, then the code is likely to be translated into an iterative form - which would not be subject to problems of stack overflow. If you have reason to worry about such things, implement a loop to explicitly handle/trim repeated characters (e.g. a loop to strip all consecutive `'?'`) - which is feasible whether your code is recursive in other scenarios or not. – Peter Feb 10 '23 at 08:23
  • Asking about "good design" is in the grey area around "opinion-based". You might want to instead ask if your hope is well-grounded. (But don't bet too heavily on it. Last I heard, the only guaranteed optimizations are forms of copy elision.) – JaMiT Feb 10 '23 at 08:26
  • convenience is [`std::find` or `std::find_if_not`](https://en.cppreference.com/w/cpp/algorithm/find) – 463035818_is_not_an_ai Feb 10 '23 at 08:26
  • If you want opinions then auto foo(Blah blah) -> void {} looks more obfuscation than convenience. – Öö Tiib Feb 10 '23 at 08:28
  • @JaMiT actually I don't think its grey. I say its bad design, someone else might say good, and thats fine. "good" is opinions – 463035818_is_not_an_ai Feb 10 '23 at 08:30
  • 1
    If my edit is ok, then this looks like a duplicate: https://stackoverflow.com/questions/34125/which-if-any-c-compilers-do-tail-recursion-optimization edit: or not, because its only about the possibility not about the guarantee – 463035818_is_not_an_ai Feb 10 '23 at 08:32
  • funny, top answer suggests "An easy way to check if the compiler did the optimisation is to perform a call that would otherwise result in a stack overflow — or looking at the assembly output." =) – 463035818_is_not_an_ai Feb 10 '23 at 08:37
  • Although there are no explicit rules about when you can and can not use recursions, there are rules about when you should or shouldn't use recursions. This particular scenario does not seem to require a recursion, it can be done 10 different ways. As some users already suggested, tail-recursion optimized code can (and most of the times should) be replaced by a loop. – Dimitar Feb 10 '23 at 08:38
  • 1
    @463035818_is_not_a_number I am willing to grant "grey" in this case because the author did single out a concern about stack overflow. If it was just about good/bad design, yes it is opinion-based. But when the question comes with an objective criterion for good/bad, there might be a basis for answers that are more than just opinion-based. In this case... well, the question's been changed, so the issue is moot. :) – JaMiT Feb 10 '23 at 08:38
  • 1
    You may want to present your code on codereview.stackexchange.com, too. It looks horribly inefficient to me to repeatedly copy that string. – Ulrich Eckhardt Feb 10 '23 at 08:54
  • @AlanBirtles Not just tail recursive code: *all* recursion can be replaced by a loop. This is a fundamental theorem of computation. But *should* it be done? Does it lead to better code? Often, no. – Konrad Rudolph Feb 10 '23 at 08:55
  • @KonradRudolph OK, to be more explicit, tail recursive code can be replaced by a simple for loop, other recursive patterns might need the use of a stack (essentially emulating the recursion) – Alan Birtles Feb 10 '23 at 10:12

2 Answers2

2

Is there such guarantee?

No there isn't.

n. m. could be an AI
  • 112,515
  • 14
  • 128
  • 243
0

The recursion depth is not "uncontrolled", it is exactly equal to the number of leading '?'. Every level will host a string variable on the stack, but the string data itself is allocated on the heap.

So there is absolutely no risk of overflow, but this code is so inefficient ! It will involve - recursive calls, - string allocation/deallocations, - string copies. All of this perfectly useless. I call it disaster.

Should I add that I find it pretty unreadable (so unexpected) compared to a regex query or a straightforward loop ?

  • "it is exactly equal to the number of leading '?'" Which is uncontrolled. "string allocation/deallocations" There are no string allocations or deallocations in this fragment. – n. m. could be an AI Feb 10 '23 at 09:32
  • @n.m.: wrong: substring returns a newly constructed string object with its value initialized to a copy of a substring of this object. –  Feb 10 '23 at 09:36
  • There are no strings here, only string views. – n. m. could be an AI Feb 10 '23 at 09:47
  • @n.m.: I missed that, my bad. That lessens my diatribe. –  Feb 10 '23 at 10:07
  • The recursion is uncontrolled in a sense that this walks over user-input. And the user could test the guts of this API by providing some arbitrary large ??? string in theory. There's no string allocation, just forwarding of some iterators/pointers (string_view also contains a member function substr). Still the recursive calls could be potentially slow. Idk why you find that unreadable though? – glades Feb 10 '23 at 10:30
  • @glades: I don't believe in the arbitrary large theory. It takes significant effort to figure out that processing starts by skipping the '?', not counting that this is followed by "other stuff" executed each time. –  Feb 10 '23 at 10:35
  • @YvesDaoust What you are describing as “significant effort” is a basic exercise for a penetration tester (or attacker). If the API accepts user input, it must not rely on security by obscurity of implementation details. Implementation details can be leaked, or inferred through probing with different input. – Konrad Rudolph Feb 10 '23 at 12:48
  • @KonradRudolph: my remark has nothing to do with security. Are you paranoid ? –  Feb 10 '23 at 12:56
  • @YvesDaoust But *this is a security issue* (because it is a potential exploit vector)! Anyway, if you were not talking about security, what does the "significant effort" refer to, then? – Konrad Rudolph Feb 10 '23 at 13:11
  • @KonradRudolph: I said readability. –  Feb 10 '23 at 13:11
  • 1
    @YvesDaoust Oh I see. Alright. I agree with this point actually. I thought your entire comment was responding to the "arbitrary large [user] input". At any rate, my comment stands as a response to *that part*: this is a potential exploit vector if user input can crash the application or cause UB via stack overflow. – Konrad Rudolph Feb 10 '23 at 13:13
  • *"So there is absolutely no risk of overflow"* -- not even if the string were to consist of `std::string_view::max_size()` question marks? How big is the call stack allowed to get on your system? – JaMiT Feb 11 '23 at 03:19
  • @JaMiT: I always set it to `std::string_view::max_size()+1`. –  Feb 11 '23 at 14:39