1

Given an integer p and a destination base b, return a string representation of p in base b. The string should have the least significant bit at the end

^ This is the problem I'm giving myself.

The naive recursive algorithm (in C++) I came up with is as follows:

string convertIntToBaseRecursive(int number, int base) {
  // Base case
  if (!number) return "";

  // Adding least significant digit to "the rest" computed recursively
  // Could reverse these operations if we wanted the string backwards.
  return convertToBaseRecursive(number / base, base) + to_string(number % base);
}

While the algorithm is incredibly simple, I want to make sure I understand the complexity breakdown. My thoughts are below and I would like to know if they are correct, or wrong, and if they are wrong then knowing where I'm off track would be nice!

Claim:

  • n = logb(p) is the length of return string
  • Time complexity: O(n^2)
  • Space complexity: O(n)

Reasoning:

In order to keep the least significant bit at the end of a string when it is the value we calculate before anything else, we'd either have to:

  1. Compute the string recursively as we are
  2. Keep "shifting" the array every time we calculate a bit so we can add the most recent bit to the front of the string, not the end
  3. Write the string backwards, and reverse it before we return (most efficient)

We're doing the first method in the above C++ algorithm, and the + operator creates a new string at each stack frame. The initial frame creates and returns a string of length n, the next frame creates a string of length n-1, n-2, n-3, and so on. Following this trend (without going into a proof of why 1 + 2 + 3 ... + n = O(n^2), it is clear the time complexity is O(n^2) = O(logb^2(p)). We'll also only need to be storing O(n) things in memory at any time. When the original stack frame resolves (just before algorithm completes) we'll have the memory in terms of a raw string, but before it resolves it will be in terms of a single character (O(1)) + recursive stack frames (O(n)). We do this at each level storing n amounts of single characters until we complete. Therefore the space complexity is O(n).

Of course the more efficient version of this solution would be

string convertIntToBaseIterative(int number, int base) {
  string retString = "";

  while (number) {
    retString += to_string(number % base);
    number /= base;
  }

  // Only needed if least significant
  reverse(retString.begin(), retString.end());
  return retString;
}

I believe this above solution , where n = logb(p) has:

  • Time complexity: O(n)
  • Space complexity: O(n)

Are these analysis correct or am I off somewhere?

Dominic Farolino
  • 1,362
  • 1
  • 20
  • 40

2 Answers2

0

Since the return value has to contain the output, you cannot get a better space complexity than O(n).

Suppose the output string is composed of the following digits in order: a_1, a_2, a_3, ..., a_n. In the recursive approach, we create a string as follows:"a_1" + "a_2" + .... + "a_n". In the iterative approach, we do: (((...(a_1) + a_2) + a_3) + ... + a_n)))...). Hence the time complexity in both cases should be the same at O(n^2) (in C++03. See note below for C++11).

As you see, both approaches have been heavily influenced by implementation details. The string type is not very useful for operations that involve repeated concatenation. If you have a preallocated array of size n, you could get the complexity down to O(n).

Note 1: There is some detail about the append operation. In C++03, the append operation had no specified complexity and could lead to Copy-On_write (if the string could not be extended in place and a relocation was needed). In C++11 CoW and rope-style implementations are disallowed and append should lead to amortized O(1) time per character. Hence in the C++11 case, we should be able to get O(n) time complexity for both implementations.

Note 2: To get a O(n) time complexity with user-defined string implementation (which contains the length), the string needs to be passed by reference in the function. This will lead to the function signature getting changed to:

void convertToBaseRecursive(int number, int base, MyString& str)

This implementation will let the string be shared and updated in-place provided the string uses an array that is pre-allocated.

user1952500
  • 6,611
  • 3
  • 24
  • 37
  • Why is the iterative approach `O(n^2)`? My understanding is that the `+=` operator in later versions of C++ (possibly earlier too?) acts as a vector and doubles capacity each time making each `+=` call amortized `O(1)`, which when done `O(n)` times = `O(n)` Time complexity right? – Dominic Farolino Jan 15 '17 at 02:39
  • You are right if the size doubles. However, the number and order of building the string should be identical in both approaches. In the recursive approach, we will not start building the string until the whole recursion has been unwound. And when it is completely unwound, we will evaluate it left to right leading to the same identical set of operations. – user1952500 Jan 15 '17 at 02:46
  • I'll update the answer based on the algorithm used for concatenation after I take a look at it. – user1952500 Jan 15 '17 at 02:46
  • BTW, a `+=` is not free because, unless the length of the string is maintained, there is a loop to get to the end of the string. That leads to the `O(n^2)`. – user1952500 Jan 15 '17 at 02:48
  • Ahh good call. So with a string, there's really no way to achieve better than quadratic (really `O(logb^2(p))`) time. A stack would be nice, and then we could take the length of the stack, and fill a pre-allocated string (or something) with its contents. Thanks! – Dominic Farolino Jan 15 '17 at 02:54
  • Also check the note that I added. In C++11, this is optimized. – user1952500 Jan 15 '17 at 02:56
  • Assuming you got some of that info from this (http://stackoverflow.com/a/15082239/3947332) answer - check out the last para. Does this mean that any optimizations it makes are nullified since append "has to walk the length.....", seemingly making it an `O(n)` operation anyways? – Dominic Farolino Jan 15 '17 at 03:02
  • Also I still disagree that the recursive algorithm could ever get to `O(n)`. This is because the `+` operator used creates a new string every time, taking `O(n)` time for each string. Since we're creating `n` strings (with decreasing size), it can never get any faster than `(n^2)/2 = O(n^2)` time right? – Dominic Farolino Jan 15 '17 at 03:04
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/133175/discussion-between-user1952500-and-dom-farolino). – user1952500 Jan 15 '17 at 03:05
0

Note:

Given the chat room conversation with @user1952500 I had a couple edits to make to his answer based on what we talked about. The following is an edited version of his answer reflecting the latest of what we talked about and what I learned:


Edited answer:

Since the return value has to contain the output, you cannot get a better space complexity than O(n).

Suppose the output string is composed of the following digits in order: a_1, a_2, a_3, ..., a_n. In the recursive approach (bullet #1), we create a string as follows "a_1" + "a_2" + .... + "a_n", which with recursion yields O(n^2) time complexity. In bullet #2, the iterative approach constantly pushes characters to the front of the string like (((...(a_1) + a_2) + a_3) + ... + a_n)))...) which shifts the entire string on every character addition also yielding O(n^2) time complexity. On your written iterative approach (bullet #3) the time complexity can be optimized depending on the version of C++ (See notes below).

The string type is not very useful for operations that involve repeated concatenation. In older versions of C++ you could achieve O(n) time complexity by preallocating a string of size n. In C++11 this answer indicates that certain append operations can be optimized to be amortized O(1) for a single character. Assuming this is true the written out iterative version will have O(1) time complexity without any extra work.

Note: To get O(n) time complexity with the recursive version of this algorithm, we could take advantage of the amortized O(1) character appends and use a single string passed by reference. This would entail the recursive version's function signature to be re-written as follows:

void convertToBaseRecursive(int number, int base, string& str)

Community
  • 1
  • 1
Dominic Farolino
  • 1,362
  • 1
  • 20
  • 40
  • You can click the "edit" button on the first answer. (Then s/he can approve the edits.) – Potatoswatter Jan 15 '17 at 04:16
  • @Potatoswatter only reason I didn't is because I had too many edits to make thus making it a lot of my own wording. Should I still do that? I felt that edits were too substantial – Dominic Farolino Jan 15 '17 at 04:19
  • It can go either way. They still look similar at a glance. Also you can ask user195 to enable "community wiki" mode on the answer which opens it to free editing. – Potatoswatter Jan 15 '17 at 07:09