Trying to concatenate dataframes with numpy and recursion

Question

I have a list of pandas dataframes, which I am trying to concatenate into one dataframe using recursion and numpy.

def recur(framelist, index=0, result=0):


    if index == len(framelist)-1:
        return result

    else:
        return recur(np.concatenate((framelist[index],framelist[index+1])))

My intention with the above is to pass the dataframe list to the recur function. The base case is when the end of the list is reached. The recursive step is meant to concatenate successive pairs of dataframes.

However, I get an error saying that 0 dimensional arrays cannot be concatenated.

First, why do you want to use recursion instead of just concatenating with a straightforward pd.concat([...])? And second, where do you call `recur` in your `recur`? — SomeDude, May 04 '22 at 22:04
I just want to practise/challenge myself. Sorry, that was a typo: where I wrote frame, it is recur in my code. I've edited the question. — Prolle, May 04 '22 at 22:09
Be aware of `maximum recursion depth`: https://stackoverflow.com/questions/3323001/what-is-the-maximum-recursion-depth-in-python-and-how-to-increase-it — MSH, May 04 '22 at 22:12
1) Recursion, while possible in Python, is not preferred. 2) You show a function, but not a [mcve]. 3) You mention an error, but don't show the full error with traceback. In other words, where does the error occur, and on what inputs? — hpaulj, May 05 '22 at 04:49

pcoates · Accepted Answer · 2022-05-05T17:15:30.560

To work out what's going on it's a good idea to walk through it step by step.

You say your initial call to recur passes in a list of pandas dataframes. You don't show how they are created, but let's say they're something like...

import numpy as np
import pandas as pd

# Four single-column dataframes of different lengths.
framelist = [
    pd.DataFrame(np.array([1, 2, 3])),
    pd.DataFrame(np.array([4, 5])),
    pd.DataFrame(np.array([6, 7])),
    pd.DataFrame(np.array([8])),
]

So, the first time through, it concatenates the first two entries from framelist as numpy arrays.

[[1], [2], [3]] and [[4], [5]]

This will result in a numpy ndarray which looks like:

[[1], [2], [3], [4], [5]]

This result is passed into recur() as the new framelist.

Second time through, it concatenates the first two entries of this new framelist, which are now the first two rows of that array.

[1] and [2]

This will result in a numpy array which looks like:

[1, 2]

This result is passed into recur() as the new framelist.

Third time through, it concatenates the first two entries of this framelist.

1 and 2

These are simply numbers, not arrays, so you see the error '0 dimensional arrays cannot be concatenated'.
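The three steps above can be reproduced by hand, without recursion, to watch the data shrink from dataframes to rows to scalars. This is only a sketch using the example framelist assumed earlier; the names step1, step2 and error_message are made up for illustration.

```python
import numpy as np
import pandas as pd

framelist = [
    pd.DataFrame(np.array([1, 2, 3])),
    pd.DataFrame(np.array([4, 5])),
    pd.DataFrame(np.array([6, 7])),
    pd.DataFrame(np.array([8])),
]

# Call 1: two dataframes go in, a plain 2-D numpy array comes out.
step1 = np.concatenate((framelist[0], framelist[1]))
print(type(step1).__name__, step1.shape)

# Call 2: step1 is now treated as the "list", so its first two rows are joined.
step2 = np.concatenate((step1[0], step1[1]))
print(step2.shape)

# Call 3: the first two entries of step2 are scalars, so numpy raises ValueError.
try:
    np.concatenate((step2[0], step2[1]))
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

Each call loses one level of nesting because the recursive call receives the concatenated result rather than the original list.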

Here's an example of how to do the concatenation with recursion. You don't need to keep track using any kind of index parameter. Just keep taking the first frame off the list and passing the remainder into recur. When you get to the point where there's only 1 left in the list, that one gets passed back up and concatenated with the previous one. The result is passed back up and concatenated with the one before that, and so on.

def recur(framelist):
    # keep going until there's just 1 left.
    if len(framelist) == 1:
        return framelist[0]

    return np.concatenate((framelist[0], recur(framelist[1:])))

print(recur(framelist))
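Note that np.concatenate returns a numpy array, not a dataframe. If a dataframe is wanted at the end, one possible variation is to use pd.concat for the recursive step instead. The sketch below assumes the same example framelist; recur_frames is a hypothetical name, and ignore_index=True is a choice made here so that the rows are renumbered from 0.

```python
import numpy as np
import pandas as pd


def recur_frames(framelist):
    # Base case: a single frame needs no concatenation.
    if len(framelist) == 1:
        return framelist[0]
    # Join the first frame with the recursively concatenated remainder.
    rest = recur_frames(framelist[1:])
    return pd.concat([framelist[0], rest], ignore_index=True)


framelist = [
    pd.DataFrame(np.array([1, 2, 3])),
    pd.DataFrame(np.array([4, 5])),
    pd.DataFrame(np.array([6, 7])),
    pd.DataFrame(np.array([8])),
]

result = recur_frames(framelist)

# The non-recursive call suggested in the comments, for comparison.
expected = pd.concat(framelist, ignore_index=True)
print(result.equals(expected))
```

Like the numpy version, this raises an IndexError on an empty list, and each level of recursion copies the data again, so pd.concat(framelist) on its own remains the simpler option.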

@Prolle, I've just added example code that concatenates a list using recursion, in case it's useful. Although, as others have already said, it's not a good solution for this problem. In fact, recursion rarely is the answer. — pcoates, May 05 '22 at 17:17
