I am applying a function to a dataframe df. The function returns a dataframe int_df, but the result of apply is stored as a series.

df

  limit
0  4

new_df (expected)

  A  B
0 0  Number
1 1  Number
2 2  Number
3 3  Number

This is pseudocode of what I have done:

import pandas as pd

df = pd.DataFrame({'limit': [4]})

def foo(x):
    limit = x['limit']
    int_df = pd.DataFrame(columns=['A', 'B'])  # Create an empty dataframe

    # Append one new row to the dataframe per iteration
    for i in range(0, limit):
        int_df.loc[len(int_df.index)] = [i, 'Number']

    return int_df  # This is a dataframe

new_df = df.apply(foo, axis=1)
new_df  # This is a series (one dataframe per row of df), but I need a dataframe

Is this the right way to do this?

Animeartist
  • You're getting a Series of DataFrames. The function is applied to each row, and it returns a DataFrame for each row. – Nk03 Jul 28 '21 at 18:13
  • Could you [explain](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) what you are trying to do? You don't need `apply` to append rows. – Danila Ganchar Jul 28 '21 at 18:13
  • @DanilaGanchar, I have added an explanation. I need to take value from `df`, and use it inside function `foo()` to create another dataframe called `int_df`. `foo()` is supposed to return a dataframe but it returns a series. – Animeartist Jul 28 '21 at 18:19
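
The point made in the first comment can be checked directly. The following is a minimal sketch, assuming a one-row `df` like the one in the question and a simplified stand-in for `foo()` (the real function does more work that is not shown in the post). It only inspects the types that `apply` produces.

```python
import pandas as pd

# Sample input assumed from the question: a single row with limit = 4.
df = pd.DataFrame({"limit": [4]})

def foo(row):
    # Simplified stand-in for the asker's function: build one small
    # DataFrame per input row.
    limit = int(row["limit"])
    return pd.DataFrame({"A": list(range(limit)), "B": "Number"})

result = df.apply(foo, axis=1)

# The outer object is a Series with one element per row of df ...
print(type(result).__name__)
# ... and each element is the DataFrame that foo() returned for that row.
print(type(result.iloc[0]).__name__)
```

So `foo()` does return a dataframe each time. The series is only the container that `apply` uses to hold one return value per row.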

1 Answer

IIUC, here's one way:

df = df.limit.apply(range).explode().to_frame('A').assign(B='Number').reset_index(drop=True)

OUTPUT:

  A  B
0 0  Number
1 1  Number
2 2  Number
3 3  Number
Nk03
  • Thank you for the answer, but the code in the question is only pseudocode of my original code, so I need that function to return a dataframe. – Animeartist Jul 28 '21 at 18:22
  • @Animeartist Seems like you just want to explode the dataframe based on the 'limit' column and add a new column, which can be done via range/explode without any function. You can update your sample input/output dataframes to be more specific about the requirement. – Nk03 Jul 28 '21 at 18:24
  • Sorry, actually a few other operations are also performed to get the `int_df`. Not all of those steps are shown in the post. – Animeartist Jul 28 '21 at 18:25
  • Ok. So, there's one more option that may work for you. You can initialize an empty list. Then, instead of returning the df, you can add the `df` to that list. After that, use `pd.concat`. This way you'll get the dataframe. – Nk03 Jul 28 '21 at 18:28
  • Problem is, the apply function is returning a series and not a dataframe. So we cannot perform pd.concat() in the end. – Animeartist Jul 28 '21 at 18:33
  • `pd.concat` works on Series as well as DataFrames. – Henry Ecker Jul 28 '21 at 18:55
  • @Animeartist you can try `pd.concat(df.apply(foo, axis=1).tolist())` maybe – Ben.T Jul 28 '21 at 19:02
  • @Ben.T, Thank you so much for the answer. This is exactly what I was looking for! – Animeartist Jul 28 '21 at 19:39
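
For completeness, here is a minimal sketch of the approach from the last comments: collect the per-row dataframes and combine them with `pd.concat`. The two-row sample `df` and the simplified `foo()` are assumptions for illustration only. `ignore_index=True` is an optional addition (not part of the original comment) that renumbers the rows of the combined result.

```python
import pandas as pd

# Hypothetical input with two rows, to show that the pieces get stacked.
df = pd.DataFrame({"limit": [4, 2]})

def foo(row):
    # Simplified stand-in for the asker's function; it returns a DataFrame.
    limit = int(row["limit"])
    return pd.DataFrame({"A": list(range(limit)), "B": "Number"})

# apply(axis=1) yields a Series of DataFrames; tolist() turns it into a
# plain list that pd.concat can stack into a single DataFrame.
frames = df.apply(foo, axis=1).tolist()
new_df = pd.concat(frames, ignore_index=True)

print(new_df)
```

The list-based suggestion from the earlier comment amounts to the same thing: build `frames` with `[foo(row) for _, row in df.iterrows()]` and pass it to `pd.concat`.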