
I have a dataframe called 'running_tally'

        list   jan_to  jan_from
0         LA    True      False
1         NY   False       True

I am trying to append new data to it in the form of a single column dataframe called 'new_data'

        list   
0        HOU
1         LA

I concat these two dfs based on their 'list' column for further processing, but immediately after I do that all the boolean values unexpectedly flip.

running_tally = pd.concat([running_tally,new_data]).groupby('list',as_index=False).first()

the above statement will produce:

        list   jan_to  jan_from
0         LA    False      True
1         NY     True     False
2        HOU     NaN        NaN

NaN values are expected for the new row, but I don't know why the bools all flip. What could be the reason for this? The code logically makes sense to me so I'm not sure where I'm going wrong. Thanks

EDIT: I made an edit to 'new_data' to include a repeat with LA. The final output should not have repeats which my code currently handles correctly, just has boolean flipping

EDIT 2: Turns out that when concatenating, the columns would flip in order leading me to believe the bools flipped. Still an open issue however

thePandasFriend
  • 1
    `df1.merge(df2, on='list', how='left')`? – Quang Hoang Jun 30 '20 at 20:18
  • 2
    I don't get the booleans flipped with your code... Are you doing something more maybe? – MrNobody33 Jun 30 '20 at 20:25
  • @QuangHoang the booleans stay the same for your suggestion but the df1 does not include the row for HOU – thePandasFriend Jun 30 '20 at 20:25
  • @MrNobody33 I do other things before and after, but I narrowed down the issue to that particular concat snippet of code since I print running_tally before and after. Printing before concat, all bools are correct, printing after appends HOU but all bools are flipped – thePandasFriend Jun 30 '20 at 20:32
  • 1
    I cannot reproduce this. It seems you removed something that is critical for this issue – Marat Jun 30 '20 at 20:33
  • 1
    I can't reproduce the bool flipping either, but at one point I thought I had because the column order changed. Just to be sure, is that not what is happening to you? – Ariane Jun 30 '20 at 20:38
  • I will sketch something up in another environment to see if I can reproduce and will post the full code – thePandasFriend Jun 30 '20 at 20:41
  • Turns out using concat would flip the column order because it was automatically sorting. I am looking for a solution now. https://stackoverflow.com/questions/62682582/how-to-concat-pandas-dataframes-without-changing-the-column-order-in-pandas-0-20 – thePandasFriend Jul 01 '20 at 17:51

3 Answers

1

I am not sure why you want to use a groupby in this case... when using concat there is no need to specify which columns you want to use, as long as their names are identical. Simple concatenation like this should do:

running_tally = pd.concat([running_tally,new_data], ignore_index=True, sort=False)

EDIT to take question edit into account: this should do the same job, without duplicates.

running_tally = running_tally.merge(new_data, on="list", how="outer")
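
For reference, a minimal sketch of the outer merge using the sample frames from the question (the frame contents are copied from the post; the variable name `merged` is only illustrative). It assumes a reasonably recent pandas; the row order of an outer merge can differ between versions, so the sketch does not rely on it.

```python
import pandas as pd

# Sample data reconstructed from the question
running_tally = pd.DataFrame({
    "list": ["LA", "NY"],
    "jan_to": [True, False],
    "jan_from": [False, True],
})
new_data = pd.DataFrame({"list": ["HOU", "LA"]})

# Outer merge on the shared key: LA is matched rather than duplicated,
# HOU is added with NaN in the boolean columns
merged = running_tally.merge(new_data, on="list", how="outer")
print(merged)
```

The left frame's columns keep their original order, so the booleans stay under the headers they started with.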
Ariane
1

I don't get the booleans flipped as you do, but you can try this too:

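# Note: DataFrame.append was removed in pandas 2.0; on newer versions use
# pd.concat([running_tally, new_data], ignore_index=True) instead.
running_tally = running_tally.append(new_data, ignore_index=True)
print(running_tally)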

Output:

  list jan_to jan_from
0   LA   True    False
1   NY  False     True
2  HOU    NaN      NaN

EDIT: Since the question was edited, you could try with:

running_tally=running_tally.append(new_data,ignore_index=True).groupby('list',as_index=False).first()
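
A hedged sketch of the same idea for newer pandas, where `DataFrame.append` no longer exists: it swaps in `pd.concat` and passes `sort=False` to `groupby` so the groups stay in first-seen order. The names `combined` and `deduped` are only illustrative, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data reconstructed from the question
running_tally = pd.DataFrame({
    "list": ["LA", "NY"],
    "jan_to": [True, False],
    "jan_from": [False, True],
})
new_data = pd.DataFrame({"list": ["HOU", "LA"]})

# Stack the rows; the second LA row has NaN in the boolean columns
combined = pd.concat([running_tally, new_data], ignore_index=True, sort=False)

# first() takes the first non-null value per column within each group,
# so the duplicate LA row collapses into the original one
deduped = combined.groupby("list", as_index=False, sort=False).first()
print(deduped)
```

Because `first()` skips nulls, the all-NaN duplicate of LA never overwrites the existing True/False values.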
MrNobody33
0

The actual column order was being flipped when using concat in pandas 0.20.1, which made it look as though the booleans had flipped.

How to concat pandas Dataframes without changing the column order in Pandas 0.20.1?
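
One way to sidestep the issue, sketched here under the assumption that the original column order of `running_tally` is the one to keep: reindex the result to that order after concatenating. As far as I know the `sort` argument of `pd.concat` only arrived in pandas 0.23, so this sketch avoids it. The name `restored` is only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question
running_tally = pd.DataFrame({
    "list": ["LA", "NY"],
    "jan_to": [True, False],
    "jan_from": [False, True],
})
new_data = pd.DataFrame({"list": ["HOU", "LA"]})

# Remember the column order before concatenating
original_columns = list(running_tally.columns)

# Concatenate, then put the columns back in the remembered order;
# on versions that do not reorder columns this is a harmless no-op
restored = pd.concat([running_tally, new_data], ignore_index=True)
restored = restored.reindex(columns=original_columns)
print(restored)
```

The values are never touched; only the column labels are put back in place, which is enough to make the before and after printouts line up.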

thePandasFriend