I have the following snippet of code, which raises warnings after updating to pandas==1.4.3 (DataFrame.append is deprecated as of pandas 1.4 and emits a FutureWarning recommending pandas.concat):

        df_phys = pd.DataFrame()
        for db in self.db_list:
            df_decoder = can_decoder.DataFrameDecoder(db)

            df_phys_temp = pd.DataFrame()
            for length, group in df_raw.groupby("DataLength"):
                df_phys_group = df_decoder.decode_frame(group)
                df_phys_temp = df_phys_temp.append(df_phys_group)

            df_phys = df_phys.append(df_phys_temp.sort_index())

Can you clarify how I can transition to using concat within a loop as above?

mfcss
  • "how I can transition to using concat within a loop" - don't use it inside the loop. Use it once, after the loop. – user2357112 Sep 07 '22 at 08:49

2 Answers

The approach is to collect the DataFrames in a list and then concatenate them all in one step after the for loop. Here is a runnable example:

import pandas as pd
import numpy as np

df_list = []
for _ in range(5):
    df_list.append(pd.DataFrame(np.random.randint(0, 100, size=(2, 2)), columns=['A', 'B']))

df = pd.concat(df_list, ignore_index=True)

In your example, you would do this twice: once for the inner for loop and once for the outer one.

As described in the pandas documentation, this performs better than doing lots of appends (or lots of concats), because every append or concat call copies all of the data accumulated so far into a new DataFrame.
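Applied to the nested loop from the question, the pattern could look roughly like the sketch below. Since `can_decoder` and the real `df_raw` are not available here, `decode_frame` is a hypothetical stand-in for `df_decoder.decode_frame`, `scales` stands in for `self.db_list`, and the sample data is made up; only the list-then-concat structure is the point.

```python
import pandas as pd


def decode_frame(group: pd.DataFrame, scale: float) -> pd.DataFrame:
    # Hypothetical stand-in for df_decoder.decode_frame(group).
    return pd.DataFrame({"value": group["raw"] * scale}, index=group.index)


# Made-up raw data with a deliberately unsorted index.
df_raw = pd.DataFrame(
    {"DataLength": [8, 4, 8, 4], "raw": [1, 2, 3, 4]},
    index=[3, 0, 1, 2],
)
scales = [1.0, 10.0]  # stands in for self.db_list

outer_frames = []
for scale in scales:
    inner_frames = []
    for _length, group in df_raw.groupby("DataLength"):
        inner_frames.append(decode_frame(group, scale))
    # One concat per outer iteration; the index is kept so sort_index() has an effect.
    outer_frames.append(pd.concat(inner_frames).sort_index())

# One final concat after the outer loop.
df_phys = pd.concat(outer_frames)
```

One caveat: `pd.concat` raises a `ValueError` when given an empty list, so if a loop may produce no frames at all, that case needs a guard.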

Simon

In principle, the following snippet should fix the warning by replacing each append call with an equivalent pd.concat call:

        df_phys = pd.DataFrame()
        for db in self.db_list:
            df_decoder = can_decoder.DataFrameDecoder(db)

            df_phys_temp = pd.DataFrame()
            for length, group in df_raw.groupby("DataLength"):
                df_phys_group = df_decoder.decode_frame(group)
                # No ignore_index here: like append, keep the index so sort_index() still has an effect
                df_phys_temp = pd.concat([df_phys_temp, df_phys_group])

            df_phys = pd.concat([df_phys, df_phys_temp.sort_index()])
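
Whether to pass `ignore_index=True` in a drop-in replacement like this matters when the index carries information. `DataFrame.append` kept the existing index by default, and `pd.concat` does the same unless `ignore_index=True` is set, in which case the result gets a fresh 0..n-1 index and a later `sort_index()` no longer reorders anything. A minimal sketch with made-up data (the frames `a` and `b` are purely illustrative):

```python
import pandas as pd

a = pd.DataFrame({"value": [1.0, 2.0]}, index=[10, 30])
b = pd.DataFrame({"value": [3.0]}, index=[20])

kept = pd.concat([a, b])                      # index labels: 10, 30, 20
reset = pd.concat([a, b], ignore_index=True)  # index labels: 0, 1, 2

sorted_kept = kept.sort_index()    # rows reordered by the original labels
sorted_reset = reset.sort_index()  # index is already sorted, row order unchanged
```

If the decoded frames are indexed by something meaningful such as a timestamp, keeping the index is probably the closer match to the original append-based code.
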
mgross