I want to merge my data into a single data frame of 1432 rows x 4 columns. After I used a for loop to filter all the data, the result came out as 4 separate outputs, each 1432 rows x 1 column. I want them merged into one table instead. How can I merge them?

My code and its output:

for ind,row in gf.iterrows():
    filter2 = savgol_filter(row, 31,3)
    hf = pd.DataFrame(filter2)
    hf.to_numpy()
  
    print(hf)

Output:

             0
0     0.901141
1     0.915138
2     0.928173
3     0.940281
4     0.951494
...        ...
1427  0.108484
1428  0.111043
1429  0.113958
1430  0.117230
1431  0.120859

[1432 rows x 1 columns]
             0
0     0.926357
1     0.940313
2     0.953292
3     0.965326
4     0.976451
...        ...
1427  0.108484
1428  0.111043
1429  0.113958
1430  0.117230
1431  0.120859

[1432 rows x 1 columns]
             0
0     0.926577
1     0.941009
2     0.954399
3     0.966784
4     0.978202
...        ...
1427  0.108484
1428  0.111043
1429  0.113958
1430  0.117230
1431  0.120859

[1432 rows x 1 columns]
             0
0     0.928050
1     0.942212
2     0.955387
3     0.967608
4     0.978907
...        ...
1427  0.108484
1428  0.111043
1429  0.113958
1430  0.117230
1431  0.120859
OCa
TCC
  • @OCa Hi, the data frame output looks similar to an array. There are 4 separate data frames. – TCC Jul 25 '23 at 12:13
  • @OCa There are 4 sets in gf, each 1432 rows x 1 column, but I want them in 1 set of 1432 rows x 4 columns. – TCC Jul 25 '23 at 12:57
  • @OCa I coded it following your suggestion. The output shows 1432 rows x 4 columns, but it is displayed 4 times and each display has the same values. As for filter2: I have 4 sets of noisy data, and my original data frame is 4 rows x 1432 columns. I need to smooth it by using savgol_filter in a for loop so that it smooths all 4 samples at once. Do you have any suggestion for merging them into one set? – TCC Jul 25 '23 at 13:33
  • @OCa I coded it like this, and now it merges all the columns into one dataset. However, it still displays 4 datasets with the same values. But it is more promising. for ind,row in gf.iterrows(): y=pd.concat([pd.DataFrame(savgol_filter(row, 31, 3)) for (ind, row) in gf.iterrows()],axis=1) display(y) – TCC Jul 25 '23 at 13:57
  • When I remove the loop, this error is displayed: y=pd.concat([pd.DataFrame(savgol_filter(row, 31, 3)) for (ind, row) in gf.iterrows()],axis=1) ^ IndentationError: unexpected indent – TCC Jul 25 '23 at 14:01
  • I am rather new too and made a mistake, maybe. Posting what you have tried in the *question* may be preferred. Not a problem I expect, but sorry about that. – OCa Jul 25 '23 at 14:13
  • Also, for page readability, we are invited to remove outdated comments. – OCa Jul 25 '23 at 14:34

1 Answer

Without knowing exactly what savgol_filter() does, my best guess is to rewrite your for loop as a list comprehension:

hf = pd.concat([pd.DataFrame(savgol_filter(row, 31, 3)) for (ind, row) in gf.iterrows()], axis=1)

Alternatively:

hf = pd.concat([pd.DataFrame(savgol_filter(gf.loc[ix], 31, 3)) for ix in gf.index], axis=1)

To make each row's index label ind the name of its column:

hf = pd.concat([pd.DataFrame(columns=[ind],
                             data=savgol_filter(row, 31, 3)) for (ind, row) in gf.iterrows()], axis=1)
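
For a version that can be run end to end, here is a minimal sketch. It assumes that savgol_filter is scipy.signal.savgol_filter, and it builds a synthetic gf (4 noisy samples x 1432 points, with made-up row labels s1 to s4) as a stand-in for the real data, which is not shown in the question.

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter  # assumption: this is the filter used in the question

# Hypothetical stand-in for gf: 4 noisy samples (rows) x 1432 points (columns).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 1432)
gf = pd.DataFrame(
    np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal((4, 1432)),
    index=["s1", "s2", "s3", "s4"],
)

# Build one filtered Series per row, named after the row label,
# then join them side by side into a single data frame.
hf = pd.concat(
    [
        pd.Series(savgol_filter(row.to_numpy(), 31, 3), name=ind)
        for ind, row in gf.iterrows()
    ],
    axis=1,
)

print(hf.shape)
print(list(hf.columns))
```

Note that the comprehension replaces the for loop entirely. Wrapping it in another `for ind, row in gf.iterrows():` would rebuild and display the same table once per row.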

Explanation

There are several flaws in your loop design:

  • Your for loop assigns a new data frame to the same variable at every iteration, which is probably not what you want. The comprehension instead collects the filtered rows in a list, which lets you concatenate them into one object at the end.
  • .to_numpy() does not look useful, since you want a data frame as the final product. Its result is also not assigned to a variable, so the call has no effect in the loop.
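
If you prefer to keep an explicit loop, the two points above suggest collecting the results inside the loop and building the data frame once, after it. This is only a sketch: it assumes savgol_filter comes from scipy.signal, and gf here is random placeholder data with the same 4 x 1432 shape as described in the question.

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter  # assumption: this is the filter used in the question

# Placeholder for gf: 4 rows (samples) x 1432 columns (points).
rng = np.random.default_rng(1)
gf = pd.DataFrame(rng.standard_normal((4, 1432)))

filtered = {}  # row label -> filtered 1-D array
for ind, row in gf.iterrows():
    # Store each result under its own key instead of overwriting one variable.
    filtered[ind] = savgol_filter(row.to_numpy(), 31, 3)

# Build the table once, after the loop: one column per original row.
hf = pd.DataFrame(filtered)
print(hf.shape)
```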

Depending on the nature of your savgol_filter function, simpler syntaxes may be possible.
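
For instance, if savgol_filter is scipy.signal.savgol_filter (an assumption, since the import is not shown in the question), it accepts an axis argument and can filter every row of a 2-D array in one call, with no loop at all. A sketch with placeholder data in place of the real gf:

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter  # assumption: this is the filter used in the question

# Placeholder for gf: 4 rows (samples) x 1432 columns (points).
rng = np.random.default_rng(2)
gf = pd.DataFrame(rng.standard_normal((4, 1432)))

# axis=1 runs the filter along each row, so all 4 samples are smoothed at once.
smoothed = savgol_filter(gf.to_numpy(), 31, 3, axis=1)  # shape (4, 1432)

# Restore the labels, then transpose to get 1432 rows x 4 columns.
hf = pd.DataFrame(smoothed, index=gf.index, columns=gf.columns).T
print(hf.shape)
```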

OCa