Importing a CSV with grouped data into a Pandas data frame

Question

When I import my data file with Pandas, I get the following data frame:

   product  feature_1  feature_2
0        a         11         12
1      NaN         13         14
2      NaN         15         16
3      NaN         17         18
4      NaN         19         20
5        b         21         22
6      NaN         23         24
7      NaN         25         26
8        c         27         28
9      NaN         29         30
10     NaN         31         32

What I need to do is replace each NaN with the nearest non-NaN value above it, so that I get the following data frame:

   product  feature_1  feature_2
0        a         11         12
1        a         13         14
2        a         15         16
3        a         17         18
4        a         19         20
5        b         21         22
6        b         23         24
7        b         25         26
8        c         27         28
9        c         29         30
10       c         31         32

What I did (see gist for code and datafile):

Import my data into a list of dicts
Iterate through the list and make the modifications
Import the list into a data frame

How can I do this directly in Pandas, without preprocessing the data as a list first?

You can just do `df['product'] = df['product'].ffill()`, however, if you want to get back to the grouped df, you can pass the ordinal position of the multi-index: `pd.read_csv(your_file_path, index_col=[0,1])`. So are you wanting to get back a multi-index df? — EdChum, Aug 09 '18 at 09:56
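For the multi-index case raised in the comment above, one explicit route is to forward-fill the group column and then move it into the index. This is only a minimal sketch: the small frame is made-up sample data in the shape of the question, and pairing `product` with `feature_1` as index levels is an assumption made for illustration, not something the original file is known to require.

```python
import pandas as pd

# Made-up sample in the shape of the question's data (not the real file).
df = pd.DataFrame({
    "product": ["a", None, None, "b", None],
    "feature_1": [11, 13, 15, 21, 23],
    "feature_2": [12, 14, 16, 22, 24],
})

# Fill each missing product with the last value seen above it.
df["product"] = df["product"].ffill()

# Assumption: product and feature_1 together form the row index.
grouped = df.set_index(["product", "feature_1"])

# Select all rows belonging to product "a".
print(grouped.loc["a"])
```

With the index in place, `grouped.loc["a"]` selects a whole group without repeating the product name on every row.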

score 2 · Accepted Answer · answered Aug 09 '18 at 09:56

You can use pd.Series.ffill to avoid dictionary conversion and manual iteration:

df['product'] = df['product'].ffill()

print(df)

   product  feature_1  feature_2
0        a         11         12
1        a         13         14
2        a         15         16
3        a         17         18
4        a         19         20
5        b         21         22
6        b         23         24
7        b         25         26
8        c         27         28
9        c         29         30
10       c         31         32
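To try this end to end without the original data file, here is a small self-contained sketch. The CSV text is a hypothetical stand-in for the file in the question, and it assumes the product name is simply left blank on the continuation rows of each group.

```python
import io

import pandas as pd

# Hypothetical stand-in for the data file: the product name appears only on
# the first row of each group; the remaining rows leave that field blank.
csv_text = """product,feature_1,feature_2
a,11,12
,13,14
,15,16
b,21,22
,23,24
c,27,28
,29,30
"""

# Blank fields are parsed as NaN by default.
df = pd.read_csv(io.StringIO(csv_text))

# Forward-fill: each NaN takes the last non-NaN value above it.
df["product"] = df["product"].ffill()

print(df)
```

Assigning the result back to the column, rather than calling `ffill(inplace=True)` on the selected column, avoids depending on whether `df['product']` returns a view or a copy.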

jpp

The OP may actually want to get back to the multi-index df, which I commented on so we'll see if this is the case. If so we should find a different dupe – EdChum Aug 09 '18 at 09:59
@EdChum, Got it, feel free to reopen if that's the case :) – jpp Aug 09 '18 at 09:59