*I'm editing the df because it contained a typo in ne1_id.

I'm having a really hard time trying to solve the following and would really appreciate any help. I have a DataFrame df that looks like this:

timestamp user_id ne1_id ne2_id attempt_no
0 18:11:42.838363 1 100 1
1 18:11:42.838364 100 123456
2 18:11:42.838365 100 123456
3 18:11:42.838369 100 123456
4 18:11:45.838365 1 100 2
5 18:11:45.838366 100 321234
6 18:11:45.838369 100 321234
7 18:11:46.838363 3 12 3
8 18:11:46.838364 12 9832
9 18:11:47.838363 2 12 4
10 18:11:47.838369 100

What I want to do is fill in attempt_no for the empty cells (they are blank, not NaN) in the following rows, ordered by timestamp (or index), with the proper attempt_no, based on the user_id, ne1_id, and ne2_id associations. I can't see the logic for this or how to implement it.

The result should look something like this:

timestamp user_id ne1_id ne2_id attempt_no
0 18:11:42.838363 1 100 1
1 18:11:42.838364 100 123456 1
2 18:11:42.838365 100 123456
3 18:11:42.838369 100 123456
4 18:11:45.838365 1 100 2
5 18:11:45.838366 100 321234 2
6 18:11:45.838369 100 321234
7 18:11:46.838363 3 12 3
8 18:11:46.838364 12 9832 3
9 18:11:47.838363 2 12 4
10 18:11:47.838369 100 4

I need something that says the following: "find all the rows where there is a user_id, then find the next row with the same ne1_id and an empty user_id and attempt_no, and fill attempt_no with the attempt_no of the previous row". I tried groupby, which I believe is the way to do it, but I'm stuck there.
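One possible reading of the quoted rule can be sketched with shift instead of groupby. This is only a sketch under stated assumptions: the frame below is a hypothetical reconstruction of the first seven rows (the column alignment of the blank cells is assumed), blanks are treated as empty strings, and "next row" is taken to mean the immediately following row.

```python
import pandas as pd

# Hypothetical reconstruction of rows 0-6; which column each blank cell
# belongs to is an assumption, and blanks are modelled as empty strings.
df = pd.DataFrame({
    "user_id":    ["1", "", "", "", "1", "", ""],
    "ne1_id":     ["100", "100", "100", "100", "100", "100", "100"],
    "ne2_id":     ["", "123456", "123456", "123456", "", "321234", "321234"],
    "attempt_no": ["1", "", "", "", "2", "", ""],
})

# Each row's predecessor, with "" standing in for the missing row before row 0.
prev = df.shift(1, fill_value="")

# A row is a target when it has no user_id and no attempt_no, while the row
# directly before it has a user_id and the same ne1_id.
target = (
    df["user_id"].eq("")
    & df["attempt_no"].eq("")
    & prev["user_id"].ne("")
    & prev["ne1_id"].eq(df["ne1_id"])
)

# Copy attempt_no from the previous row into the target rows only.
df.loc[target, "attempt_no"] = prev.loc[target, "attempt_no"]
print(df)
```

Only the row directly after each user_id row is filled here, which matches the shape of the expected result for these rows.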

I'd appreciate any suggestions.

jpbrunori
  • df.attempt_no.mask(df.attempt_no.eq('')).fillna(method='ffill')?? – Nk03 May 30 '21 at 15:53
  • You haven't defined how attempts are associated. Currently it looks like forward fill attempt_no and reset the index. It's also unclear if those are spaces or NaN in the columns. Please provide your dataframe as a _copyable_ piece of code. See [MRE - Minimal, Reproducible, Example](https://stackoverflow.com/help/minimal-reproducible-example), and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888) for more information. – Henry Ecker May 30 '21 at 16:45
  • @HenryEcker thanks, I just edited the question with that context. Those are just spaces, not NaN. Either way, I have shared the proper ne2_id (network element #2), which would eventually be needed to associate all the columns with the proper attempt_no that I need. – jpbrunori May 30 '21 at 23:28
  • @Nk03 thanks pal, that was exactly what i was looking for – jpbrunori May 31 '21 at 00:38
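The mask-then-forward-fill idea from the first comment can be sketched as follows. The series is copied from the attempt_no column of the question, with blanks assumed to be empty strings (if they are spaces, strip them first). The sketch calls .ffill() rather than fillna(method='ffill'), since recent pandas versions deprecate the method argument.

```python
import pandas as pd

# attempt_no column from the question; blanks assumed to be empty strings.
attempt_no = pd.Series(["1", "", "", "", "2", "", "", "3", "", "4", ""])

# Turn blanks into NaN so pandas treats them as missing, then forward-fill.
filled = attempt_no.mask(attempt_no.eq("")).ffill()

# Variant: fill only the first blank after each value, leave the rest blank.
first_only = attempt_no.mask(attempt_no.eq("")).ffill(limit=1).fillna("")

print(filled.tolist())
print(first_only.tolist())
```

The plain forward fill populates every blank row, while the limit=1 variant reproduces the expected table in the question, where only the first row after each attempt is filled.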

1 Answer

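import numpy as np
import pandas as pd

def f(x):
    # Forward-fill: replace each NaN with the most recent non-NaN value.
    x = x.copy()  # work on a copy so the caller's column is not mutated
    last = np.nan  # nothing to carry forward until a value is seen
    for i in range(len(x)):
        if pd.isna(x.iloc[i]):
            x.iloc[i] = last
        else:
            last = x.iloc[i]
    return x

df = pd.DataFrame({'x': [1, None, None, 2, None, None, None, 3, None]})
df[['x']].apply(f)  # same result as df[['x']].ffill()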

With the default axis=0, apply passes each column to f as a whole Series, so the function can process the entire column at once.
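To make the axis behaviour concrete, here is a small sketch on a made-up two-column frame (the column names and values are illustrative only). It measures the length of whatever apply hands to the function under each axis setting.

```python
import pandas as pd

# Illustrative frame: 3 rows, 2 columns.
df = pd.DataFrame({"x": [1.0, None, 3.0], "y": [None, 2.0, None]})

# axis=0 (the default): the function receives one whole column at a time.
per_column = df.apply(len)

# axis=1: the function receives one row at a time instead.
per_row = df.apply(len, axis=1)

print(per_column)  # one entry per column, each of length 3
print(per_row)     # one entry per row, each of length 2
```

Because a forward fill needs to see earlier values in the same column, the column-wise call is the one that fits this problem.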

rudolfovic