0

I want to replace values in Column1 in each Group with the value in Column1 where Column0 in that row begins with Date and Time :

df:

    Column0         Column1         Column2         Column3        Group 
1   Date and Time : 10/01/17,0900   NaN             NaN            A    * value to replace with
2   NaN             NaN             NaN             NaN            A
3   Name            NaN             NaN             NaN            A
                                    ...

1   NaN             NaN             02/06/17,1030   NaN            B
2   Date and Time : 02/06/17,1000   NaN             NaN            B    * value to replace with
3   Details         05/07/17,1330   NaN             NaN            B
4   NaN             01/08/17,1400   Date and Time : NaN            B
                                    ...

Expected output:

    Column0         Column1         Column2         Column3        Group 
1   Date and Time : 10/01/17,0900   NaN             NaN            A
2   NaN             10/01/17,0900   NaN             NaN            A
3   Name            10/01/17,0900   NaN             NaN            A
                                    ...

1   NaN             02/06/17,1000   02/06/17,1030   NaN            B
2   Date and Time : 02/06/17,1000   NaN             NaN            B
3   Details         02/06/17,1000   NaN             NaN            B
4   NaN             02/06/17,1000   Date and Time : NaN            B
                                    ...
nilsinelabore
  • 4,143
  • 17
  • 65
  • 122

1 Answers1

3

Select rows by condition Series.str.startswith first in boolean indexing, create Series by Group with select Column1 and last use Series.map:

s = (df[df['Column0'].str.startswith('Date and Time', na=False)]
       .set_index('Group')['Column1'])
df['Column1'] = df['Group'].map(s)
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 1
    Thanks jezrael, do you mind briefly explain your answer? Also can we please select the row where values "begins with" `Date and Time :` instead of `eq`(because there might be some noise)? – nilsinelabore Jul 09 '21 at 07:17
  • 1
    @nilsinelabore - You are right, I overlooked it. – jezrael Jul 09 '21 at 07:18
  • By `create Series` do you mean `.set_index('Group')`? Do you know any articles or learning resources that explain this technique? – nilsinelabore Jul 09 '21 at 07:35
  • 1
    @nilsinelabore - It means I need mapping by `Group`, so converted `Group` column to index and for `Series` is selected `Column1` - so `s` is Series possible use for mapping by originl column `Group`. This technique is used also if need matching by another DataFrame, like mentioned in [this](https://stackoverflow.com/q/53010406) answer. – jezrael Jul 09 '21 at 07:38
  • So strange I got error with `df[df['Column0'].str.startswith('Date and Time')]`: `ValueError: Cannot mask with non-boolean array containing NA / NaN values`. Any idea ? FYI, I just upgraded to Pandas 1.3.0. – SeaBean Jul 09 '21 at 07:42
  • 2
    @SeaBean - Ya, then need `na=False` parameter – jezrael Jul 09 '21 at 07:43