I have a DataFrame given like this:

                     a     b      c       d
2014-02-10 23:30:00 25.1  NaN    NaN     NaN
2014-02-10 23:30:00 NaN   15.34  NaN     NaN
2014-02-10 23:30:00 NaN   NaN    123.54  NaN
2014-02-10 23:30:00 NaN   NaN    NaN     1.34

For each time step there are 4 rows, and each row holds exactly one value in a different column. All the other cells are NaN.

Is it possible to drop the NaN values and keep a single row with the 4 values per time step? Something like this:

                     a     b      c       d
2014-02-10 23:30:00 25.1  15.34  123.54  1.34

I've tried applying the solution from "Remove NaN 'Cells'" given by @unutbu, but without success:

import numpy as np
import pandas as pd
import functools

def drop_and_roll(col, na_position='last', fillvalue=np.nan):
    result = np.full(len(col), fillvalue, dtype=col.dtype)
    mask = col.notnull()
    N = mask.sum()
    if na_position == 'last':
        result[:N] = col.loc[mask]
    elif na_position == 'first':
        result[-N:] = col.loc[mask]
    else:
        raise ValueError('na_position {!r} unrecognized'.format(na_position))
    return result

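# Raw string keeps the regex escape intact; regex separators need the Python engine.
df = pd.read_table('data', sep=r'\s{2,}', engine='python')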

print(df.apply(functools.partial(drop_and_roll, fillvalue='')))
Michal

1 Answer

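You could just group by the index and call sum. Since sum skips NaN by default, each column reduces to its single non-NaN value for that timestamp: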

In [70]:
df.groupby(df.index).sum()

Out[70]:
                        a      b       c     d
2014-02-10 23:30:00  25.1  15.34  123.54  1.34
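
For anyone who wants to reproduce this without the original 'data' file, here is a minimal sketch that rebuilds the frame from the question by hand. The variable names (summed, summed_keep_nan, collapsed) are only illustrative. It also shows two variations that may be useful if the real data is less tidy than the sample: sum(min_count=1), which keeps NaN where a group has no value at all in a column, and first(), which picks the first non-null value per column without doing any arithmetic.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame: four rows share one timestamp,
# and each row carries a single non-NaN value.
idx = pd.to_datetime(['2014-02-10 23:30:00'] * 4)
df = pd.DataFrame(
    {
        'a': [25.1, np.nan, np.nan, np.nan],
        'b': [np.nan, 15.34, np.nan, np.nan],
        'c': [np.nan, np.nan, 123.54, np.nan],
        'd': [np.nan, np.nan, np.nan, 1.34],
    },
    index=idx,
)

# sum() skips NaN, so each column collapses to its one real value.
summed = df.groupby(df.index).sum()

# Assumption: a column could be entirely NaN for some timestamp.
# Plain sum() would report 0.0 there; min_count=1 keeps it as NaN.
summed_keep_nan = df.groupby(df.index).sum(min_count=1)

# first() returns the first non-null value per column in each group,
# so nothing is added together if a group ever holds two real values.
collapsed = df.groupby(df.index).first()

print(collapsed)
```

If a timestamp can carry more than one real value in the same column, sum() will add them together, while first() keeps only the earliest one, so which to use depends on what the duplicates mean in your data.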
EdChum