import pandas
import numpy

list_of_lines = [['a', 'b', 'c', 'd'],
                 ['a', 'b', 'c', 'd'],
                 ['a', 'b', 'c', 'd'],
                 ['a', 'b', 'c', 'd']]

df1 = pandas.DataFrame(list_of_lines)

df1.iloc[2, 0] = numpy.nan
df1.iloc[2, 1] = numpy.nan

Is there a way to get the first non-NaN value from each row?

The way I am currently doing it is:

df1 = df1.bfill(axis = 1)

my_answer = df1.iloc[:, 0]

This gives me:

'a'
'a'
'c'
'a'

But it is very slow.

Is there a trick or a native way to do this without calling bfill? My DataFrame has 1,500 columns and 10,000 rows, and bfill takes a while on it.

There is a similar question here but that answer is slower than my current method.
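
For reference, one direction that avoids bfill is to locate the first non-NaN position in each row with NumPy and pick the values by index. This is only a minimal sketch and has not been benchmarked against the 1,500 x 10,000 frame; the function name first_non_nan_per_row and the variables demo and first_values are hypothetical names used for illustration.

```python
import numpy as np
import pandas as pd


def first_non_nan_per_row(df):
    """Return a Series holding the first non-NaN value of each row.

    Rows that are entirely NaN yield NaN.
    """
    values = df.to_numpy()
    # Boolean array: True wherever a value is present.
    present = pd.notna(values)
    # argmax returns the position of the first True in each row
    # (and 0 for a row with no True at all).
    first_idx = present.argmax(axis=1)
    # Pick one element per row. For an all-NaN row this picks
    # column 0, which is NaN, so the result is still NaN.
    picked = values[np.arange(values.shape[0]), first_idx]
    return pd.Series(picked, index=df.index)


# Same toy data as in the question.
demo = pd.DataFrame([['a', 'b', 'c', 'd']] * 4)
demo.iloc[2, 0] = np.nan
demo.iloc[2, 1] = np.nan

first_values = first_non_nan_per_row(demo)
print(first_values.tolist())
```

Unlike bfill, this does not rewrite the whole frame; it only builds a boolean mask and indexes one value per row. Whether that is actually faster on real data is an assumption that would need timing.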

user1367204
