import pandas
import numpy
list_of_lines = [['a', 'b', 'c', 'd'],
                 ['a', 'b', 'c', 'd'],
                 ['a', 'b', 'c', 'd'],
                 ['a', 'b', 'c', 'd']]
df1 = pandas.DataFrame(list_of_lines)
# .ix has been removed from pandas; .loc does the same label-based assignment here
df1.loc[2, 0] = numpy.nan
df1.loc[2, 1] = numpy.nan
Is there a way to get the first non-NaN value from each row?
The way I am currently doing it is:
df1 = df1.bfill(axis=1)
# after back-filling along each row, column 0 holds the first non-NaN value
my_answer = df1.iloc[:, 0]
This gives me:
'a'
'a'
'c'
'a'
But it is very slow.
Is there some trick or native way of doing this without having to bfill? My dataframe is 1,500 columns x 10,000 rows and it takes a while to bfill.
There is a similar question here, but the answer given there is slower than my current method.
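For reference, here is a minimal sketch of the kind of non-bfill approach I have in mind. It drops to NumPy, builds a boolean mask of the non-NaN cells, and uses argmax to locate the first valid column in each row. The helper name first_valid_per_row is hypothetical, and I have not benchmarked this against the bfill version on the full-size frame, so treat it as an illustration rather than a confirmed speed-up.

```python
import numpy as np
import pandas as pd


def first_valid_per_row(df):
    """Return the first non-NaN value of each row (NaN if the whole row is NaN).

    Hypothetical helper for illustration; assumes df has at least one column.
    """
    values = df.to_numpy()
    # Boolean array: True wherever a value is present.
    mask = pd.notna(values)
    # argmax returns the position of the first True in each row
    # (and 0 for a row with no True at all).
    first_col = mask.argmax(axis=1)
    # Pick one cell per row. For an all-NaN row this picks column 0,
    # which is already NaN, so no special handling is needed.
    result = values[np.arange(values.shape[0]), first_col]
    return pd.Series(result, index=df.index)


# Same toy frame as in the question.
df1 = pd.DataFrame([['a', 'b', 'c', 'd']] * 4)
df1.loc[2, 0] = np.nan
df1.loc[2, 1] = np.nan

my_answer = first_valid_per_row(df1)
```

On the toy frame above this yields 'a', 'a', 'c', 'a', the same as the bfill approach, but it only builds one boolean mask instead of filling every cell.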