Need to fetch column name based on first non NaN value, and return that column name in a new column

Question

I have 12 columns, labeled 1-12, that represent months. Some columns have a data reading and others are blank (NaN). I need a new column that shows the first month with a reading, and another column that shows the last month with a reading.

So far I have tried `df['df_initial_month'] = first_valid_index()`. The very first column holds IDs, which I would like to skip. When I run the code, the new column just shows the first ID in every row.

I have also tried `df.ffill(axis=1).iloc[:,0]`.

From the pandas tag's wiki on this site: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples — Paul H, Jul 13 '20 at 19:13
I think you need `groupby.agg` but _please_ provide some sample data see [ask] and [mcve] — Umar.H, Jul 13 '20 at 19:37

score 0 · Answer 1 · answered Jul 13 '20 at 19:57

Define `column_list` as a list of the names of the 12 month columns, then try the following: `df['df_initial_month'] = df[column_list].apply(pd.DataFrame.first_valid_index, axis=1)`. The same pattern with `last_valid_index` gives the last month with a reading.
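As a minimal sketch of the approach above, the following builds a small frame and derives both the first and the last month with a reading. The sample data, the `ID` and `notes` columns, and the name `df_last_month` are assumptions made up for illustration; the month labels are taken to be integers, and `pd.Series.first_valid_index` is used because each row arrives in `apply` as a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: month columns labeled with the integers 1-12.
month_cols = list(range(1, 13))
df = pd.DataFrame(np.nan, index=range(3), columns=month_cols)
df.loc[0, [3, 4, 7]] = [1.5, 2.0, 0.5]  # readings in months 3, 4 and 7
df.loc[1, [1, 12]] = [4.0, 9.0]         # readings in months 1 and 12
# Row 2 is left without any reading.

# Hypothetical extra columns before and after the months; they are skipped
# below because only the month columns are selected.
df.insert(0, "ID", ["a", "b", "c"])
df["notes"] = "x"

# Restrict to the month columns, then ask each row for the label of its
# first and last non-NaN value.
months = df[month_cols]
df["df_initial_month"] = months.apply(pd.Series.first_valid_index, axis=1)
df["df_last_month"] = months.apply(pd.Series.last_valid_index, axis=1)

print(df[["ID", "df_initial_month", "df_last_month"]])
```

Selecting `df[month_cols]` first is what keeps the ID column and any trailing columns out of the result. A row with no readings at all ends up with a missing value in both new columns.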

Varsha Kishore

I tried `column_list = ['1', '2','3','4','5','6','7','8','9','10','11','12']` followed by `df['bdf_initial_month'] = df[column_list].apply(pd.DataFrame.first_valid_index, axis=1)`, but I end up getting the error: "None of [Index(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12'], dtype='object', name='df_month')] are in the [columns]" – Nick Vandegriffe Jul 13 '20 at 20:11
Are you sure that the names of the columns are ['1', '2','3','4','5','6','7','8','9','10','11','12']? You can check by running `list(df.columns)`. – Varsha Kishore Jul 13 '20 at 20:36
Yes. I got the first_valid_index to work for the column! I used: df = df.set_index([1]) and then ran the df.apply(pd.DataFrame...) However, I am not able to get the last month of a reading because I have a bunch of columns on the end, and was not able to specify the column range – Nick Vandegriffe Jul 13 '20 at 21:21
What's the output of `list(df.columns)`? – Varsha Kishore Jul 13 '20 at 22:09
[1,2,3,4,5..etc.] None of them are in quotes, though. I don't know if that makes a difference – Nick Vandegriffe Jul 14 '20 at 01:11
It does. Try using column_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]. – Varsha Kishore Jul 14 '20 at 01:27
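The mismatch discussed in these comments can be reproduced with a short sketch. The three-column frame below is a made-up example; it only assumes, as reported above, that the real column labels are integers rather than strings.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose columns are labeled with integers.
df = pd.DataFrame([[np.nan, 2.0, 3.0]], columns=[1, 2, 3])

# String labels do not match integer labels, so this selection fails.
try:
    df[["1", "2", "3"]]
    matched = True
except KeyError:
    matched = False

# Option 1: select with integer labels.
by_int = df[[1, 2, 3]]

# Option 2: convert the labels to strings once, then select with strings.
df_str = df.copy()
df_str.columns = df_str.columns.astype(str)
by_str = df_str[["1", "2", "3"]]
```

Either option works; which one fits depends on whether the rest of the code expects integer or string labels.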