Extract first element that is NOT NaN in each row of csv file

Question

I have a large CSV file, consisting of measurements, with over 75000 rows and over 200 columns. Each column has a locationId, a type of measurement (type A or B) and columns corresponding to the time and date it was measured with hourly intervals:

   locationId  type   2021-02-09 22  ...  2021-03-11 05  2021-03-11 06
0      60       A         NaN        ...        NaN          NaN
1      42       B         NaN        ...        NaN          NaN
2      58       A         NaN        ...        NaN          NaN
3      67       A         NaN        ...        NaN          NaN
4      53       B         NaN        ...        NaN          NaN

Most of the date entries in each row have NaN, but at least 1 column in each row has a valid numeric entry. I need to extract the first value that is NOT NaN in each row and present the result as:

   locationId  type   2021-0X-XX XX  
0      60       A         2.45        
1      42       B         5.32       
2      58       A         2.33        
3      67       A         2.21        
4      53       B         3.93

I've tried using Dataframe manipulation with iloc and loc, but these are mostly for manipulating values in same columns. I'm fairly new to python, someone told me dataframes would be a place to start, but this might be wrong I dont know..?My end result is to export a csv file with only 1 date column like above.

`df.set_index(['locationID','type']).stack()` should do the trick. or `pd.melt(df,id_vars=['locationId','type']).dropna()` — Umar.H, Mar 21 '21 at 22:52
Also look at [get-first-non-null-value-per-row](https://stackoverflow.com/questions/50004529/get-first-non-null-value-per-row) — Joe Ferndz, Mar 21 '21 at 23:32

Extract first element that is NOT NaN in each row of csv file

0 Answers0