I have a large CSV file, consisting of measurements, with over 75000 rows and over 200 columns. Each column has a locationId, a type of measurement (type A or B) and columns corresponding to the time and date it was measured with hourly intervals:
locationId type 2021-02-09 22 ... 2021-03-11 05 2021-03-11 06
0 60 A NaN ... NaN NaN
1 42 B NaN ... NaN NaN
2 58 A NaN ... NaN NaN
3 67 A NaN ... NaN NaN
4 53 B NaN ... NaN NaN
Most of the date entries in each row have NaN, but at least 1 column in each row has a valid numeric entry. I need to extract the first value that is NOT NaN in each row and present the result as:
locationId type 2021-0X-XX XX
0 60 A 2.45
1 42 B 5.32
2 58 A 2.33
3 67 A 2.21
4 53 B 3.93
I've tried using Dataframe manipulation with iloc and loc, but these are mostly for manipulating values in same columns. I'm fairly new to python, someone told me dataframes would be a place to start, but this might be wrong I dont know..?My end result is to export a csv file with only 1 date column like above.