I want to get the first and last valid value of each column, where valid means an integer or a float value (not NaN).

For example from the code below

    import pandas as pd
    from numpy import nan  # needed so that the bare `nan` below is defined
    
    # create DataFrame
    df = pd.DataFrame({'T1': [nan, 12, 15, 23, 19, 23, 25, 29, nan, nan, 0, nan, nan, 0],
                       'T2': [nan, nan, 7, 7, 9, 12, 9, 9, nan, 0, nan, nan, nan, nan],
                       'T3': [nan, nan, nan, nan, 11, 8, 10, 6, 6, 5, 9, 12, nan, nan]})
    
    
    #view DataFrame
    df
    
        T1     T2   T3
    0   NaN    NaN  NaN
    1   12     NaN  NaN
    2   15     7    NaN
    3   23     7    NaN
    4   19     9    11
    5   23     12   8
    6   25     9    10
    7   29     9    6
    8   NaN   NaN   6
    9   NaN    0    5
    10  0     NaN   9
    11  NaN   NaN   12
    12  NaN   NaN   NaN
    13  0     NaN   NaN

The output I want is the first and last valid value of each column:

  • T1: [12, 0]
  • T2: [7, 0]
  • T3: [11, 12]

This is just a sample data set. My actual DataFrame has 6000 rows, and I want the first and last value of each column while skipping the NaN entries. I also don't know the index of the first or last valid value in advance.

I have tried

  • df.iloc[-1,0]
  • df['T1'].iloc[0]

I also tried a few other approaches from Link1 and Link2, without success. Note that I want the first element, not the minimum value.

ThePyGuy
  • Please explain why the second value in `the first and last value of T1 thus - [12,0]` is `0`. The generated dataframe is totally different than what you have shown in the dataframe. – ThePyGuy Jun 21 '21 at 12:31
  • @Don'tAccept: Thanks for pointing this out. This was just a sample case, and I was not diligent in replicating it exactly; I just added values on the fly to show an example. I have corrected this now. – Himanshu patel Jun 21 '21 at 13:35

3 Answers

I am not sure this is the most efficient way to do it, but here is a simple one-liner that uses isna() on the column to skip the NaNs:

first, last = df.T1[~df.T1.isna()].values[[0, -1]]
Teshan Shanuka J

You can use ~df['T1'].isna() to select the rows where T1 is not NaN, then take the first and last of those rows:

df[~df['T1'].isna()].iloc[0, 0]   # first valid value of T1
df[~df['T1'].isna()].iloc[-1, 0]  # last valid value of T1

... et cetera
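
Since the question also says the index of the first and last valid value is not known, here is a small sketch that looks those labels up with first_valid_index() and last_valid_index(), both of which are existing pandas Series methods. The variable name bounds and the decision to skip all-NaN columns are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "T1": [np.nan, 12, 15, 23, 19, 23, 25, 29, np.nan, np.nan, 0, np.nan, np.nan, 0],
    "T2": [np.nan, np.nan, 7, 7, 9, 12, 9, 9, np.nan, 0, np.nan, np.nan, np.nan, np.nan],
    "T3": [np.nan, np.nan, np.nan, np.nan, 11, 8, 10, 6, 6, 5, 9, 12, np.nan, np.nan],
})

bounds = {}
for col in df.columns:
    start = df[col].first_valid_index()  # label of the first non-NaN entry
    end = df[col].last_valid_index()     # label of the last non-NaN entry
    if start is None:
        # Both methods return None when the column has no valid value;
        # skipping such columns is an assumption of this sketch.
        continue
    bounds[col] = {
        "index": (start, end),
        "values": [df.at[start, col], df.at[end, col]],
    }

print(bounds)
```

This keeps the row labels alongside the values, which may help when T1, T2 and T3 have to be matched against another column.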

Niels Henkens

Back-fill and then forward-fill the values (bfill, then ffill), and keep only the first and last rows:

result = df.bfill().ffill()[::df.shape[0]-1]

OUTPUT:

      T1   T2    T3
0   12.0  7.0  11.0
13   0.0  0.0  12.0

Now you can take individual values using iat:

result.iat[0,0], result.iat[-1,0]
#output:
(12.0, 0.0)

PS: iat is the accessor intended for reading a single value at a particular integer row and column position.

ThePyGuy
  • That seems like a nice workaround. It's just that I don't want to disturb the structure of the dataframe, as the NaN values correspond to another column. So in principle, T1, T2 and T3 are the x values for a Y value column. – Himanshu patel Jun 21 '21 at 13:33
  • @Himanshupatel, this doesn't actually change the original dataframe. – ThePyGuy Jun 21 '21 at 13:45
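
To illustrate that last point, the sketch below checks that bfill() and ffill() return new objects and leave df untouched. It uses iloc[[0, -1]] rather than the stride slice from the answer; that substitution is a choice made for this sketch (a step of df.shape[0]-1 would be 0 for a one-row frame), not something taken from the answer itself.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "T1": [np.nan, 12, 15, 23, 19, 23, 25, 29, np.nan, np.nan, 0, np.nan, np.nan, 0],
    "T2": [np.nan, np.nan, 7, 7, 9, 12, 9, 9, np.nan, 0, np.nan, np.nan, np.nan, np.nan],
    "T3": [np.nan, np.nan, np.nan, np.nan, 11, 8, 10, 6, 6, 5, 9, 12, np.nan, np.nan],
})

before = df.copy()  # snapshot to compare against afterwards

# bfill pulls the first valid value up into row 0, ffill pushes the last
# valid value down into the final row; both return new DataFrames.
result = df.bfill().ffill().iloc[[0, -1]]

unchanged = df.equals(before)  # equals() treats NaNs in the same place as equal
print(result)
print("original unchanged:", unchanged)
```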