
My apologies if this question has been asked elsewhere; I couldn't find an answer.

I have a pandas series (that is, not a DataFrame) and I want to get the first n values.

  1. Define the series

    import numpy as np
    import pandas as pd
    
    my_s = pd.Series(['a', 'b', 'c', np.nan, 'f', 'l', 'd', 'a', 'a', np.nan])
    my_s
    
    #> 0      a
    #> 1      b
    #> 2      c
    #> 3    NaN
    #> 4      f
    #> 5      l
    #> 6      d
    #> 7      a
    #> 8      a
    #> 9    NaN
    #> dtype: object
    
  2. Set n

    n = 5
    
  3. Slice

    3.1 With [:n]

    my_s[:n]
    #> 0      a
    #> 1      b
    #> 2      c
    #> 3    NaN
    #> 4      f
    #> dtype: object
    

    3.2 With .loc[:n]

    my_s.loc[:n]
    #> 0      a
    #> 1      b
    #> 2      c
    #> 3    NaN
    #> 4      f
    #> 5      l <~~~~~~~~~ this one is included here but wasn't above
    #> dtype: object
    

How come 3.1 and 3.2 return different results? I googled it but could not find any relevant discussion of this.

Emman

1 Answer


A different example makes this easier to explain.

Example

    s1 = pd.Series(['a', 'b', 'c', 'd', 'e'], index=list('ABCDE'))
    s1

    A    a
    B    b
    C    c
    D    d
    E    e
    dtype: object

[ ] with an integer slice is positional slicing on a Series:

    s1[:3]

    A    a
    B    b
    C    c
    dtype: object

.loc is label-based slicing:

    s1.loc[:'D']

    A    a
    B    b
    C    c
    D    d
    dtype: object
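As a minimal sketch of the same contrast, the snippet below uses .iloc to make the positional slice explicit. The names by_position and by_label are only illustrative, and the series is assumed to hold string values.

```python
import pandas as pd

# String labels make it obvious which kind of slicing is in play.
s1 = pd.Series(['a', 'b', 'c', 'd', 'e'], index=list('ABCDE'))

# Positional slice: positions 0, 1, 2; the stop position 3 is excluded.
by_position = s1.iloc[:3]

# Label slice: labels 'A' through 'D'; the stop label 'D' is included.
by_label = s1.loc[:'D']

print(list(by_position.index))
print(list(by_label.index))
```

The positional slice follows the usual Python half-open convention, while the label slice is closed on both ends.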

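Positional slicing does not include the right endpoint; label slicing with .loc does.

    s1.iloc[3]

output: d

s1[:3] stops before position 3, so it does not include d. In your example the default index labels are 0 to 9, so they coincide with the positions: my_s[:n] takes positions 0 to n-1, while my_s.loc[:n] takes labels 0 through n.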
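Applied to the series from the question, this might look like the sketch below. It assumes the default integer index, and the relabelled copy named shifted is a hypothetical addition to show that the values play no role, only positions and labels do.

```python
import numpy as np
import pandas as pd

my_s = pd.Series(['a', 'b', 'c', np.nan, 'f', 'l', 'd', 'a', 'a', np.nan])
n = 5

# Two explicit ways to get the first n values by position.
first_n_iloc = my_s.iloc[:n]
first_n_head = my_s.head(n)

# Label-based: labels 0 through 5 inclusive, hence six rows.
by_label = my_s.loc[:n]

# Relabel the same values 10..19 (hypothetical, for illustration only).
shifted = my_s.copy()
shifted.index = range(10, 20)

shifted_by_position = shifted.iloc[:n]  # still the first five rows
shifted_by_label = shifted.loc[:n]      # no label is <= 5, so this is empty

print(len(first_n_iloc), len(by_label))
print(list(shifted_by_position.index), len(shifted_by_label))
```

If the goal is simply "the first n values", .iloc[:n] or .head(n) states that intent without depending on what the index labels happen to be.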

Panda Kim
  • Thanks. I kinda understand but not fully. In my example, `my_s[:n]` is slicing based on what? I don't have numbers in the values. I have letters. – Emman Nov 20 '22 at 08:46
  • I think there is a misunderstanding, so I have edited my answer. What matters is the position (`location`) of each entry in the index; the values don't matter. – Panda Kim Nov 20 '22 at 08:57