Suppose I have the following pandas series:

import pandas as pd

s = pd.Series([1, '1', 1.0])
s.dtype
dtype('O')

When I check the types of each element I get the following:

type(s.iloc[0])
<class 'int'>

type(s.iloc[1])
<class 'str'>

type(s.iloc[2])
<class 'float'>

Is there a way to slice the pandas series based on the type of the elements?

Something like the following:

mask = s.types == 'str' # this doesn't exist

s[mask]
1    1
dtype: object

Ideally, I want something that doesn't loop over the elements (as apply does).

Bruno Mello

2 Answers

We can use Series.map + type:

s[s.map(type).eq(str)]

I think this should be faster than apply with a lambda, since the mapped function is simpler.
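As a minimal sketch of the Series.map approach above, using the series from the question: the element types are computed once and the resulting mask is then reused. The variable names (`types`, `strings_only`, `numbers_only`) are illustrative and not part of the original answer.

```python
import pandas as pd

s = pd.Series([1, '1', 1.0])

# Map each element to its Python type, giving a Series of type objects.
types = s.map(type)

# Compare once against str and use the boolean mask to select.
strings_only = s[types.eq(str)]

# The same type Series can be reused to select several types at once.
numbers_only = s[types.eq(int) | types.eq(float)]

print(strings_only)
print(numbers_only)
```

Note that comparing `type(x)` to `str` is an exact match, so instances of subclasses of `str` would not be selected; `isinstance` would match them.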

ansev
  • Does map use loops? – Bruno Mello Apr 13 '20 at 21:01
  • Yes, but it really is necessary here because there is no pandas method to do this – ansev Apr 13 '20 at 21:08
  • See https://stackoverflow.com/questions/43191832/checking-if-a-data-series-is-strings – ansev Apr 13 '20 at 21:15
  • This uses a loop only to get the type of each element in the series. The comparison is then done with `Series.eq` and `boolean indexing`, which I think is better because it avoids doing the check inside the loop – ansev Apr 13 '20 at 21:19

You can use a lambda function through Series.apply in combination with isinstance:

s.loc[s.apply(lambda x: isinstance(x, str))]
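
One practical difference from an exact type comparison is that isinstance also matches subclasses and accepts a tuple of types. A small sketch, assuming a slightly extended sample series (the extra `True` element is not in the question and is added only to illustrate the subclass case):

```python
import pandas as pd

# Hypothetical sample data; True is added to show the subclass case.
s = pd.Series([1, '1', 1.0, True])

# isinstance accepts a tuple of types and also matches subclasses,
# so True (a bool, which subclasses int) is selected here.
by_isinstance = s.loc[s.apply(lambda x: isinstance(x, (int, float)))]

# An exact type comparison excludes subclasses such as bool.
by_exact_type = s.loc[s.apply(lambda x: type(x) in (int, float))]

print(by_isinstance.index.tolist())
print(by_exact_type.index.tolist())
```

Which behaviour is wanted depends on the data, so this is a design choice rather than a correction.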
jfaccioni
  • Sorry, didn't mention it in the question but I was searching for something vectorized, because if I have a lot of rows that would be slow – Bruno Mello Apr 13 '20 at 20:59
  • I don't think there's any out-of-the-box pandas method for this. Most will still apply a for loop in the background. You could always use numpy's [vectorize](https://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html) to make it more readable (`vec_isinstance = np.vectorize(isinstance); s.loc[vec_isinstance(s, str)]`), but even that is implemented as a for loop. – jfaccioni Apr 13 '20 at 21:18
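
The `np.vectorize` suggestion from the last comment can be sketched as follows, next to a plain list comprehension that builds the same mask. This is an illustration only: converting with `to_numpy()` first is an assumption made here for explicitness, and neither variant avoids a Python-level loop over the elements.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, '1', 1.0])

# np.vectorize wraps isinstance so it can be called on an array;
# it is a convenience wrapper and still loops element by element.
vec_isinstance = np.vectorize(isinstance)
mask = vec_isinstance(s.to_numpy(), str)
result = s.loc[mask]

# An explicit list comprehension produces the same boolean mask.
list_mask = [isinstance(x, str) for x in s]
result_from_list = s.loc[list_mask]

print(result)
```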