I want to check if a column in a dataframe contains strings. I would have thought this could be done just by checking dtype, but that isn't the case. A pandas series that contains strings just has dtype 'object', which is also used for other data structures (like lists):

import pandas as pd

df = pd.DataFrame({'a': [1,2,3], 'b': ['Hello', '1', '2'], 'c': [[1],[2],[3]]})
print(df['a'].dtype)
print(df['b'].dtype)
print(df['c'].dtype)

Produces:

int64
object
object

Is there some way of checking if a column contains only strings?

Kewl

2 Answers

You can use this to check whether every element in each column is a string:

df.applymap(type).eq(str).all()

a    False
b     True
c    False
dtype: bool
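
As a rough sketch, the same check can be written per column with Series.map, which is handy when only one column matters. The helper name is_string_column is hypothetical, not part of pandas. Note also that DataFrame.applymap is deprecated in recent pandas releases (2.1 and later) in favor of DataFrame.map, so the column-wise spelling below may be the safer one to rely on:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['Hello', '1', '2'], 'c': [[1], [2], [3]]})

def is_string_column(series):
    # Hypothetical helper: True only if every element is exactly of type str.
    return bool(series.map(type).eq(str).all())

# Check each column of the example frame from the question.
result = {col: is_string_column(df[col]) for col in df.columns}
print(result)
```

Keep in mind that a missing value stored as NaN is a float, so a column of strings with gaps would report False under this check.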

To check whether any element in each column is a string:

df.applymap(type).eq(str).any()

piRSquared

You could map each element to True or False depending on whether its type is str, then check whether the result contains any False values.

The example below tests a list that contains elements other than str. It returns True if an element of another type is present:

test = [1, 2, '3']
False in map((lambda x: type(x) == str), test)

Output: True
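
The same test can also be phrased with all() and isinstance(), which stops at the first non-string and, unlike the type(x) == str comparison, also accepts subclasses of str. This is only a minimal sketch; the function name only_strings is hypothetical, and it should work on any iterable, including a pandas column such as df['b']:

```python
def only_strings(values):
    # True when every element is a str (or an instance of a str subclass).
    return all(isinstance(x, str) for x in values)

print(only_strings([1, 2, '3']))           # mixed list from the example above
print(only_strings(['Hello', '1', '2']))   # values of column 'b' in the question
```

Note that the result is inverted relative to the snippet above: here True means the data contains only strings.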

David Bern