I have 2 data frames with 25 columns. I am trying to get the distributions for each column in both data frames, for a comparative study.

I do something like this:

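# Name the key and count columns explicitly so that 'index' exists on any pandas version.
count1 = df1[col].value_counts().rename_axis('index').reset_index(name='count1')
count2 = df2[col].value_counts().rename_axis('index').reset_index(name='count2')
merged = count1.merge(count2, how='outer', on='index')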

Some columns contain lists instead of strings. I want to convert those to strings and then do the steps above.

df1[col+'_str']=df1[col].str.join(' ') 
df2[col+'_str']=df2[col].str.join(' ') 

The problem is that I don't know in advance which columns contain lists. Is there a way to check whether a column holds lists or strings?

I tried this:

if isinstance(df1[col].iloc[0], list):

But list columns that have no value in the 0th row will slip past this test!

How can I find out the type of contents in a dataframe column?

I referred to this SO question, but couldn't use much: SO question

IanS
pnv

3 Answers

You can test the first 10 values (for instance) like this:

df1[col].head(10).apply(lambda v: isinstance(v, list)).any()

This will be true if any value in the first 10 is a list.
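
To cope with the missing values mentioned in the question, one option is to drop nulls before sampling. The following is only a sketch: the helper name column_has_lists, the sample size n, and the example data are all assumptions made for illustration.

```python
import pandas as pd

def column_has_lists(series, n=10):
    """Return True if any of the first n non-null values is a list."""
    # Drop missing values first so that empty leading rows cannot hide a list column.
    sample = series.dropna().head(n)
    return bool(sample.apply(lambda v: isinstance(v, list)).any())

# Hypothetical data: 'tags' holds lists but is empty in row 0.
df1 = pd.DataFrame({
    "tags": [None, ["a", "b"], ["c"]],
    "name": ["x", None, "z"],
})

list_cols = [col for col in df1.columns if column_has_lists(df1[col])]
print(list_cols)
```

Sampling keeps the check cheap on large frames, at the cost of missing a column whose first n non-null values happen not to be lists.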

IanS

You can select the columns with dtype object (strings, lists, ...):

df_obj = df.select_dtypes(include=[object])

and then try something like:

def myfunction(value):
    if isinstance(value, list):
        return ' '.join(value)
    else:
        return value

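# Map the function over each value; DataFrame.apply alone would pass whole columns to it.
df_str = df_obj.apply(lambda column: column.map(myfunction))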
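
As a rough end-to-end sketch of how a value-by-value conversion could feed the comparison described in the question: the data, the helper names to_text and normalise, and the use of pd.concat in place of the question's merge are all assumptions made for illustration.

```python
import pandas as pd

def to_text(value):
    # Join list entries with spaces; leave every other value untouched.
    if isinstance(value, list):
        return " ".join(map(str, value))
    return value

def normalise(df):
    # Convert values in object-dtype columns only; numeric columns are skipped.
    out = df.copy()
    for col in out.select_dtypes(include=[object]).columns:
        out[col] = out[col].map(to_text)
    return out

# Hypothetical frames with one list column each.
df1 = pd.DataFrame({"tags": [["a", "b"], ["c"], ["a", "b"]], "n": [1, 2, 3]})
df2 = pd.DataFrame({"tags": [["c"], ["c"], ["d"]], "n": [4, 5, 6]})

count1 = normalise(df1)["tags"].value_counts().rename("df1")
count2 = normalise(df2)["tags"].value_counts().rename("df2")

# Aligning on the index acts like an outer join: values absent from one frame show NaN.
merged = pd.concat([count1, count2], axis=1)
print(merged)
```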
Maarten Fabré
  • This is a nice flexible solution, but it would be slower if you know in advance that columns are of a single type, i.e. string or list. – IanS Sep 28 '17 at 08:37

If you want to know whether any value in a column is a list, you can apply is_list_like to the column and call any on the resulting boolean Series:

from pandas.api.types import is_list_like

df[column].apply(is_list_like).any()

This returns True if any value in the column is list-like. That includes lists, but also tuples, sets and NumPy arrays; strings do not count as list-like.
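
A small illustration of how is_list_like differs from a strict isinstance check, using made-up values:

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_list_like

# Hypothetical column mixing a list, a tuple, a string and a missing value.
values = pd.Series([["a", "b"], ("a", "b"), "ab", np.nan])

list_like = values.apply(is_list_like)                 # accepts the list and the tuple
strict = values.apply(lambda v: isinstance(v, list))   # accepts the list only

print(list_like.tolist())
print(strict.tolist())
```

If tuples or arrays in the column should not be treated as lists, the isinstance version is the safer test.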

Helton Wernik