
I have a pandas DataFrame such as this:

col     value
0          1.28
1           4
2           9.34
3           13
4           15
5           23
6           35

When I run df.info(), the value column is reported as object. However, when I test it as follows:

A = df['value'].reset_index().applymap(lambda x: isinstance(x, (float)))
A[A['value']==False].shape[0]

I get zero rows. Some models also throw an error because of the data type, so what I do is force the column to float:

A['value'] = A['value'].astype('float')

Can someone explain how to investigate why the data type is not float, and what is wrong with my code for detecting whether the values are float?

user59419
  • `df['value']` is int64 not float; `df['col']` is float because of the `0.` value as your code snippet would show. It is not clear what your problem actually is. – user19077881 May 18 '23 at 16:24
  • Your example is not reproducible; I get all the rows. Please provide the DataFrame constructor. – mozway May 18 '23 at 17:11
  • @user19077881, I updated the post; the value column has a combination of fractional and integer values. – user59419 May 18 '23 at 19:11
  • @user19077881, sometimes a model expects your data to be numeric/float, and I thought the value column is – user59419 May 18 '23 at 19:15
  • That printout doesn't look right. Floats should be shown with a decimal point even if they're equal to integers. Either way, it's not clear how you made an object column out of floats in the first place; normally it'd be converted to Pandas float. So please make a [mre]. You might even find the problem yourself in the process. For specifics, see [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea May 18 '23 at 19:31
  • @wjandrea, This is what I am asking: how can you verify this? I thought this code would be able to tell me what is wrong: A = df['value'].reset_index().applymap(lambda x: isinstance(x, (float))) A[A['value']==False].shape[0]. I am looking for a procedure to identify the problematic rows in a dataframe that cause issues. – user59419 May 18 '23 at 19:55
  • @user59419 Oh, I just realized one thing I don't think you're aware: Pandas "object" dtype can hold any Python object, whether it's mixed str/int/float/whatever or all one type, i.e. float in this case. Normally when you create a column of all floats (or floats and ints for that matter), the column will get converted to a Pandas float dtype, but I guess that didn't happen in this case, like hypothetically, if you accidentally specified `dtype=object`. But without an MRE, we can only guess, and like I said, this example is inconsistent. – wjandrea May 18 '23 at 20:25
  • @wjandrea, I understand. My question is, how would you go about detecting what causes this problem? Is there any filter to apply that says these rows are not float? I thought what I have there tells me whether something is float or not. – user59419 May 18 '23 at 21:19
  • @user59419 Look at how the column is being created – wjandrea May 18 '23 at 21:58
  • Your 'value' column as shown, with mixed float and integer values, is of type 'float', as df.info() should show. The only way I can see that you get it to be type object is if one or more of the column's values is a string such as "1.2" or an empty string "". You need to look at how the DF was formed. – user19077881 May 19 '23 at 10:15
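
As a rough illustration of the comments above, here is a minimal sketch of one way to find which rows keep a column at object dtype. The data is hypothetical (the original DataFrame constructor was never posted): it assumes the column contains a numeric string and a non-numeric string, which is only one possible cause. The names `type_counts`, `not_float`, `converted` and `unparseable` are made up for this example.

```python
import pandas as pd

# Hypothetical data: an object column holding floats, ints and two strings.
# This is an assumption; the real DataFrame from the question is unknown.
df = pd.DataFrame({"value": [1.28, "4", 9.34, 13, "n/a", 23, 35]}, dtype=object)

# Count the Python types actually stored in the column.
type_counts = df["value"].map(type).value_counts()

# Rows whose stored value is not a Python float (ints and strings here).
not_float = df[~df["value"].map(lambda x: isinstance(x, float))]

# Try to parse every entry as a number; entries that fail become NaN.
converted = pd.to_numeric(df["value"], errors="coerce")

# Rows that were present but could not be parsed as numbers.
unparseable = df[converted.isna() & df["value"].notna()]

# Once the problematic rows are understood, keep the numeric version.
df["value"] = converted
```

Printing `type_counts` shows which types are mixed together, `not_float` lists every row that is not stored as a float, and `unparseable` narrows that down to the rows that `astype('float')` would likely fail on.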
