
I know that `df.info()` (and other functions) give me the dtype of each column. However, pandas stores a lot of things (e.g. strings) as `object`, and I also see that columns containing NaN/True/False are shown as `object` instead of `bool`.

How do I get the real data type of each column instead of a big bunch of `object`s?
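For concreteness, a tiny made-up frame like this reproduces what I mean:

```python
import pandas as pd
import numpy as np

# Made-up example: booleans mixed with NaN end up as dtype "object",
# and plain Python strings are stored as "object" as well.
df = pd.DataFrame({
    "flag": [True, False, np.nan],
    "name": ["a", "b", None],
})

print(df.dtypes)  # both columns report "object", not bool/str
```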

BTW, this question is close to what I'm asking, but none of the answers helps:

Is there a way check if an object is actually a string to use .str accessor without running into AttributeError?

  • Is this question what you are looking for? [Python pandas: how to obtain the datatypes of objects in a mixed-datatype column?](https://stackoverflow.com/q/64195782/1609514) – Bill Aug 31 '22 at 18:22
  • 2
  • I think the issue with the NaN/True/False is that there is not a single datatype in the column. NaN is not a bool, and so the column does not contain all values of a single data type. – scotscotmcc Aug 31 '22 at 18:23
  • if your column has mixed datatypes, the datatype of that column is represented as an `object` datatype. Even if you are mixing `str` and `int` values in the same column, the resultant datatype will be an `object` datatype. – BeNiza Aug 31 '22 at 18:44
  • 1
  • the *real* dtype *is* `object`. You are asking about the specific Python type. Generally, you use the `type` built-in for that, e.g. `print(type(whatever))`, except here you need to apply it to every item in your column – juanpa.arrivillaga Aug 31 '22 at 20:52
  • I had a similar question, and found that something like `df[f].dropna().map(lambda v: type(v).__name__).mode()[0]` gives me the most common Python type of the non-null elements. Where `df[f].dtype` gives `object`, the Python type will give things like `str`, `Decimal`, `ndarray`, etc. – patricksurry Jul 21 '23 at 16:23
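Building on the `type`-based suggestions in the comments above, here is a minimal sketch of one way to report the most common Python type per column; the frame, the column names, and the helper function are made up for illustration:

```python
import pandas as pd
import numpy as np
from decimal import Decimal

# Made-up frame where every column has dtype "object".
df = pd.DataFrame({
    "flag": [True, False, np.nan],
    "label": ["x", "y", None],
    "amount": [Decimal("1.5"), Decimal("2"), np.nan],
})

def most_common_python_type(s: pd.Series) -> str:
    """Return the most common Python type name among the non-null values."""
    non_null = s.dropna()
    if non_null.empty:
        return "empty"
    return non_null.map(lambda v: type(v).__name__).mode()[0]

print(df.dtypes)                          # every column shows "object"
print(df.apply(most_common_python_type))  # flag -> bool, label -> str, amount -> Decimal
```

If a column is genuinely mixed, something like `df["label"].map(type).value_counts()` shows the full breakdown rather than just the most common type.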

0 Answers