Disclaimer:
Unlike 99.9% of people out there, I didn't pick up Python until very late in the progression of languages I write in. I won't harp on some of the odd behaviors of the import model, but I do have trouble understanding why type checking (i.e. "what kinda thing is you, random object some user has given me, hmm?") is all over the place.
Really this is just checking what class of data a thing is, but in Python it's never struck me as straightforward, and in my research on the interwebz, well, let's just say there are opinions, and the only thing anyone agrees on is using the term "pythonic". My question boils down to type(x) == y vs isinstance(x, y) when the type isn't one of the more straightforward list, tuple, float, int, ... yadda yadda.
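To pin down the difference being asked about, here is a minimal sketch with two made-up classes (Base and Child are hypothetical names, not from any library). As far as the built-ins go, type(x) == y only matches the exact class, while isinstance(x, y) also accepts subclasses:

```python
class Base:
    pass


class Child(Base):
    pass


obj = Child()

# type() returns the exact class, so comparing against a parent class fails.
exact_match = type(obj) == Base

# isinstance() walks the inheritance chain, so a parent class matches.
subclass_aware = isinstance(obj, Base)

# The same split shows up with built-ins: bool is a subclass of int.
bool_exact = type(True) == int
bool_aware = isinstance(True, int)

print(exact_match, subclass_aware, bool_exact, bool_aware)
```

Which of the two is "right" depends on whether subclasses should count as the thing being checked for.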
Current Conundrum:
I need the ability to determine if an object that is being passed (either directly, or dynamically within a recursive routine) is not just an iterable, but more specifically an object created by scandir. Please don't get lost in the singular issue; I'll show I have many ways to get to this, but the bigger question is:
A) Is the method I'm using to coerce the output of type() going to bite me in the backside given a case I am not thinking of?
B) Am I missing a simpler, language-specific way of accessing the 'class|type' of an object?
C) TBD
I'll start by showing where the root of my disconnect maybe comes from, and have a little fun with the people I know will take the time to answer this question properly, with a first example in R.
I'm going to cast my own class attribute just to show what I'm talking about:
> a <- 1:3
> class(a)
[1] "integer"
> attr(a, "class")
[1] "integer"
Ok so, like in Python, we can ask if this is an int(eger) etc. Now I can re-class as I see fit, which gets to the point of where I'm going with the Python issue:
> class(a) <- "i.can.reclass.how.i.want"
> class(a)
[1] "i.can.reclass.how.i.want"
> attr(a, "class")
[1] "i.can.reclass.how.i.want"
So now in Python, let's say I have a data.frame, or as you all put it, DataFrame:
>>> import pandas as pd
>>> df = pd.DataFrame({"a":[1,2,3]})
>>> type(df)
pandas.core.frame.DataFrame
Ok, so if I want to determine if my object is a DataFrame:
>>> df = pd.DataFrame({"a":[1,2,3]})
# Get the mro of type(df)? and remove 'object' as an item in the mro tuple
>>> isinstance(df, type(df).__mro__[:-1])
True
# hmmmm
>>> isinstance(df, (pandas.core.frame.DataFrame))
NameError: name 'pandas' is not defined
# hmmm.. aight let's try..
>>> isinstance(df, (pd.core.frame.DataFrame))
True
# Lulz... alright then, I guess i get that, but why did __mro__ pass with pandas vs pd? Not the point...
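On the __mro__ vs pandas/pd aside, here is a sketch using the standard library's fractions module imported under an alias (fr is just an illustrative alias standing in for pd). The point it tries to show: __mro__ holds class objects rather than names, so nothing called pandas ever has to be looked up, whereas writing pandas.core.frame.DataFrame requires the name pandas to be bound in the current namespace.

```python
import fractions as fr  # alias only, like "import pandas as pd"

x = fr.Fraction(1, 3)

# __mro__ is a tuple of class objects; dropping the last item removes object.
mro_without_object = type(x).__mro__[:-1]
via_mro = isinstance(x, mro_without_object)

# The class is reachable through whatever name the module was bound to.
via_alias = isinstance(x, fr.Fraction)
same_class = type(x) is fr.Fraction

# The module's real name is only metadata stored on the class as a string.
module_name = type(x).__module__

print(via_mro, via_alias, same_class, module_name)
```

The name printed by type() comes from that stored string, not from any variable in scope, which would explain why the printed path and the usable name can differ.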
For when you can't do that
# yes..i know.. 3.5+ os.scandir... focus on bigger picture of this question/issue
>>> import scandir
>>> a = scandir.scandir("/home")
>>> type(a)
posix.ScandirIterator
>>> str(type(scandir.scandir("/home")))
"<class 'scandir.ScandirIterator'>"
>>> isinstance(scandir.scandir("/home"), (scandir,scandir.ScandirIterator))
AttributeError: module 'scandir' has no attribute 'ScandirIterator'
# Okay fair enough.. kinda thought it could work like pandas, maybe can but I can't find it?
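One possible workaround, sketched under the assumption that os.scandir from the standard library (Python 3.6+ for the with form) behaves like the scandir package here: when the class isn't exposed as a module attribute, it can still be captured from an instance with type() and handed to isinstance(). ScandirIteratorType and is_scandir_iterator are hypothetical names.

```python
import os

# Assumption: os.scandir stands in for the third-party scandir package.
# Capture the class object from a throwaway instance.
with os.scandir(os.curdir) as probe:
    ScandirIteratorType = type(probe)


def is_scandir_iterator(obj):
    """Return True if obj is an instance of the captured iterator class."""
    return isinstance(obj, ScandirIteratorType)


with os.scandir(os.curdir) as entries:
    from_scandir = is_scandir_iterator(entries)

# A plain list of names from os.listdir is iterable but not a scandir object.
from_listdir = is_scandir_iterator(os.listdir(os.curdir))

print(from_scandir, from_listdir)
```

This avoids spelling the class name at all, at the cost of creating one instance up front.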
Question:
Does that mean that my only way of knowing the instance/type of certain objects, like the scandir example, is essentially the type hacks below?
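import re

def isinstance_from_type(x, class_info):
    # Pull the quoted dotted name out of str(type(x)), e.g. "<class 'pkg.Name'>"
    _chunk = re.search(r"(?<=\s['\"]).*?(?=['\"])", str(type(x)), re.DOTALL)
    try:
        return _chunk.group(0) == str(class_info)
    except AttributeError:
        # re.search returned None, so there was no quoted name to compare
        return False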
>>> a = scandir.scandir("/home")
>>> type(a) == "scandir.ScandirIterator"
False
>>> isinstance_from_type(a, "scandir.ScandirIterator")
True
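For comparison, the same name-based check can be sketched without a regex by reading __module__ and __qualname__ off the class. type_name_matches is a hypothetical helper, and like the regex version it only matches the exact class, not subclasses.

```python
from collections import OrderedDict


def type_name_matches(x, dotted_name):
    """Compare the class of x against a 'module.ClassName' string."""
    cls = type(x)
    return f"{cls.__module__}.{cls.__qualname__}" == dotted_name


ordered_hit = type_name_matches(OrderedDict(), "collections.OrderedDict")
dict_miss = type_name_matches({}, "collections.OrderedDict")

# Built-ins live in the builtins module, although str(type(1)) shows only 'int'.
int_hit = type_name_matches(1, "builtins.int")

print(ordered_hit, dict_miss, int_hit)
```

One caveat that applies to any string comparison like this: the module part of the name is not guaranteed to be the same everywhere (the posix.ScandirIterator output earlier is an example of a platform-specific module name), so the string that matches on one system may not match on another.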
Okay, I get why I don't get a string back from calling type etc., but please let me know if there's a better, more universal and consistent method I simply don't know of, or about the hot and dangerous things that come with using a regex; trust me.. I get it.
Thanks for reading; any and all feedback about the mechanics of this specific to Python is welcome.