I'm running into a namespace problem when calling patsy.dmatrices() through the reticulate R package.

Here is a simple reproducible example:

library(reticulate)
patsy <- import("patsy")
# Data
dataset <- data.frame(Y = rnorm(1000, 2.5, 1))
# Null model
formula_null <- "I(Y-1) ~ 1"
dmat <- patsy$dmatrices(formula_null, data = dataset, NA_action = "drop",
                        return_type = "dataframe")

I get the following error:

Error in py_call_impl(callable, dots$args, dots$keywords) : 
AttributeError: 'NoneType' object has no attribute 'f_locals'

I think this is related to namespaces (cf. "Namespace issues when calling patsy within a function") and might be fixed with the eval_env argument of dmatrices(), but I wasn't able to figure out how.

This is quite problematic when using the Python statsmodels package from R, since statsmodels relies on patsy for formulas.

Thanks for your help,

1 Answer

I'm not sure, but I think your guess about namespaces is correct, and this is an unfortunate interaction between patsy and reticulate. By default, patsy tries to look into the caller's scope to evaluate any unrecognized functions/variables in the formula (just like R formula functions do). This requires using Python's stack introspection to peek at the caller's scope. But since the caller is in a different language entirely, this almost certainly isn't going to work.
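To make the mechanism concrete, here is a minimal pure-Python sketch of that kind of stack introspection. It is a simplified illustration, not patsy's actual code: caller_locals, evaluate, user_code and too_deep are hypothetical names made up for this example. It shows a helper resolving a variable from its caller's scope, and the AttributeError on None that appears when there is no Python frame at the depth the helper expects.

```python
import inspect


def caller_locals(depth):
    """Return the locals of the frame `depth` levels above our caller."""
    frame = inspect.currentframe()
    for _ in range(depth + 1):
        if frame is None:
            break  # ran out of Python frames before reaching the target
        frame = frame.f_back
    # If the stack was too shallow, `frame` is None and this line raises.
    return dict(frame.f_locals)


def evaluate(expr):
    # Look up free names in the scope of whoever called evaluate().
    namespace = caller_locals(1)
    return eval(expr, {}, namespace)


def user_code():
    offset = 10
    return evaluate("offset + 1")  # `offset` is found via the caller's frame


def too_deep():
    # Ask for a frame far above the top of the stack.
    try:
        caller_locals(10_000)
    except AttributeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(user_code())
    print(too_deep())
```

When the call originates outside Python, as it does through reticulate, the frame the library expects to find may simply not exist, which is consistent with the error in the question.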

It is possible to override patsy's normal behavior of reading the caller's namespace, using the eval_env argument to dmatrices. (Docs.) Try this:

dmat = patsy$dmatrices(formula_null, data=dataset, NA_action="drop",
                       return_type="dataframe",
                       # New:
                       eval_env=patsy$EvalEnvironment(c())
                       )

The idea is that here we create an empty EvalEnvironment object, and tell patsy to use that instead of trying to read the caller's environment.

I'm not familiar with reticulate, so you might need to tweak the above to work – in Python the equivalent would be:

dmat = patsy.dmatrices(formula_null, data=dataset, NA_action="drop",
                       return_type="dataframe",
                       eval_env=patsy.EvalEnvironment([]))

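In particular, reticulate needs to pass an empty Python list here. In R, c() evaluates to NULL, which reticulate hands to Python as None rather than as an empty list, so if the call fails with c(), you'll want to find something that does convert to an empty list. (Maybe try patsy$EvalEnvironment(list())?)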
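The effect of supplying an explicit environment can be sketched in plain Python as well. Again, this is a toy under stated assumptions and not how patsy is implemented: evaluate, its env parameter, and demo are hypothetical stand-ins for dmatrices, eval_env, and user code. The point is only that once an environment is passed in, no caller frame is inspected, while names coming from the data still resolve.

```python
import inspect


def evaluate(expr, data, env=None):
    """Evaluate `expr` with names taken from `data` first, then from `env`.

    If `env` is None, fall back to the caller's local variables.
    """
    if env is None:
        # Implicit mode: peek at the caller's scope via the call stack.
        env = dict(inspect.currentframe().f_back.f_locals)
    # Names in `data` shadow names in `env`.
    return eval(expr, {}, {**env, **data})


def demo():
    shift = 1
    data = {"Y": 2.5}
    implicit = evaluate("Y - shift", data)      # `shift` comes from this frame
    explicit = evaluate("Y - 1", data, env={})  # no frame lookup at all
    try:
        evaluate("Y - shift", data, env={})     # `shift` is invisible now
    except NameError:
        isolated = True
    else:
        isolated = False
    return implicit, explicit, isolated


if __name__ == "__main__":
    print(demo())
```

The trade-off is the same as with an empty EvalEnvironment: the formula can only refer to columns of the data and to whatever the library itself provides, not to variables defined in the calling code.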

Nathaniel J. Smith