I'm running into a problem with the MNP package which I've traced to an unfortunate call to deparse (whose maximum width is limited to 500 characters).

Background (easily skippable if you're bored)

Because mnp uses a somewhat idiosyncratic syntax to allow for varying choice sets (you include cbind(choiceA,choiceB,...) in the formula definition), the left hand side of my formula call is 1700 characters or so when model.matrix.default calls deparse on it. Since deparse supports a maximum width.cutoff of 500 characters, the sapply(attr(t, "variables"), deparse, width.cutoff = 500)[-1L] line in model.matrix.default has as its first element:

[1] "cbind(plan1, plan2, plan3, plan4, plan5, plan6, plan7, plan8, plan9, plan10, plan11, plan12, plan13, plan14, plan15, plan16, plan17, plan18, plan19, plan20, plan21, plan22, plan23, plan24, plan25, plan26, plan27, plan28, plan29, plan30, plan31, plan32, plan33, plan34, plan35, plan36, plan37, plan38, plan39, plan40, plan41, plan42, plan43, plan44, plan45, plan46, plan47, plan48, plan49, plan50, plan51, plan52, plan53, plan54, plan55, plan56, plan57, plan58, plan59, plan60, plan61, plan62, plan63, "       
[2] "    plan64, plan65, plan66, plan67, plan68, plan69, plan70, plan71, plan72, plan73, plan74, plan75, plan76, plan77, plan78, plan79, plan80, plan81, plan82, plan83, plan84, plan85, plan86, plan87, plan88, plan89, plan90, plan91, plan92, plan93, plan94, plan95, plan96, plan97, plan98, plan99, plan100, plan101, plan102, plan103, plan104, plan105, plan106, plan107, plan108, plan109, plan110, plan111, plan112, plan113, plan114, plan115, plan116, plan117, plan118, plan119, plan120, plan121, plan122, plan123, "
[3] "    plan124, plan125, plan126, plan127, plan128, plan129, plan130, plan131, plan132, plan133, plan134, plan135, plan136, plan137, plan138, plan139, plan140, plan141, plan142, plan143, plan144, plan145, plan146, plan147, plan148, plan149, plan150, plan151, plan152, plan153, plan154, plan155, plan156, plan157, plan158, plan159, plan160, plan161, plan162, plan163, plan164, plan165, plan166, plan167, plan168, plan169, plan170, plan171, plan172, plan173, plan174, plan175, plan176, plan177, plan178, plan179, "
[4] "    plan180, plan181, plan182, plan183, plan184, plan185, plan186, plan187, plan188, plan189, plan190, plan191, plan192, plan193, plan194, plan195, plan196, plan197, plan198, plan199, plan200, plan201, plan202, plan203, plan204, plan205, plan206, plan207, plan208, plan209, plan210, plan211, plan212, plan213, plan214, plan215, plan216, plan217, plan218, plan219, plan220, plan221, plan222, plan223, plan224, plan225, plan226, plan227, plan228, plan229, plan230, plan231, plan232, plan233, plan234, plan235, "
[5] "    plan236, plan237, plan238, plan239, plan240, plan241, plan242, plan243, plan244, plan245, plan246, plan247, plan248, plan249, plan250, plan251, plan252, plan253, plan254, plan255, plan256, plan257, plan258, plan259, plan260, plan261, plan262, plan263, plan264, plan265, plan266, plan267, plan268, plan269, plan270, plan271, plan272, plan273, plan274, plan275, plan276, plan277, plan278, plan279, plan280, plan281, plan282, plan283, plan284, plan285, plan286, plan287, plan288, plan289, plan290, plan291, "
[6] "    plan292, plan293, plan294, plan295, plan296, plan297, plan298, plan299, plan300, plan301, plan302, plan303, plan304, plan305, plan306, plan307, plan308, plan309, plan310, plan311, plan312, plan313)"  

When model.matrix.default tests this against the variables in the data.frame, it returns an error.
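The splitting itself is easy to reproduce outside MNP. The following is only a sketch: it rebuilds a call shaped like the one above (the `plan1` ... `plan313` names are taken from the output shown, `plan_names`, `long_call`, `pieces` and `one_line` are illustrative names), and it checks that no single piece returned by `deparse` equals the full one-line expression.

```r
# Rebuild cbind(plan1, ..., plan313) as an unevaluated call.
plan_names <- paste0("plan", 1:313)
long_call <- as.call(c(list(as.name("cbind")), lapply(plan_names, as.name)))

# Even at the largest allowed width.cutoff, deparse() returns several strings.
pieces <- deparse(long_call, width.cutoff = 500L)
length(pieces) > 1

# The full expression written on one line is not among those pieces,
# so a lookup using any single piece cannot match it.
one_line <- paste0("cbind(", paste(plan_names, collapse = ", "), ")")
one_line %in% pieces
```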

The problem

To get around this, I've written a new deparse function:

deparse <- function(expr, width.cutoff = 60L,
                    backtick = mode(expr) %in% c("call", "expression", "(", "function"),
                    control = c("keepInteger", "showAttributes", "keepNA"),
                    nlines = -1L) {
  ret <- .Internal(deparse(expr, width.cutoff, backtick, .deparseOpts(control), nlines))
  paste0(ret, collapse = "")
}

However, when I run mnp again and step through, it returns the same error for the same reason (base::deparse is being run, not my deparse).

This is somewhat surprising to me. What I expected is closer to this example, where the user-defined function masks the base function for the rest of the session:

> print <- function() {
+   cat("user-defined print ran\n")
+ }
> print()
user-defined print ran

I realize the right way to solve this problem is to rewrite model.matrix.default, but as a tool for debugging I'm curious how to force it to use my deparse and why the anticipated (by me) behavior is not happening here.

Ari B. Friedman

2 Answers

The functions fixInNamespace and assignInNamespace are provided to allow editing of existing functions. You could try the following, though I will not try it myself, since mucking with deparse looks too dangerous:

assignInNamespace("deparse",
  function(expr, width.cutoff = 60L,
           backtick = mode(expr) %in% c("call", "expression", "(", "function"),
           control = c("keepInteger", "showAttributes", "keepNA"),
           nlines = -1L) {
    ret <- .Internal(deparse(expr, width.cutoff, backtick, .deparseOpts(control), nlines))
    paste0(ret, collapse = "")
  },
  "base")

The help page indicates that the use of these functions is restricted, and I would not be surprised if such a core function had additional layers of protection. Since assignInNamespace works via side effect, you should not need to assign the result.

IRTFM
  • Another caution is given in `?fixInNamespace`: "They [the documented functions] should not be used in production code." – Joshua Ulrich May 20 '12 at 16:12
  • I got the idea from @gsk3's efforts that this was for "experimental" purposes. – IRTFM May 20 '12 at 16:13
  • Thanks, and warnings duly noted. I'll likely change the function that calls `deparse` rather than `deparse` itself. – Ari B. Friedman May 20 '12 at 16:13
  • It remains possible that the 500 character limit is put in place because of the design of `.Internal(deparse(.))` – IRTFM May 20 '12 at 16:15
  • @Dwin Oh, I suspect that it is. That's why my code takes the output of `.Internal(deparse(.))` and fixes it after the fact. Suspicions about further protections were correct. I can neither fix `deparse` nor `model.matrix.default` which calls it. – Ari B. Friedman May 20 '12 at 16:22

This is how packages with namespaces search for functions, as described in Section 1.6, "Package Namespaces", of Writing R Extensions:

Namespaces are sealed once they are loaded. Sealing means that imports and exports cannot be changed and that internal variable bindings cannot be changed. Sealing allows a simpler implementation strategy for the namespace mechanism. Sealing also allows code analysis and compilation tools to accurately identify the definition corresponding to a global variable reference in a function body.

The namespace controls the search strategy for variables used by functions in the package. If not found locally, R searches the package namespace first, then the imports, then the base namespace and then the normal search path.
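The lookup order quoted above can be seen in a small experiment. This is a sketch rather than anything from the manual: `toupper` stands in for `deparse` so that nothing important gets masked, and `f` is a hypothetical function whose enclosing environment is set to the stats namespace to imitate a function defined inside a package.

```r
# Mask a base function in the calling environment (illustration only).
toupper <- function(x) "masked"

# Called directly, the user-defined version is found first.
toupper("abc")  # "masked"

# f imitates a package function: its enclosure is the stats namespace,
# so free variables are looked up in stats, then its imports, then base,
# and only afterwards along the normal search path.
f <- function(x) toupper(x)
environment(f) <- asNamespace("stats")
f("abc")        # "ABC": base::toupper is found before the mask
```

The `print` example in the question works because `print()` is called from the global environment, where the mask comes first. `model.matrix.default` lives in the stats namespace, so it reaches `base::deparse` before it would ever see a global `deparse`.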

Joshua Ulrich
  • I figured something like that was going on, so you've pinpointed the problem. Now how to override? :-) – Ari B. Friedman May 20 '12 at 15:58
  • @gsk3: You're likely to break something by overriding `deparse`, so I would suggest you find the exact location of the `deparse` call you want to change, insert your function right before that call, and restore the original `deparse` immediately after. You might be able to use `fixInNamespace` to change the function that contains the `deparse` call you want to change. – Joshua Ulrich May 20 '12 at 16:10
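For debugging only, a gentler variation on the idea in the last comment is to leave the sealed namespace alone and patch a private copy of the calling function. This is a different technique from `fixInNamespace`, and the sketch below rests on assumptions: `deparse_one_line`, `patched_env` and `mm_patched` are hypothetical names, and the wrapper simply repeats the question's collapse-to-one-string idea. Whether the collapsed string then matches the model frame's column names is a separate question this sketch does not settle.

```r
# Fetch the method without assuming it is exported from stats.
mm_default <- utils::getS3method("model.matrix", "default")

# Hypothetical wrapper: collapse the pieces from base::deparse into one string.
deparse_one_line <- function(expr, ...) {
  paste0(base::deparse(expr, ...), collapse = "")
}

# Put a small environment in front of the function's original enclosure
# and define `deparse` there, so the copy finds the wrapper first.
patched_env <- new.env(parent = environment(mm_default))
assign("deparse", deparse_one_line, envir = patched_env)

mm_patched <- mm_default
environment(mm_patched) <- patched_env

# The copy resolves `deparse` to the wrapper; the original is untouched.
identical(get("deparse", envir = environment(mm_patched)), deparse_one_line)
identical(get("deparse", envir = environment(mm_default)), base::deparse)
```

Only direct calls to `mm_patched()` see the wrapper. Code that dispatches through `model.matrix()`, as MNP does, still reaches the original method, so this helps with stepping through the logic rather than with running `mnp` unchanged.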