
I am attempting to manually translate some R code into Python and encountered this snippet:

"drm" <- function(
formula, curveid, pmodels, weights, data = NULL, subset, fct,
type = c("continuous", "binomial", "Poisson", "quantal", "event"), bcVal = NULL, bcAdd = 0,
start, na.action = na.omit, robust = "mean", logDose = NULL,
control = drmc(), lowerl = NULL, upperl = NULL, separate = FALSE,
pshifts = NULL)
{
    ## ... elided ...

    ## Storing call details
    callDetail <- match.call()

    ## Handling the 'formula', 'curveid' and 'data' arguments
    anName <- deparse(substitute(curveid))  # storing name for later use
    if (length(anName) > 1) {anName <- anName[1]}  # to circumvent the behaviour of 'substitute' in do.call("multdrc", ...)
    if (nchar(anName) < 1) {anName <- "1"}  # in case only one curve is analysed


    mf <- match.call(expand.dots = FALSE)
    nmf <- names(mf)
    mnmf <- match(c("formula", "curveid", "data", "subset", "na.action", "weights"), nmf, 0)

    mf[[1]] <- as.name("model.frame")
    mf <- eval(mf[c(1,mnmf)], parent.frame())  #, globalenv())
    mt <- attr(mf, "terms")

    dose <- model.matrix(mt, mf)[,-c(1)]  # with no intercept
    resp <- model.response(mf, "numeric")

    origDose <- dose
    origResp <- resp  # in case of transformation of the response
    lenData <- length(resp)
    numObs <- length(resp)

    xDim <- ncol(as.matrix(dose))
    varNames <- names(mf)[c(2, 1)]
    varNames0 <- names(mf)

    # only used once, but mf is overwritten later on

    ## Retrieving weights
    wVec <- model.weights(mf)
    if (is.null(wVec))
    {
        wVec <- rep(1, numObs)
    }

    ## Finding indices for missing values
    missingIndices <- attr(mf, "na.action")
    if (is.null(missingIndices)) {removeMI <- function(x){x}} else {removeMI <- function(x){x[-missingIndices,]}}

    ## Handling "curveid" argument
    assayNo <- model.extract(mf, "curveid")
    if (is.null(assayNo))  # in case not supplied
    {
        assayNo <- rep(1, numObs)
    }
    uniqueNames <- unique(assayNo)
    colOrder <- order(uniqueNames)
    uniqueNames <- as.character(uniqueNames)
    # ...
}

What is this doing? I see in the documentation for match.call() that

> match.call returns a call in which all of the specified arguments are specified by their full names.

But I don't understand what this means. What is "a call" in this context? What does it mean that "arguments are specified by their full names"?

Ultimately, the important part is what is stored in dose and resp. These variables are used later so I need an understanding of what their values are so I can do something similar in Python (potentially with numpy, pandas, and scipy).

Code-Apprentice
  • @joran I don't know enough about R to verbalize what I don't understand about `match.call()`. I've read the docs, but I don't understand what it means. So in short, I don't understand anything about `match.call` to even have a starting point. – Code-Apprentice Oct 04 '19 at 17:14
  • @joran I've edited the question to try to highlight the pieces that I think are most important to figure out here so I can translate this `drm()` function into Python. – Code-Apprentice Oct 04 '19 at 17:18
  • Do you know what a call is? If not, I suggest a quick read of the R language definition. You can also just return callDetail and check the output. – Roland Oct 05 '19 at 07:48
  • @Roland Not in this context, no. – Code-Apprentice Oct 05 '19 at 14:45
  • @joran So if I understand correctly, a "function call" here is a data structure which represents a particular invocation of a function. Is that correct? – Code-Apprentice Oct 07 '19 at 19:22
  • Related answer (for R users): [Why is `match.call()` useful?](https://stackoverflow.com/questions/32486753/why-is-match-call-useful) But really it would be better to restate your question as [What is the Python equivalent of R's `match.call()`?](https://stackoverflow.com/questions/32486753/why-is-match-call-useful), to which I believe the answer is ***(function) introspection***. And you still have to give the missing context: why do you think you need to do this in Python? Testing? Debugging a class you're writing? – smci Nov 12 '19 at 04:24
  • "why do you think you need to do this in Python?" At the time I wrote this question, I was translating some R code into Python. My approach was to keep as close to the R implementation in order to maintain the same output. My previous attempts to reimplement this in a more pythonic way resulted in different output for some of my example inputs. Because of the business requirements that wasn't acceptable. – Code-Apprentice Nov 12 '19 at 16:33
  • But that's stale information; I spent hours [researching and answering your second question with a comment](https://stackoverflow.com/questions/58227989/discrepancy-between-4-parameter-log-logistic-non-linear-regression-in-python-and). *"Possibly related: [Discrepancies between R optim vs Scipy optimize: Nelder-Mead](https://stackoverflow.com/questions/54985793/discrepancies-between-r-optim-vs-scipy-optimize-nelder-mead)"*. Then we had a [day-long discussion giving you advice about how to make the port, and unit-testing](https://chat.stackoverflow.com/transcript/message/47491013#47491013) – smci Nov 13 '19 at 20:13
  • So as of late October, the answer to *"why do you think you need to do this [port R's `match.call` syntax, literally] in Python?"* is "I realized I don't". The short answer to porting is *"focus on writing and debugging some unit-tests, not literally porting syntaxes that aren't needed in or native to the target language"*. I think we resolved all this back in October, my answer here is an attempt to summarize our discussion of last month, can you finally close this and accept the answer it was helpful? (It's bad form to have multiple stale re-askings of the same underlying question across SO) – smci Nov 13 '19 at 20:19
  • @smci "I think we resolved all this back in October..." Yes, I have already moved on from this and am working on other things. – Code-Apprentice Nov 13 '19 at 20:34
  • Code-Apprentice: then the standard practice is close the question and accept an answer (or else self-answer and accept your own self-answer). Especially since I spent >12 hours researching answering your particular question, even though the premise was shaky (that you needed to port match.call to Python, or that Python (rather than Nelder-Mead implementation) was responsible for the numerical differences you were getting in your port). – smci Nov 13 '19 at 21:36

1 Answer


The literal R answer is here. But the intent of your question seems to be *What is the idiomatic Python equivalent of R's `match.call()`, and when should I (or shouldn't I) use it?*, to which the answer is:

  • (function) introspection with `inspect.signature(f)` 1, 2: binding a call's arguments to the signature with `inspect.signature(f).bind(*args, **kwargs)` resolves every positional and keyword argument to its full parameter name, and leaves out the parameters that fall back to their defaults. In a Python function/method signature, `func(arg_1, *args, **kwargs)` is the rough equivalent of R's ellipsis `...` in `f(args, ...)`: it passes through unspecified args (often on to `super().func()`).
    • Never use introspection in production code.
    • There may also be scoping issues, since R's scoping rules differ from Python's. In general it's less grief in Python to create a class (a custom subclass, if you need to) and encapsulate the object's data.
  • But why do you think you need to port the `match.call()` line to Python at all? Other than in unit-testing, or debugging a class you're writing, you generally don't do this in Python. If you're porting `drc::drm()` for your own use, then the standard advice is to implement the absolute minimum interface you need for your own purposes (it doesn't need to be release-quality, and you're not getting paid for this) and ignore all the bells and whistles. It might well take you longer to figure out what the R `match.call()` line is doing than to ignore it or kludge it for your use case.
  • The Pythonic way to implement a heavily overloaded function prototype is to default all non-essential parameters to `None`; any arg-parsing logic that gives them "smart default" values (depending on which other args were or weren't passed, or on an object's state) then has to go inside the function body. This works, and Python users will understand what the resulting code does.
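
The introspection and `None`-default bullets above can be sketched roughly as follows. This is a minimal illustration, not a port of `drc::drm()`: `match_call` and `drm_args` are hypothetical names, `data` is assumed to be a plain dict of equal-length columns, and only `inspect.signature` and `Signature.bind` come from the standard library.

```python
import inspect


def match_call(func, *args, **kwargs):
    """Hypothetical helper, loosely analogous to R's match.call().

    Returns a dict mapping each explicitly supplied argument to its full
    parameter name. Parameters left at their defaults are not included.
    """
    bound = inspect.signature(func).bind(*args, **kwargs)
    return dict(bound.arguments)


def drm_args(formula, data=None, curveid=None, weights=None, **kwargs):
    """Toy stand-in for the argument handling at the top of drm().

    Non-essential parameters default to None; the "smart defaults" are
    resolved inside the body instead of in the signature.
    """
    # Assumption: data is a dict of equal-length columns.
    n_obs = len(next(iter(data.values()))) if data else 0
    if weights is None:
        weights = [1.0] * n_obs  # mirrors: wVec <- rep(1, numObs)
    if curveid is None:
        curveid = [1] * n_obs    # mirrors: assayNo <- rep(1, numObs)
    return {
        "formula": formula,
        "weights": weights,
        "curveid": curveid,
        "extra": kwargs,         # rough counterpart of R's `...`
    }


if __name__ == "__main__":
    table = {"dose": [0.1, 1.0, 10.0], "resp": [0.9, 0.5, 0.1]}
    # The second positional argument is reported under its full name, "data".
    print(match_call(drm_args, "resp ~ dose", table, robust="mean"))
    print(drm_args("resp ~ dose", data=table))
```

Unlike R's `match.call()`, which captures unevaluated expressions, `bind` only sees the already-evaluated Python values, so it is useful for debugging and tests rather than for re-evaluating the call in another environment.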

As to whether you should even be using drc as a reference package in the first place, my advice is the same as a month ago: the drc package has not had a CRAN release since 2016, is essentially dormant, has only one or two maintainers and no mailing list, and isn't that well documented. There may well be other R packages with better code or better documentation to use as a reference. I can barely spell 'bioassay', so I suggest you ask on relevant lists/user groups (both Python and R, academic and commercial) for recommendations on which reference package to start from.

(Obviously, if you really want to contribute R docs and unit tests to the drc maintainers as well as doing the Python port, you could offer. But it sounds like too much grief if you only want a basic Python equivalent.)

(This version of your question is very broad. I also tried to address your second, much more specific re-asking with a comment. I don't know if that supersedes this one; please update via edits/comments.)

smci
  • Thanks for the links to other questions about match.call(). I'll refer back to those when I'm back on this issue. As for using drc as a reference package, it is used in the current implementation of a system I am working on. At the time I wrote this question (and the other one you refer to as well), I was tasked with replacing drc with a python implementation. The existing system, including its use of drc, is the reference I have to follow. I need to maintain the behavior of the existing system. Other R packages aren't helpful because I'm moving away from R all together. – Code-Apprentice Nov 12 '19 at 16:41
  • Your first two links point at the same question. Is that intended? – Code-Apprentice Nov 12 '19 at 16:45