Coercion To Scalar In data.table Function Argument Accepting Scalar or Vector

Question

I expect the code:

library(data.table)
datab <- data.table(Events = c(79,68,54), Duration = c(61,44,72))
datab[, .(Poisson.High=poisson.test(Events, T = Duration)$conf.int[2])]

to produce:

   nrow Poisson.High
1:    1    1.6140585
2:    2    1.9592319
3:    3    0.9785873

Instead it produces:

Error in poisson.test(Events, T = Duration) : 
  the case k > 2 is unimplemented

As I understand it, this is because poisson.test's first argument accepts either a scalar or a vector, and a vector may have at most two elements. Since the table has 3 rows, the call fails: poisson.test receives the whole three-element Events column as its first argument rather than a single value per row.
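A minimal sketch of what is going on, assuming the same `datab` as above: without `by=`, the `j` expression is evaluated once and `Events` is the full column, so `poisson.test` sees three values. The `tryCatch` wrapper is only there so the error message can be inspected without stopping the script.

```r
library(data.table)

datab <- data.table(Events = c(79, 68, 54), Duration = c(61, 44, 72))

# Without by=, j is evaluated once and Events is the whole column.
n_seen <- datab[, length(Events)]

# Capture the error message instead of halting.
err <- tryCatch(
  datab[, poisson.test(Events, T = Duration)$conf.int[2]],
  error = function(e) conditionMessage(e)
)

print(n_seen)  # number of values poisson.test would receive
print(err)     # the "k > 2" message quoted above
```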

How can I reference Events in such a way that it provides only the scalar value associated with that row? (I've tried Events[1] but that just uses the first row in datab for obvious reasons.)

If you go through the introductory materials for the package, you'll learn about the `by=` option, which allows `by=1:nrow(datab)` that would probably handle this case. — Frank, Apr 08 '17 at 00:31
Not to be obtuse, but what introduction are you talking about? The `Vignettes:`->`Introduction to data.table` doesn't discuss what a vector of numbers does in the `by=`. The `Reference manual:` has `Introduction to data.table` links that all 404. The `campus.datacamp.com` introduction doesn't cover it either -- at least not up to the paywall. Does `by=1:nrow(datab)` produce as many groups as there are rows with one row in each group? — user3673, Apr 08 '17 at 05:33
The "URL" field on CRAN https://CRAN.R-project.org/package=data.table takes you to the official website, from there, you can see "Getting Started" (which I lump under "introductory"). The `by=` argument is of course documented in `?data.table`, e.g., with `by=.(x=x>0, y)` showing that you can define your own grouping variables instead of just using columns and it will group by each value (in the case of x > 0, those values would be TRUE, FALSE and NA). I'm not sure if you're trying to debate the adequacy of the documentation here or what... — Frank, Apr 08 '17 at 05:45
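To illustrate the `by=.(x=x>0, y)` form mentioned in this comment, here is a small sketch on made-up data (the table `DT` and its columns `x`, `y`, `v` are hypothetical, not from the question). It should produce one group per distinct combination of the logical `x > 0` (TRUE, FALSE or NA) and `y`.

```r
library(data.table)

# Hypothetical toy data, only for illustrating an expression in by=.
DT <- data.table(
  x = c(-1, 2, NA, 3, 5),
  y = c("a", "a", "b", "b", "b"),
  v = 1:5
)

# Group by a computed logical (x > 0) together with the column y.
res <- DT[, .(total = sum(v)), by = .(x = x > 0, y)]
print(res)
```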
Btw, for vignettes, they are usually installed along with the package, and you can see a list of those with `vignette(package = "data.table")`. Sometimes these are more up to date than those on the site. — Frank, Apr 08 '17 at 05:48
https://github.com/Rdatatable/data.table/wiki/Getting-started — Uwe, Apr 08 '17 at 09:22
@Frank Thanks for your solution. Please post it as a solution with an elaboration on the semantics of `by=x:y` relevant to the question by answering the question in my prior response to you, which I now repose: "Does by=1:nrow(datab) produce as many groups as there are rows with one row in each group?" — user3673, Apr 08 '17 at 14:18
Sorry, the answer to your quoted question is yes, though `by=x:y` is not relevant -- it's for selecting a set of consecutive columns. I'm not posting an answer because your question is not clear and reproducible, so I cannot test a solution to verify it works. Some guidance if you want to improve it: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250 I get the impression that you think a "solution" here is far more important than reading ?data.table and the other getting-started material, but doing that might save you a lot of trouble later. — Frank, Apr 08 '17 at 14:26
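For reference, a sketch of the `by=1:nrow(datab)` suggestion applied to the example from the question. Each row becomes its own group, so inside `j` both `Events` and `Duration` are length-1 vectors. This is one way to get the expected table, not necessarily the only or fastest one.

```r
library(data.table)

datab <- data.table(Events = c(79, 68, 54), Duration = c(61, 44, 72))

# One group per row: poisson.test is called once per row with scalars.
res <- datab[
  ,
  .(Poisson.High = poisson.test(Events, T = Duration)$conf.int[2]),
  by = 1:nrow(datab)
]
print(res)
```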
@Frank I enumerated the introductory material that I had looked at not to debate the adequacy of the introductory material but to address the apparent view that the question was somehow frivolous -- said appearance taking two forms: 1) the negative votes on the question and 2) your imputation that the reason I posed the question was that I had not previously gone through "the introductory material". I still don't know why there is a negative opinion on this question. Should the question be posed in a different way? — user3673, Apr 08 '17 at 14:28
OK, having read the link on how to pose R questions, I'll attempt to improve it by adding reproducible code. — user3673, Apr 08 '17 at 14:32
Yes, generally questions are better received if they include a minimal, complete, and verifiable example. That might mean making a new example different from your actual use-case to illustrate the difficulty you're having and the desired output. I mentioned `by=` since you did not show it in the code excerpt, even though to me it seems like the first thing to try if one knows that `by=` can take user-defined grouping vectors. Regarding whether or not you read the intro material, like I said, it's in ?data.table and `vignette("datatable-intro")` under "Expressions in by". I mentioned the intro material as a useful pointer...or meant to. — Frank, Apr 08 '17 at 14:33
@Frank, let me know if the question is getting there now that I've included a working example. — user3673, Apr 08 '17 at 17:57
Yeah, looks good to me. Takes a few more votes to get it reopened so an answer can be posted. One other option is `datab[, mapply(function(...) poisson.test(...)$conf.int[2], Events, T = Duration)]`. Also, fyi, you can store the result in a new column, like `datab[, new_col := ...]` where `...` is the code to make it using by or mapply. — Frank, Apr 08 '17 at 18:13
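A sketch combining the two suggestions in this comment, `mapply` and `:=`, on the question's data. The helper name `upper_ci` is made up for readability; the comment's inline `function(...)` form should behave the same way.

```r
library(data.table)

datab <- data.table(Events = c(79, 68, 54), Duration = c(61, 44, 72))

# Hypothetical helper: upper confidence limit for a single (count, time) pair.
upper_ci <- function(x, T) poisson.test(x, T = T)$conf.int[2]

# mapply walks Events and Duration in parallel, one pair per call;
# := stores the resulting vector as a new column by reference.
datab[, Poisson.High := mapply(upper_ci, Events, Duration)]
print(datab)
```

Compared with `by=1:nrow(datab)`, this avoids creating one group per row, and the result lands in `datab` itself rather than in a new table.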
