21

I have been working on a few R packages for some general tools that aren't currently available in R: blogging, report delivery, logging, and scheduling. This led me to wonder: what are the most important things that people wish existed in R that currently aren't available?

My hope is that we can use this to pinpoint some gaps, and possibly work on them collaboratively.

jogo
Shane
  • 5
    Rebuilding an operating system in R, eh? – Dirk Eddelbuettel Nov 10 '09 at 16:02
  • 14
    Is my last name Campos? Dirk, we all know that if you're going to rebuild an operating system, you should develop your own language first (preferably from assembly) and then build it using your own syntax. – Shane Nov 10 '09 at 16:13
  • 10
    I think I have an existential problem with this question. An R package that doesn't exist cannot be useful. – Nosredna Nov 10 '09 at 23:23
  • 4
    You have to use your imagination. But to be clear, I'm not interested in the larger set of *useless* nonexistent packages. – Shane Nov 11 '09 at 01:27

12 Answers

16

I'm a former Mathematica junkie, and one thing that I really miss is the notebook-style interface. When I did my research with notebooks, papers would almost write themselves as I did my analysis. Now that I'm using R, I find documenting my work quite tedious.

For people who are not so familiar with Mathematica: it has documents called "notebooks" that can contain code, text, equations, and the results of executed code (which can be equations, text, graphics, or interactive tools). Everything can be neatly organized into styled sections or subsections that are collapsible. You can have multiple open documents that share a single kernel.

While I don't think a full-blown Mathematica style interface is entirely necessary, some interactive document system that would support text (for description), code, code output, and embedded image output would be a real boon to researchers.

eytan
  • 2
    I think that's a great idea. I was thinking about this last night myself. Matlab has the same functionality (you can run Matlab code within the documentation). This would be a big project. Just thinking about implementation, it might be best to do this with javascript on the existing html framework that they're developing. So any section marked code could be highlighted and executed. – Shane Nov 15 '09 at 03:51
  • 1
    The SAGE project (http://www.sagemath.org) already has something like this implemented-- R is one of many backends that are supported. There is always room for improvement though! – Sharpie Nov 17 '09 at 02:24
  • I like this idea, and with packages like `brew` only the user interface is really missing to achieve that functionality. Perhaps most easily implemented in a web browser like Sage, or with R-studio? – baptiste Sep 04 '11 at 20:50
  • To follow on @Sharpie comment: Check http://www.sagenb.org/ it's an online notebook – Maxim Veksler Oct 27 '11 at 11:52
  • 4
    RStudio / knitR do this, for anyone landing on this question! – Andrew Oct 23 '13 at 18:15
12

A Real-Time R package would be my choice, using C Streaming perhaps.

Also I'd like a more robust web development package. Nothing as extensive as Ruby on Rails but something a bit better than Sweave combined with R2HTML, that can run on RApache. I think this needs to be a huge area of emphasis for R in general.

I realize LaTeX is the better markup for certain academic uses, but in general I think HTML should be the markup language of choice. More needs to be done in terms of R web apps, so that applications can be hosted remotely on machines with large amounts of RAM and R can start being used for SaaS data applications and other graphics choices.

Dan
10

Interfaces to any of the new-fangled 'Web 2.0' databases that use key-value pairs rather than the standard RDBMS. A non-exhaustive list (in alphabetical order) would be

and it would of course be nice if we had a DBI-like abstraction on top of this. Jeff has started with RBerkeley, but that uses the older-school Oracle BerkeleyDB backend rather than one of those new things.

Dirk Eddelbuettel
8

An output device which produces Javascript code, perhaps using the protovis library.

Karsten W.
5

As a programmer and writer of libraries for colleagues, I was definitely missing a logging package. I googled and asked around, here too, then wrote one myself. It is on r-forge, here, and it's called "logging" :)

I use it and I'm obviously still developing it.

mariotomo
4

A natural interface to the .NET framework would be awesome, though I suspect that that might be a lot of work.

EDIT: Syntax highlighting from within RGui would also be wonderful.

ANOTHER EDIT: R.NET now exists to integrate R with .NET.

Richie Cotton
  • Not soo sure about the .Net idea -- that is too platform-centric whereas one of the strengths of the R system (in the large sense) is how the underlying platform is mostly abstracted away. – Dirk Eddelbuettel Nov 10 '09 at 16:28
  • @Dirk: It would really need to work with Mono as well to be cross-platform and fit with the open source nature of R. – Richie Cotton Nov 10 '09 at 16:32
  • The Mac RGui has highlighting, and JGR does as well. – Ian Fellows Nov 10 '09 at 16:35
  • I can see how this would be really *useful* (regardless of whether it's cross-platform). – Shane Nov 10 '09 at 16:35
  • 3
    Romain Francois has a source level highlight project based on the R parser: http://r-forge.r-project.org/projects/highlight/ – Dirk Eddelbuettel Nov 10 '09 at 16:40
  • Somebody should get to work on an improved TextMate bundle for R. The current one is alright, but an improvement would be appreciated. Doesn't help non-Mac people, sorry. – Dan Nov 10 '09 at 23:23
4

There are few libraries for interfacing with databases in general, and there is no ORM library.

RMySQL is useful, but you have to write the SQL queries manually and there is no way to generate them as in an ORM. Moreover, it is specific to MySQL.

Another thing that R still lacks, for me, is a good system for reading command line arguments: there is R getopt, but it is nothing like, for example, argparse in Python.

dalloliogm
3

A FRAQ package for FRequently Asked Questions, a la fortune(). R-help would be so much fun: "Try this: library(FRAQ); faq("lattice won't print")", etc.

See also.

baptiste
  • also, each package could define its own list of faq entries. Given the similarity with fortunes it might be worth considering a meta-package that encompasses the general concept. – baptiste Sep 05 '11 at 03:21
3

A wiki package that adds wiki-like documentation to R packages. You'd have an inst/wiki subdirectory with plain text files in markdown, asciidoc, or textile, with embedded R code. With the right incantation, these files would be executed (think brew and/or asciidoc packages), and the relevant output uploaded to a given online repository (github, googlecode, etc.). Another function could take care of synchronizing the changes made online, typically via svn or git.

Suddenly you have a wiki documentation for your package with reproducible examples (could even be hooked to R CMD check).

EDIT 2012:

... and now the knitr package would make this process even easier and neater

baptiste
  • This would be FANTASTIC if for no other reason than having a statistical wiki moderated by the R community could be far better than the statistical entries on Wikipedia. I assume entries could be moderated by package maintainers. – Iterator Sep 05 '11 at 00:32
  • Thanks for the update - `knitr` looks interesting. – Iterator Jan 18 '12 at 18:04
2

I would like users to be able to embed another programming language within R in a more straightforward way. As an example, in some Common Lisp implementations one can write a function with embedded C code like this:

;; Sum and product of the integers n1..n2, computed in inline C.
(defun sample (n1 n2)
  (ffi:c-inline (n1 n2) (:int :int) (values :int :int) "{
    int n1 = #0, n2 = #1, out1 = 0, out2 = 1;
    while (n1 <= n2) {
      out1 += n1;
      out2 *= n1;
      n1++;
    }
    @(return 0)= out1;
    @(return 1)= out2;
    }"
   :side-effects nil))

It would be good if one could write an R function with embedded C or Lisp code (I am more interested in the latter) in a similar way.

A5C1D2H2I1M1N2O1R2T1
francogrex
  • Is it cfunction {inline}? It supports c, cpp, c++, f, f95, objc, objcpp, objc++. It's quite nice. Good to add Lisp to the list. – francogrex Jul 21 '10 at 18:03
1

A native .NET interface to RGUI. R(D)Com is based on COM, and it only allows exchanging matrices, not more complex structures.

Nestor
1

I would very much like a line profiler. This exists in Matlab and Python, and is very useful for finding bits of code that take a lot of time or are executed more (or less) than expected. A lot of my code involves function optimizations and how many times something iterates may not be known in advance (though most iterations are constrained or specified).

The call stack is useful if all of your code is in R and is very simple but, as I recently posted, it takes painstaking effort if your code is complex.

It's quite easy to develop a line profiler for a given bit of code. A naive way is to index every line (or just pre-specified sections) and insert a call that logs proc.time() at that line. In a loop, I simply enumerate sections of code and store in a two-dimensional list the proc.time values for section i in iteration k. [See update below: this isn't actually a way to do a line profiler for all kinds of code.]
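A rough sketch of the section-timing idea, in case it helps. The helper `time_sections` and the two example sections are made up for illustration, and a plain matrix stands in for the two-dimensional list (row k is the iteration, column i is the section):

```r
# Hypothetical helper: runs each section in order and returns the elapsed
# seconds measured around it with proc.time().
time_sections <- function(sections) {
  vapply(sections, function(section) {
    start <- proc.time()[["elapsed"]]
    section()
    proc.time()[["elapsed"]] - start
  }, numeric(1))
}

# Made-up workload split into two named sections.
sections <- list(
  simulate = function() sum(rnorm(1e4)),
  sort     = function() sort(runif(1e4))
)

n_iter <- 5
# timings[k, i] holds the elapsed time of section i in iteration k.
timings <- matrix(NA_real_, nrow = n_iter, ncol = length(sections),
                  dimnames = list(NULL, names(sections)))
for (k in seq_len(n_iter)) {
  timings[k, ] <- time_sections(sections)
}

# Total time per section across iterations; large totals suggest hotspots.
section_totals <- colSums(timings)
print(section_totals)
```

Looking at a column of `timings` across iterations, rather than only the totals, is what would reveal a section whose cost grows from one iteration to the next.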

One can use such a tool to find hotspots, anomalies (e.g. code that should be O(n) but is really O(n^2)), code that may benefit from memoization (a line profiler doesn't tell you this, but it lets you know where to look), code that is mistakenly inside a loop, and more.

Update 1: Inserting a timing line between every function line is slightly erroneous: the definition of a line of code is not simply code separated by whitespace. Being able to parse the code into an AST is necessary for knowing where operations begin and end. As discussed in some of the answers to this question, there are some tools (namely, showTree and walkCode in the codetools package) for doing this. Simply applying a regular expression to source code would be a very bad thing to do.
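To illustrate why lines and operations differ, here is a small sketch that uses only base R's `parse()` (not the codetools functions mentioned above). The source text in `src` is a made-up example:

```r
# Three physical lines, but only two top-level expressions: the call to
# sum() spans two lines, so the first line on its own is not valid R.
src <- c(
  "total <- sum(1,",
  "             2)",
  "total * 2"
)

exprs <- parse(text = src, keep.source = TRUE)
length(exprs)  # counts top-level expressions, not lines

# Each srcref records where an expression starts and ends in the source:
# element 1 is the first line, element 3 is the last line.
refs <- attr(exprs, "srcref")
spans <- t(vapply(refs, function(ref) {
  ref <- as.integer(ref)
  c(first_line = ref[1], last_line = ref[3])
}, integer(2)))
print(spans)
```

A timing call inserted between lines 1 and 2 of `src` would break the code, whereas inserting calls between the parsed expressions would not. This only handles top-level expressions; timing inside function bodies or loops would still need a walk over the parse tree.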

Iterator