My new CRAN package has 10 data frames in the data/ folder, each with about 10 columns of various types (character, integer, numeric, logical, etc.).

I need to add roxygen2 documentation for each of these data sources. Is there a method that autogenerates comment blocks given a data.frame?

Something like: makeDocs(games)

#' games
#'  title character
#'  score integer
#'  value numeric
#'  ...

I worry that documenting ~100 columns by hand will introduce mistakes, and that I will have to keep re-editing the comments whenever names change.

I found this great answer about documenting datasets: "How can I document data sets with roxygen?"

... but it does not address how to autogenerate these comments.

Breck
  • `gsub("^", "#' ", capture.output(str(iris[0,])))` could be a start (or not `[0,]`, showing some sample data, over to you) – r2evans Jun 29 '18 at 00:19

3 Answers

You will most likely find the functions in the sinew R package useful; see the examples in an R-bloggers post here.

The following examples work for data frames as well as functions, and create roxygen skeletons for them. You'll naturally need to modify some fields manually, as indicated by the capital letters:

> set.seed(1); dat <- data.frame(first = LETTERS[1:10], second = rnorm(10), third = 1:10)
> fun <- function(x, y) { x + y }
> sinew::makeOxygen(dat)
#' @title DATASET_TITLE
#' @description DATASET_DESCRIPTION
#' @format A data frame with 10 rows and 3 variables:
#' \describe{
#'   \item{\code{first}}{character COLUMN_DESCRIPTION}
#'   \item{\code{second}}{double COLUMN_DESCRIPTION}
#'   \item{\code{third}}{integer COLUMN_DESCRIPTION} 
#'}
#' @details DETAILS
"dat"
> sinew::makeOxygen(fun)
#' @title FUNCTION_TITLE
#' @description FUNCTION_DESCRIPTION
#' @param x PARAM_DESCRIPTION
#' @param y PARAM_DESCRIPTION
#' @return OUTPUT_DESCRIPTION
#' @details DETAILS
#' @examples 
#' \dontrun{
#' if(interactive()){
#'  #EXAMPLE1
#'  }
#' }
#' @rdname fun
#' @export

As you can see, sinew produces #' lines that roxygen2 can turn into .Rd files once they are placed at the appropriate locations in the .R files. The package also has functions that can insert these lines at the correct locations automatically.
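If installing sinew is not an option, a similar skeleton can be put together in base R. The following is only a sketch: `make_data_skeleton` is a hypothetical helper, not a sinew function, and it labels each column with the first element of `class()` (so a double column shows as `numeric`, unlike the sinew output above).

```r
# Hypothetical helper (not part of sinew): build a roxygen2 skeleton
# for a data frame as a character vector of lines.
make_data_skeleton <- function(df, name) {
  stopifnot(is.data.frame(df), is.character(name), length(name) == 1)
  # Label every column with the first element of its class attribute.
  types <- vapply(df, function(col) class(col)[1], character(1))
  items <- sprintf("#'   \\item{\\code{%s}}{%s COLUMN_DESCRIPTION}",
                   names(df), types)
  c(
    "#' @title DATASET_TITLE",
    "#' @description DATASET_DESCRIPTION",
    sprintf("#' @format A data frame with %d rows and %d variables:",
            nrow(df), ncol(df)),
    "#' \\describe{",
    items,
    "#' }",
    paste0('"', name, '"')  # the quoted dataset name that roxygen2 documents
  )
}

dat <- data.frame(first = LETTERS[1:3], second = c(0.5, 1.5, 2.5), third = 1:3,
                  stringsAsFactors = FALSE)
skeleton <- make_data_skeleton(dat, "dat")
cat(skeleton, sep = "\n")
```

The placeholders in capitals still need to be filled in by hand, but the column names and types come straight from the data.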

Teemu Daniel Laajala

Start with a list of the frames' names, then something like this is a quick hack:

frames <- c("iris","mtcars")
unlist(sapply(frames, function(d) c(paste("#'", d), "#' @format data.frame",
                                    gsub("^","#'",capture.output(str(get(d)))),
                                    dQuote(d)),
              simplify=FALSE), use.names=FALSE)
#  [1] "#' iris"                                                                                    
#  [2] "#' @format data.frame"                                                                      
#  [3] "#''data.frame':\t150 obs. of  5 variables:"                                                  
#  [4] "#' $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ..."                            
#  [5] "#' $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ..."                          
#  [6] "#' $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ..."                        
#  [7] "#' $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ..."                        
#  [8] "#' $ Species     : Factor w/ 3 levels \"setosa\",\"versicolor\",..: 1 1 1 1 1 1 1 1 1 1 ..."
#  [9] "\"iris\""                                                                                   
# [10] "#' mtcars"                                                                                  
# [11] "#' @format data.frame"                                                                      
# [12] "#''data.frame':\t32 obs. of  11 variables:"                                                  
# [13] "#' $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ..."                          
# [14] "#' $ cyl : num  6 6 4 6 8 6 8 4 4 6 ..."                                                    
# [15] "#' $ disp: num  160 160 108 258 360 ..."                                                    
# [16] "#' $ hp  : num  110 110 93 110 175 105 245 62 95 123 ..."                                   
# [17] "#' $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ..."                        
# [18] "#' $ wt  : num  2.62 2.88 2.32 3.21 3.44 ..."                                               
# [19] "#' $ qsec: num  16.5 17 18.6 19.4 17 ..."                                                   
# [20] "#' $ vs  : num  0 0 1 1 0 1 0 1 1 1 ..."                                                    
# [21] "#' $ am  : num  1 1 1 0 0 0 0 0 0 0 ..."                                                    
# [22] "#' $ gear: num  4 4 4 3 3 3 3 4 4 4 ..."                                                    
# [23] "#' $ carb: num  4 4 1 1 2 1 4 2 2 4 ..."                                                    
# [24] "\"mtcars\""                                                                                 

Then you can cat it out to a file and have most of what you need.
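Writing the result to a file could look like the sketch below. It is a slight variant of the code above: it uses `paste()` so there is a space after `#'`, it builds the quoted name with `paste0()` so the quotes are always plain ASCII, and `out_file` is a hypothetical path in the session's temp directory.

```r
# Build the comment lines for each named data frame, then write them out.
frames <- c("iris", "mtcars")
doc_lines <- unlist(lapply(frames, function(d) {
  c(paste("#'", d),
    "#' @format data.frame",
    paste("#'", capture.output(str(get(d)))),
    paste0('"', d, '"'),
    "")  # blank line between datasets
}), use.names = FALSE)

# Hypothetical output location; in a package this would be a file under R/.
out_file <- file.path(tempdir(), "data.R")
writeLines(doc_lines, out_file)
```

Regenerating the file after a column is renamed keeps the names in sync, although any hand-written descriptions would have to be merged back in.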

r2evans
  • Thanks @r2evans! This worked great. I made some modifications which I'll turn into an answer below. – Breck Jun 30 '18 at 01:37

I took r2evans's code from his answer above and turned it into a function.

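makeDoc <- function(dataFrame, title = deparse(substitute(dataFrame))) {
  # Prefix each line of str() output with "#' " so roxygen2 reads it as a comment
  body <- paste("#'", capture.output(str(dataFrame)))
  # Use plain double quotes: dQuote() may emit curly quotes, which are not valid R
  output <- c(paste("#'", title), "#' @format data.frame", body, paste0('"', title, '"'))
  cat(output, sep = "\n")
}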
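One thing to watch when a function like this is called in a loop: the default title comes from the argument expression, so a call such as `get(nm)` would yield the title `get(nm)` rather than the dataset name. A sketch of passing the title explicitly, using a hypothetical variant `make_doc_lines` that returns the lines instead of printing them:

```r
# Hypothetical variant of the function above that returns the lines.
make_doc_lines <- function(dataFrame, title = deparse(substitute(dataFrame))) {
  c(paste("#'", title),
    "#' @format data.frame",
    paste("#'", capture.output(str(dataFrame))),
    paste0('"', title, '"'))
}

# Called directly, the title is taken from the argument expression.
direct <- make_doc_lines(iris)

# Inside a loop over names, pass the title explicitly.
frames <- c("iris", "mtcars")
looped <- lapply(frames, function(nm) make_doc_lines(get(nm), title = nm))
```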
Breck