I'm pulling data from the Google Analytics API, processing it locally, then knitting an .Rmd file into text, tables, and visualisations. As part of the knitting/tabling process, I'm doing some basic formatting (e.g. rounding off percentages and adding % signs).
For this question, I have toPercent(), which works fine if used like this:
toPercent <- function(percentData){
  percentData <- round(percentData, 2)  # round the argument, not the unrelated name `data`
  percentData <- mapply(toString, percentData)
  percentData <- paste(percentData, "%", sep="")
}
devices <- toPercent(devices$avgSessionDuration)
However, manually applying the function to every table is time-intensive. I created percentCheck() to look for columns that match my criteria:
percentCheck <- function(data){
data[,grep("rate|percent", names(data), ignore.case=TRUE)] <- toPercent(data[,grep("rate|percent", names(data), ignore.case=TRUE)])
}
devices <- percentCheck(devices)
But I know this doesn't work on a dataset with multiple matches (e.g. a column for exitRate and a column for bounceRate).
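A small sketch of what seems to go wrong, offered as an illustration rather than a definitive diagnosis: the pattern matches two columns in the sample data, and mapply(toString, ...) on a data frame runs once per column, so each column collapses into a single comma-separated string. The devices frame is rebuilt from the dput() output at the end of this post; matched and collapsed are names made up for this example.

```r
# Sample data, copied from the dput() output in this post
devices <- data.frame(
  deviceCategory = c("desktop", "mobile", "tablet"),
  sessions = c(817, 38, 1540),
  avgSessionDuration = c(153.424888853179, 101.942758538617, 110.270988142292),
  bounceRate = c(39.0192297391397, 50.2915625371891, 50.1343873517787),
  exitRate = c(25.3257456030279, 32.0236280487805, 29.0991902834008),
  stringsAsFactors = FALSE
)

# Which column names match the pattern? (hypothetical helper variable)
matched <- grep("rate|percent", names(devices), ignore.case = TRUE, value = TRUE)
print(matched)

# mapply() iterates over the columns of a data frame, so toString() sees a
# whole column at a time and returns one string per column, not per cell
collapsed <- mapply(toString, round(devices[, matched], 2))
print(length(collapsed))
```

With one matched column, devices[, matched] is a plain vector and the element-wise behaviour returns, which would explain why the single-column case appears to work.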
Q1: Have I written toPercent() in a way that won't return multiple values to one entry?
Q2: How can I structure percentCheck() to map over the dataset and only apply toPercent() if the column name includes a given string?
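To make Q2 concrete, here is a minimal sketch of the shape I have in mind, assuming toPercent() can be made vectorised with paste0() and that each matching column can be handled separately with lapply(). The pattern argument and the cut-down devices frame (values taken from the dput() output in this post) are assumptions added for the illustration, not a tested solution.

```r
# Vectorised formatter: one output string per input value
toPercent <- function(percentData) {
  paste0(round(percentData, 2), "%")
}

# Apply toPercent() only to columns whose names match `pattern`
# (`pattern` is a hypothetical argument added for this sketch)
percentCheck <- function(data, pattern = "rate|percent") {
  cols <- grep(pattern, names(data), ignore.case = TRUE)
  if (length(cols) == 0) {
    return(data)  # nothing to format
  }
  # lapply() visits each matched column on its own, so values never mix
  data[cols] <- lapply(data[cols], toPercent)
  data  # return the whole data frame, not just the assigned value
}

# Cut-down copy of the sample data
devices <- data.frame(
  deviceCategory = c("desktop", "mobile", "tablet"),
  sessions = c(817, 38, 1540),
  bounceRate = c(39.0192297391397, 50.2915625371891, 50.1343873517787),
  exitRate = c(25.3257456030279, 32.0236280487805, 29.0991902834008),
  stringsAsFactors = FALSE
)

formatted <- percentCheck(devices)
print(formatted)
```

Note that round() followed by paste0() drops trailing zeros (29.0991902834008 becomes "29.1%"); sprintf("%.2f%%", x) would be an alternative if two decimals should always be shown.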
Version/Packages:
R version 3.1.1 (2014-07-10) -- "Sock it to Me"
library(rga)
library(knitr)
library(stargazer)
Data:
> dput(devices)
structure(list(deviceCategory = c("desktop", "mobile", "tablet"
), sessions = c(817, 38, 1540), avgSessionDuration = c(153.424888853179,
101.942758538617, 110.270988142292), bounceRate = c(39.0192297391397,
50.2915625371891, 50.1343873517787), exitRate = c(25.3257456030279,
32.0236280487805, 29.0991902834008)), .Names = c("deviceCategory",
"sessions", "avgSessionDuration", "bounceRate", "exitRate"), row.names = c(NA,
-3L), class = "data.frame")