I'm pulling large sets of data for multiple sites and views from Google Analytics for processing in R. To streamline the process, I've added my most common queries to a function (so I only have to pass the profile ID and date range). Each query is stored as a local variable in the function, and is assigned to a dynamically-named global variable:

# R version 3.1.1
library(rga)
library(stargazer)

# I would add a dataset, but my data is locked down by client agreements and I don't currently have any test sites configured.
profiles <- ga$getProfiles()
website1 <- profiles[1,]
start <- "2013-01-01"
end <- "2013-12-31"


# profiles are objects containing all the ID's, accounts #, etc.; start and end specify date range as strings (e.g. "2014-01-01")
reporting <- function(profile, start, end){ 

    id <- profile[,1] #sets profile number from profile object

    #rga function for building and submitting query to API
    general <- ga$getData(id,
                          start.date = start,
                          end.date = end,
                          metrics = "ga:sessions")

    ... #additional queries, structured similarly to example above(e.g. countries, cities, etc.)

    #transforms name of profile object to string
    profileName <- deparse(substitute(profile))

    #appends "Data" to profile object name
    temp <- paste(profileName, "Data", sep="")

    #stores query results as list
    temp2 <- list(general,countries,cities,devices,sources,keywords,pages,events)

    #assigns list of query results and stores it globally
    assign(temp, temp2, envir=.GlobalEnv)
}

#call reporting function at head of report or relevant section
reporting(website1,start,end)
#returns list of data frames returned by the ga$getData(...), but within the list they are named "data.frame" instead of their original query name.

#generate simple summary table with stargazer package for display within the report
stargazer(website1[1])

I'm able to access these results through `website1Data[1]`, but I'm handing the data off to collaborators. Ideally, they should be able to access the data by name (e.g. `website1Data$countries`).

Is there an easier/better way to store these results, and to make accessing them easier from within an .Rmd report?

mattpolicastro
  • Any time someone uses the phrase "assigns to a global variable in R", it almost always means they don't know what is happening. This is _clearly_ not a complete example and we will not know how to help you until you post one. What is "profile ID" and what makes you think `deparse(substitute(profile))` executed at the top level of evaluation will mean anything? – IRTFM Aug 12 '14 at 03:18
  • hate to be the nay-sayer, but you do not want to do this the way you currently have it above – Ricardo Saporta Aug 12 '14 at 03:47
  • @BondedDust `deparse...` was a workaround I found elsewhere—I'm happy to hear alternatives. I've updated the code example and added some comments to outline some of my thinking. Your time and effort are greatly appreciated. – mattpolicastro Aug 12 '14 at 14:29
  • @BondedDust Also, for whatever it's worth: my original intent was to have multiple sets of data for several sites available for generating tables and other visualisations inside an .Rmd report. I'm definitely a novice, and arrived at the scheme above after some brief sketching. – mattpolicastro Aug 12 '14 at 14:32
  • There are some good candidates for today's prize for unhelpful comments... How's about pointing the OP at ["how to make a great R reproducible question"](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1) and *then* telling him he's doing it all wrong, but showing him how to fix it with a constructive answer? – Andy Clifton Aug 12 '14 at 14:43
  • Within a function `deparse(substitute(.))` does make sense. But you still fail to offer enough information to tell what package(s) are being used or a test website for experimentation. Even trying sos::findFn("getData") brings up too many hits to narrow it down. – IRTFM Aug 12 '14 at 14:44
  • @AndyClifton thanks for the link. I'll be sure to follow the examples in the future. – mattpolicastro Aug 12 '14 at 14:57
  • You can edit your question now to include an example of how this function would work, exactly what's not working well, and how you've tried to fix it. You could also edit your code so that the comments are on the lines above, not next to the code - this would help readability and might get you some answers. – Andy Clifton Aug 12 '14 at 15:01
  • This is still not a complete question or an example we can understand. How would I actually _use_ this function? I'm not seeing a call to `reporting()` anywhere in your example. And there must be an example within the documentation of the `rga` package you can leverage to help explain things to us. The nice thing about writing a clear question is often it helps you answer the question yourself. – Andy Clifton Aug 12 '14 at 15:17
  • @AndyClifton I've tried to flesh out the example a bit more. My questions are these: 1. Is the way I'm storing these results (a list of data frames, globally) overly problematic? 2. If not, is there a straightforward way to access one of those data frames by name, rather than by index position? – mattpolicastro Aug 12 '14 at 15:43

1 Answer

There's no real reason to use `deparse` inside the function just to assign a variable in the parent environment. If you have to call the `reporting()` function anyway, just have that function return a value and assign the result:

reporting <- function(profile, start, end){ 
    #... all the other code

    #return results
    list(general=general,countries=countries,cities=cities,
        devices=devices,sources=sources,keywords=keywords,
        pages=pages,events=events)
}

#store results
websiteResults <- reporting(website1,start,end)
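
As a minimal sketch of why naming the list elements matters: `list(general, countries)` builds a list with no names, so its elements can only be reached by position, while `list(general = general, countries = countries)` keeps the names and makes `$` access work. The two data frames below are made-up stand-ins for `ga$getData()` results, not real Analytics output.

```r
# Made-up stand-ins for two query results (hypothetical data, not from the API)
general   <- data.frame(sessions = c(120, 95))
countries <- data.frame(country  = c("Canada", "France"),
                        sessions = c(150, 65))

# Unnamed list, as in the original function
unnamed <- list(general, countries)

# Named list, as in the function above
named <- list(general = general, countries = countries)

names(unnamed)   # NULL: elements are reachable only by position
names(named)     # "general" "countries"

# Access by name; the [[ ]] form also works when the name is held in a variable
named$countries
named[["countries"]]
```

Collaborators can then write `websiteResults$countries` without needing to know the order in which the queries were stored.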
MrFlick
  • I added the `deparse/assign` chunk to cut down on user error (we've got several sites with extremely similar naming schemes). However, I tried this method and it's still returning the data frames without names. Could I use the `deparse(substitute())` trick similarly? – mattpolicastro Aug 12 '14 at 16:11
  • If you just wanted to name the data.frames, then name them in the list as I've done in my updated code. The `deparse/assign` combo is just bad style. It's not the way R functions work and would be very confusing to anyone with any R experience. – MrFlick Aug 12 '14 at 16:27
  • If `profiles` has a bunch of websites, you can easily map over it with `mapply(reporting, profiles[,1], MoreArgs=list(start,end))` (depending on the structure of the actual profiles object which isn't clear from your code). – MrFlick Aug 12 '14 at 16:27
  • `profiles` is a data frame of each profile available under the Google Analytics user being used for the API call; each row represents a profile and its ID, Account ID, display name, etc. I don't know `apply` or its brethren yet, so I'll be sure to look into them. Regardless, your suggestion is working like a charm. Thanks. – mattpolicastro Aug 12 '14 at 16:57
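
Given that `profiles` is described as one row per profile, a rough sketch of running the report for every site might look like the following. It uses `lapply` over row indices rather than the `mapply` call from the comment, so that `reporting()` still receives a whole row. Both the `profiles` data frame and the `reporting()` body here are hypothetical stand-ins (the real ones come from `ga$getProfiles()` and `ga$getData()`), and the `name` column is an assumption about how the display name is stored.

```r
# Hypothetical stand-in for the data frame returned by ga$getProfiles()
profiles <- data.frame(id   = c("111", "222"),
                       name = c("website1", "website2"),
                       stringsAsFactors = FALSE)

# Hypothetical stand-in for reporting(); the real version would call ga$getData()
reporting <- function(profile, start, end) {
    id <- profile[, 1]
    list(general = data.frame(id = id, start = start, end = end,
                              stringsAsFactors = FALSE))
}

start <- "2013-01-01"
end   <- "2013-12-31"

# Call reporting() once per row of profiles, collecting the results in a list
allResults <- lapply(seq_len(nrow(profiles)), function(i) {
    reporting(profiles[i, ], start, end)
})

# Name each element after its profile so sites can be looked up by name
names(allResults) <- profiles$name

allResults$website2$general
```

Keeping every site in one named list avoids creating a separate global variable per site, and the site name comes from the data rather than from `deparse(substitute())`.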