2

I would like to turn data from an HTTP request into a data frame.

The data returned via httr has the following format, which includes metadata such as column headers and types.

I'd like to convert this into a corresponding data frame, with columns based on data$columnHeaders and values parsed according to a defined set of rules (based on data$columnHeaders$dataType or data$columnHeaders$name).

This seems like a problem that would already have been solved, but I can't find a proven, fast, and efficient solution.

The dput() result of data:

data <- structure(list(columnHeaders = list(structure(list(name = "ga:date", 
    columnType = "DIMENSION", dataType = "STRING"), .Names = c("name", 
"columnType", "dataType")), structure(list(name = "ga:visitors", 
    columnType = "METRIC", dataType = "INTEGER"), .Names = c("name", 
"columnType", "dataType"))), rows = list(c("20120912", "26121"
), c("20120913", "32003"), c("20120914", "38348"), c("20120915", 
"26679"), c("20120916", "26249"), c("20120917", "29867"), c("20120918", 
"31572"), c("20120919", "27576"), c("20120920", "26730"), c("20120921", 
"28598"), c("20120922", "25319"), c("20120923", "27428"), c("20120924", 
"33255"), c("20120925", "32071"), c("20120926", "28272"))), .Names = c("columnHeaders", 
"rows"))
Lukas Grebe
  • 2
    Can you replace your example data above with the equivalent from `dput(yourData)`? That will let us understand how your data is structured. Other good tips on making a great question [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Without being able to reproduce your data, I'd recommend something like `newData <- do.call("rbind", yourData$rows)` followed by `names(newData) <- lapply(yourData$colnames, "[")` – Chase Oct 01 '12 at 13:39
  • +1 for @Chase and the point of getting some actual data. – TARehman Oct 01 '12 at 14:36
  • Thank you for the Link @Chase. I've modified the Question as suggested. – Lukas Grebe Oct 01 '12 at 16:11
  • @LukasGrebe - thanks for the updated question. I gave you one solution I think should scale pretty well. – Chase Oct 01 '12 at 17:51

2 Answers

1

Thanks for the reproducible example. My suggested answer in the comments is more or less what I came up with here:

out <- as.data.frame(do.call("rbind", data[["rows"]]))
names(out) <- make.names(sapply(data[["columnHeaders"]], "[[", 1))


str(out)
#-----
'data.frame':   15 obs. of  2 variables:
 $ ga.date    : Factor w/ 15 levels "20120912","20120913",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ ga.visitors: Factor w/ 15 levels "25319","26121",..: 2 12 15 4 3 10 11 7 5 9 ...
head(out,3)
#-----
   ga.date ga.visitors
1 20120912       26121
2 20120913       32003
3 20120914       38348

Note that I used make.names() to ensure that the column names are valid R names. Otherwise you end up with a colon in your column names, which will be problematic downstream.
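As a small illustration of what make.names() does to the header names in this question, here is a minimal sketch. The variable names `raw_names` and `clean_names` are only illustrative and do not appear in the original code.

```r
# Raw header names as they appear in data$columnHeaders (copied from the question)
raw_names <- c("ga:date", "ga:visitors")

# make.names() replaces characters that are not allowed in syntactic R names with "."
clean_names <- make.names(raw_names)
print(clean_names)

# With unique = TRUE, repeated names are additionally made distinct
deduped <- make.names(c("ga:date", "ga:date"), unique = TRUE)
print(deduped)
```

With syntactic names, columns can be accessed as `out$ga.date` without backticks.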

I'm also going to read between the lines here and assume that your first column is supposed to represent a date and the second a number. You'll notice that R currently thinks both of these are factor variables. Here's how I'd go about turning them into the appropriate data types:

#Date column
out$ga.date <- as.Date(out$ga.date, format = "%Y%m%d")
#Numeric column
out$ga.visitors <- as.numeric(as.character(out$ga.visitors))

str(out)
#-----
'data.frame':   15 obs. of  2 variables:
 $ ga.date    : Date, format: "2012-09-12" "2012-09-13" "2012-09-14" ...
 $ ga.visitors: num  26121 32003 38348 26679 26249 ...
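The as.character() step in the numeric conversion matters when the column is a factor. The sketch below, using a few made-up values in a hypothetical vector `f`, shows that calling as.numeric() directly on a factor returns the internal level codes rather than the labels. Note also that from R 4.0.0 onward the default is stringsAsFactors = FALSE, so the columns would arrive as character instead of factor; the as.character() call is then redundant but harmless.

```r
# A factor whose labels look like numbers (illustrative values)
f <- factor(c("26121", "32003", "25319"))

# as.numeric() on a factor returns the integer level codes, not the labels
codes <- as.numeric(f)

# Going through as.character() first recovers the actual values
values <- as.numeric(as.character(f))

print(codes)
print(values)
```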

Now I think you've got something useful to do some analysis on. See ?as.Date and ?strptime for details on formatting date and date/time objects.
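Since the question asks for conversion driven by data$columnHeaders, the manual steps above could be wrapped into a rule-based helper. The following is only a sketch under stated assumptions: `convert_column()` and `response_to_df()` are hypothetical names, and the two rules (parse "ga:date" as a date, parse dataType "INTEGER" as integer) are assumptions based on the sample data, not a complete mapping.

```r
# Hypothetical helper: convert one character column using the header metadata.
# The rules here are assumptions derived from the sample data in the question.
convert_column <- function(x, name, data_type) {
  if (name == "ga:date") {
    as.Date(x, format = "%Y%m%d")
  } else if (data_type == "INTEGER") {
    as.integer(x)
  } else {
    x  # leave anything else as character
  }
}

# Hypothetical helper: build a typed data frame from a list that has
# `columnHeaders` and `rows` elements shaped like the question's data.
response_to_df <- function(data) {
  headers <- data[["columnHeaders"]]
  mat <- do.call("rbind", data[["rows"]])  # character matrix, one row per record
  cols <- lapply(seq_along(headers), function(i) {
    convert_column(mat[, i], headers[[i]][["name"]], headers[[i]][["dataType"]])
  })
  names(cols) <- make.names(vapply(headers, function(h) h[["name"]], character(1)))
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# A three-row subset of the question's data, rebuilt by hand for the example
data <- list(
  columnHeaders = list(
    list(name = "ga:date", columnType = "DIMENSION", dataType = "STRING"),
    list(name = "ga:visitors", columnType = "METRIC", dataType = "INTEGER")
  ),
  rows = list(c("20120912", "26121"), c("20120913", "32003"), c("20120914", "38348"))
)

out <- response_to_df(data)
str(out)
```

Because each column is converted once as a whole vector, this should scale about as well as the manual version. Other names or data types would need their own branches in `convert_column()`.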

Chase
0

I tried to (a) replicate your data, (b) convert the replicated data to a data frame.

#(a) Replicating data 
a<-c("20120912", "26121")
b<-c("20120913", "32003") 
c<-c("20120914", "38348")
data<-rbind(a,b,c)
colnames(data)<-c("date","visitors")

#(b) Converting to data frame
str(data) #chr [1:3, 1:2]
data<-data.frame(data)
str(data) #'data.frame':   3 obs. of  2 variables

Does this answer your question or did I understand you incorrectly? Good luck!

Tyler
  • Thanks for the quick reply. Step a unfortunately does not represent the data in the form I have it. – Lukas Grebe Oct 01 '12 at 14:39
  • 1
    @LukasGrebe - it's really incumbent on you to provide the structure of your data so that we can reproduce it. Tyler's answer is perfectly valid, just not ideal for your solution. Please heed my advice in my other comment about `dput()` and give us something we can work with. – Chase Oct 01 '12 at 15:41
  • Sorry. I've modified the Question as @Chase suggested. – Lukas Grebe Oct 01 '12 at 16:11
  • @LukasGrebe if you use this code: `data<-data.frame(data)`, does this answer your question? – Tyler Oct 01 '12 at 16:22