6

Flex your RCurl/XML muscle. Shortest code wins. Parse into R: http://pastebin.com/CDzYXNbG

Data should be:

structure(list(Treatment = structure(c(2L, 2L, 1L, 1L), .Label = c("C", 
"T"), class = "factor"), Gender = c("M", "F", "M", "F"), Response = c(56L, 
58L, 6L, 63L)), .Names = c("Treatment", "Gender", "Response"), row.names = c(NA, 
-4L), class = "data.frame")
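For anyone reading along, here is a minimal sketch of what "parse into R" means once the text is in hand: the pasted `structure(...)` call is ordinary R source, so evaluating it should rebuild the data frame. The variable names `txt` and `df` are only illustrative, and the string is inlined so that no network access is needed.

```r
# The dput() output from the question, held as a plain string
txt <- 'structure(list(Treatment = structure(c(2L, 2L, 1L, 1L), .Label = c("C",
"T"), class = "factor"), Gender = c("M", "F", "M", "F"), Response = c(56L,
58L, 6L, 63L)), .Names = c("Treatment", "Gender", "Response"), row.names = c(NA,
-4L), class = "data.frame")'

# parse() turns the text into an expression; eval() runs it
df <- eval(parse(text = txt))
str(df)
```

Every answer in this thread ends in some variation of this `eval(parse(...))` step; the differences are in how the text is fetched.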

Good luck!

Note: dummy data kindly provided by this question: Adding space between bars in ggplot2

Brandon Bertelsen

4 Answers

5

Same idea as kohske's, but slightly shorter and clearer, I think:

library(XML)
eval(parse(text=gsub('\r\n','\n',xpathApply(htmlTreeParse('http://pastebin.com/CDzYXNbG',useInternal=T),'//textarea',xmlValue))))
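An un-golfed sketch of the same steps, for readers who want to see them one at a time. It assumes the XML package is installed and uses a small inline HTML string (a hypothetical stand-in for the pastebin page, with made-up content) so that it runs without network access. `htmlParse()` is used here in place of `htmlTreeParse(..., useInternal=T)`.

```r
library(XML)  # assumption: the XML package is installed

# Hypothetical stand-in for the downloaded page
html <- "<html><body><textarea id=\"paste_code\">c(56L, 58L,\r\n6L, 63L)</textarea></body></html>"

# Parse the HTML into an internal document that XPath can query
doc <- htmlParse(html, asText = TRUE)

# Pull the text content out of every <textarea> element
code <- xpathSApply(doc, "//textarea", xmlValue)

# Normalise Windows line endings before handing the text to the R parser
code <- gsub("\r\n", "\n", code, fixed = TRUE)

result <- eval(parse(text = code))
```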
cameron.bracken
4

RCurl is not necessary for my code, since the XML package's parsers accept a URL as the file argument.

Please execute

library(XML)

before the examples below.

Code 1 is a one-liner:

eval(parse(text=htmlTreeParse("http://pastebin.com/CDzYXNbG",handlers=(function(){qt <- NULL;list(textarea=function(node,...){qt<<-gsub("[\r\n]", "", unclass(node$children$text)$value);node},.qt=function()qt)})())$.qt()))

Code 2 is shorter, but I don't think it is the shortest possible.

htmlTreeParse("http://pastebin.com/CDzYXNbG",h=list(textarea=function(n)z<<-gsub("[\r\n]","",unclass(n$c$t)$v)));eval(parse(text=z))

As this question is a kind of game, I'll leave decrypting this code to you.
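As a partial hint toward decrypting it, here is a base-R sketch of two tricks the code above seems to lean on: a closure that stores the captured text via `<<-`, and partial name matching with `$` (which is what makes `n$c$t` and `$v` work). The helper `make_handlers` and the `fake_node` object are hypothetical stand-ins built from plain lists, not real XML nodes, so this runs without the XML package.

```r
# Hypothetical helper: returns handlers that share the private variable `qt`
make_handlers <- function() {
  qt <- NULL
  list(
    textarea = function(node, ...) {
      # `<<-` assigns to `qt` in the enclosing function, not to a local copy
      qt <<- gsub("[\r\n]", "", node$children$text$value)
      node
    },
    .qt = function() qt  # accessor for the captured text
  )
}

# Stand-in for a parsed <textarea> node (plain nested lists, an assumption)
fake_node <- list(children = list(text = list(value = "1 +\r\n2")))

h <- make_handlers()
invisible(h$textarea(fake_node))   # what the parser would do for each <textarea>
captured <- h$.qt()
result <- eval(parse(text = captured))

# `$` on a list matches names partially: children -> c, text -> t, value -> v
short <- fake_node$c$t$v
```

The `h=` argument in Code 2 appears to rely on the same kind of partial matching, applied to argument names rather than list names.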



UPDATED

After looking at @JD Long's excellent solution, here is the shortest code so far:

eval(parse(file(sub("m/","m/raw.php?i=","http://pastebin.com/CDzYXNbG"))))

Now the question is how to build the desired URL string in the shortest code ;-p

Updated again. This is shorter by a few characters.

source(sub("m/","m/raw.php?i=","http://pastebin.com/CDzYXNbG"))$va
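To see why `source(...)$va` can stand in for `eval(parse(file(...)))`, here is a small sketch that uses a temporary file instead of the pastebin URL (the file contents are made up for illustration). `source()` returns a list whose `value` element holds the result of the last expression, and `$va` reaches it through partial matching.

```r
# Write a tiny script to a temporary file (stand-in for the raw paste)
tmp <- tempfile(fileext = ".R")
writeLines("c(56L, 58L, 6L, 63L)", tmp)

# Variant 1: parse the file through a connection, then evaluate
via_parse <- eval(parse(file(tmp)))

# Variant 2: source() evaluates the file; `$va` partially matches `$value`
via_source <- source(tmp)$va

unlink(tmp)  # clean up the temporary file
```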
kohske
4

You guys are making this way too hard:

eval(parse(file("http://pastebin.com/raw.php?i=CDzYXNbG")))

OK, so I cheated. But starting from the same URL you could get the same end:

eval(parse(file(paste("http://pastebin.com/raw.php?i=", strsplit("http://pastebin.com/CDzYXNbG", "/")[[1]][4], sep=""))))

Which still puts me in the lead :)
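For comparison, a sketch of three ways to derive the raw URL from the page URL using only string functions, so nothing is downloaded. The first two are taken from the answers in this thread; the `basename()` variant is an extra suggestion, not something any answer used.

```r
page_url <- "http://pastebin.com/CDzYXNbG"

# kohske's approach: splice "raw.php?i=" in after the first "m/"
raw_sub <- sub("m/", "m/raw.php?i=", page_url)

# JD Long's approach: the paste ID is the 4th piece when splitting on "/"
id <- strsplit(page_url, "/")[[1]][4]
raw_split <- paste("http://pastebin.com/raw.php?i=", id, sep = "")

# Suggested alternative: basename() returns the part after the last "/"
raw_base <- paste0("http://pastebin.com/raw.php?i=", basename(page_url))

print(raw_sub)
```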

JD Long
1

I'm not perfectly sure what you are trying to achieve here, but maybe this does what you ask for (not using any fancy packages, just regex):

# Read the whole page into a single string
fullText<-paste(readLines("http://pastebin.com/CDzYXNbG"), collapse="\n")
# Capture everything inside the <textarea id="paste_code"> element
regexp<-"<textarea[^>]*id=\"paste_code\"[^>]*>(.*)</textarea>"
txtarpos<-regexpr(regexp, fullText)
txtarstrt<-txtarpos[1]
txtarlen<-attr(txtarpos, "match.length")
# The match ends at start + length - 1
txtarstp<-txtarstrt+txtarlen-1
txtarpart<-substr(fullText, txtarstrt, txtarstp)
# Keep only the captured group, unescape &quot; and drop line breaks
retval<-gsub("\n", "", gsub("&quot;", "\"", gsub(regexp, "\\1", txtarpart), fixed=TRUE), fixed=TRUE)
cat(retval)

I'm also pretty sure this can be improved upon somewhat, but it does the job I think you asked for. Even if it doesn't: thanks for making me want to refresh my regex basics!
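One possible tightening, sketched on an inline string so it runs offline: `regmatches()` can replace the manual start/length arithmetic, and the result can be evaluated rather than printed. This variant swaps in `perl=TRUE` with the `(?s)` flag and a non-greedy capture, which is a change from the pattern above; the `page` string is a hypothetical stand-in for the downloaded HTML.

```r
# Hypothetical stand-in for the page source, with &quot; escapes as in real HTML
page <- paste(c("<html><body>",
                "<textarea id=\"paste_code\" rows=\"2\">c(&quot;M&quot;,",
                "&quot;F&quot;)</textarea>",
                "</body></html>"), collapse = "\n")

# (?s) lets "." match newlines under PCRE; (.*?) stops at the first </textarea>
pattern <- "(?s)<textarea[^>]*id=\"paste_code\"[^>]*>(.*?)</textarea>"

m <- regexpr(pattern, page, perl = TRUE)
matched <- regmatches(page, m)                    # the whole <textarea> element
inner <- sub(pattern, "\\1", matched, perl = TRUE) # keep only the captured group

code <- gsub("&quot;", "\"", inner, fixed = TRUE)  # unescape the quotes
code <- gsub("[\r\n]", "", code)                   # drop line breaks

result <- eval(parse(text = code))
```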

Nick Sabbe
  • `Error: unexpected input in "retval<-gsub("\n", "", gsub(""", "\"", gsub(regexp, "\\1", txtarpart), fixed=TRUE), fixed=TRUE)\"` Interesting use of pure regex! – Brandon Bertelsen May 22 '11 at 17:53