
I'm running this function:

require(XML)
require(plyr)


getKeyStats_xpath <- function(symbol) {
  yahoo.URL <- "http://finance.yahoo.com/q/ks?s="
  html_text <- htmlParse(paste(yahoo.URL, symbol, sep = ""), encoding="UTF-8")

  #search for <td> nodes anywhere that have class 'yfnc_tablehead1'
  nodes <- getNodeSet(html_text, "/*//td[@class='yfnc_tablehead1']")

  if(length(nodes) > 0 ) {
    measures <- sapply(nodes, xmlValue)

    #Clean up the column name
    measures <- gsub(" *[0-9]*:", "", gsub(" \\(.*?\\)[0-9]*:","", measures))   

    #Remove dups
    dups <- which(duplicated(measures))
    #print(dups) 
    for(i in 1:length(dups)) 
      measures[dups[i]] = paste(measures[dups[i]], i, sep=" ")

    #use siblings function to get value
    values <- sapply(nodes, function(x)  xmlValue(getSibling(x)))

    df <- data.frame(t(values))
    colnames(df) <- measures
    return(df)
  } else {
    break
  }
}

As long as the page exists, it works fine. However, if one of my tickers does NOT have any data on that URL, it throws an error:

Error in FUN(X[[3L]], ...) : no loop for break/next, jumping to top level 

I added a trace too, and things break down on ticker number 3.

tickers <- c("QLTI",
"RARE",
"RCPT",
"RDUS",
"REGN",
"RGEN",
"RGLS")

tryCatch({
stats <- ldply(tickers, getKeyStats_xpath)
}, finally={})

I'd like to call the function like this:

stats <- ldply(tickers, getKeyStats_xpath)
rownames(stats) <- tickers
write.csv(t(stats), "FinancialStats_updated.csv",row.names=TRUE)

Basically, if a ticker has no data, I want to skip it.

Can someone please help me get this working?

jaimedash
  • Write a wrapper for `getKeyStats_xpath` that encloses it in `tryCatch`. You could do this within `ldply` with an anonymous function, for example `ldply(tickers, function (t) tryCatch(getKeyStats_xpath(t), finally={}))` – jaimedash Dec 15 '15 at 21:30
  • Possible duplicate of [How to write trycatch in R](http://stackoverflow.com/questions/12193779/how-to-write-trycatch-in-r) – nrussell Dec 15 '15 at 21:30

1 Answer


Expanding on my comment. The issue here is that you've enclosed the entire command `stats <- ldply(tickers, getKeyStats_xpath)` within a single `tryCatch`. This means R tries the whole `ldply` call as one unit, so an error on any one ticker aborts the run for all of them.

Instead, what you want is to try each ticker.

To do this, write a wrapper for `getKeyStats_xpath` that encloses it in `tryCatch`. You could do this within `ldply` with an anonymous function, for example `ldply(tickers, function (t) tryCatch(getKeyStats_xpath(t), finally={}))`. Note that `finally` executes regardless of exit condition, so `finally={}` executes nothing. (See Advanced R or How to write trycatch in R from the r-faq for more.)

On an error, `tryCatch` calls the function provided in the `error` argument. So as is, this code still won't help: no `error` handler is supplied, so the error goes unhandled (thanks to rawr for pointing this out earlier). It is also easier to inspect the output if you use `llply` instead of `ldply`, since it returns a list with one element per ticker.
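To make the difference between `finally` and `error` concrete, here is a minimal sketch in base R. The function `risky` is a hypothetical stand-in for `getKeyStats_xpath` and is not part of the original code; it just fails on one input.

```r
# Hypothetical stand-in for a function that fails on some inputs.
risky <- function(x) {
  if (x == "bad") stop("no data for ", x)
  toupper(x)
}

# Only `finally`: the cleanup expression runs, but the error still propagates.
# try() is used here purely so the script keeps running and we can inspect it.
res_finally <- try(
  tryCatch(risky("bad"), finally = message("finally ran")),
  silent = TRUE
)
inherits(res_finally, "try-error")   # TRUE: the error was not handled

# With `error=`: the handler is called and its return value becomes the result.
res_error <- tryCatch(risky("bad"), error = function(e) NULL)
is.null(res_error)                   # TRUE

# Inputs that succeed are returned unchanged by the wrapper.
res_ok <- tryCatch(risky("good"), error = function(e) NULL)
res_ok                               # "GOOD"
```

Returning `NULL` from the handler is one convenient choice, because a `NULL` entry is easy to detect and drop afterwards.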

So a complete answer using this approach, and with informative error handling, is below.

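stats <- llply(tickers, 
    function(t) tryCatch(getKeyStats_xpath(t), 
        error=function(x) {
            # report the failing ticker; cat() returns NULL, so that ticker's list entry is NULL
            cat("error occurred for:\n", t, "\n...skipping this ticker\n")
        }
    )
)
names(stats) <- tickers
# tickers that failed show up with length 0
lapply(stats, length)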
#<snip>
#$RCPT
#[1] 0
# </snip>

As of now, this works for me, returning data for all tickers except RCPT, the one shown in the output above.
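To get from that list to the single table the question wants to write out, the `NULL` entries can be dropped before combining. The following is a sketch only, in base R: `get_stats_stub` is a hypothetical replacement for `getKeyStats_xpath` so the example runs without network access, and its `Beta` column is made-up illustration data.

```r
# Hypothetical stub standing in for getKeyStats_xpath(); no network access.
get_stats_stub <- function(symbol) {
  if (symbol == "RCPT") stop("no key statistics for ", symbol)
  data.frame(Symbol = symbol, Beta = nchar(symbol) / 4,
             stringsAsFactors = FALSE)
}

tickers <- c("QLTI", "RARE", "RCPT", "RDUS")

# One list element per ticker; a failure is reported and becomes NULL.
stats_list <- lapply(tickers, function(sym) {
  tryCatch(get_stats_stub(sym), error = function(e) {
    message("skipping ", sym, ": ", conditionMessage(e))
    NULL
  })
})
names(stats_list) <- tickers

# Drop the NULL entries, then stack the remaining one-row data frames.
kept  <- Filter(Negate(is.null), stats_list)
stats <- do.call(rbind, kept)
rownames(stats) <- names(kept)   # label rows by the tickers that survived

rownames(stats)                  # "QLTI" "RARE" "RDUS"
```

`rbind` assumes every data frame has the same columns. If the scraped pages differ, `plyr::rbind.fill` is one way to pad the missing ones. Separately, replacing `break` with `return(NULL)` in `getKeyStats_xpath` would likely avoid the error in the first place, since `break` is only valid inside a loop. That is a suggestion, not part of the answer above.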

Alan
jaimedash
  • Sorry, but I guess I'm missing something here. I tried this: `stats <- ldply(tickers, function(t) tryCatch(getKeyStats_xpath(t), error=function(x) {cat("error occurred\n")}))`. I tried this too: `stats <- ldply(tickers, function(t) tryCatch(getKeyStats_xpath(t)))`. Both attempts ended the same way: `Error in getKeyStats_xpath(t) : no loop for break/next, jumping to top level` –  Dec 15 '15 at 22:17
  • I tried this: `ldply(tickers, function (t) tryCatch(getKeyStats_xpath(t), finally={}))`. I keep getting this error: `Error: could not find function "ldply"`. The concept definitely makes sense to me, and I've used TryCatch many times in C# & SQL, but I can't seem to get this working in R. I gool –  Dec 15 '15 at 22:37
  • I guess that's the problem. Weird thing is, I just installed it, and got confirmation that it installed fine. But I invoke it and get an error. `require(plyr)` gives: Loading required package: plyr. Warning message: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, : there is no package called ‘plyr’ –  Dec 15 '15 at 23:16
  • Sorry for dragging this on so long everyone. It seems like there was a problem with 'plyr' yesterday, but it's working today. I just tried this: `stats <- ldply(tickers, function(t) tryCatch(getKeyStats_xpath(t), error=function(x) {cat("error occurred\n")}))`. I get 'error occurred'. I can see why, but I don't know how to handle this exception. Basically, if the ticker doesn't have any 'key statistics', I want to skip it and continue reading other tickers in the array. There must be an easy way to do this, but I don't know what it is. –  Dec 16 '15 at 15:25
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/98150/discussion-between-ryguy72-and-jaimedash). –  Dec 16 '15 at 17:17