I have a function that I use to get financial data from the Wall Street Journal website. Basically, I want to make a copy of the data held in symData and give it the same name as symbol, so that the objects are in the workspace and can be reused for looking at other information. I don't want to keep them permanently, so creating temp files on the filesystem is not my favoured method.

The problem I have is that I can't figure out how to do it.

    library(httr)
    library(XML)
    library(data.table)
    getwsj.quotes <- function(symbol) 
    {
        myUrl <- sprintf("https://quotes.wsj.com/AU/XASX/%s/FINANCIALS", symbol)
        symbol.data <- GET(myUrl)   
        x <- content(symbol.data, as = 'text')
        wsj.tables <- sub('cr_dataTable cr_sub_capital', '\\1', x)
        symData <- readHTMLTable(wsj.tables)
        mytemp <- summary(symData)
        print(mytemp)
        d2e <- gsub('^.* ', '', names(symData[[8]]))
        my.out <- sprintf("%s has Debt to Equity Ratio of %s", symbol, d2e)
        print(my.out)
    }
    TickerList <- c("AMC", "ANZ")
    for (Ticker in TickerList)
    {   
        Ticker.Data <- lapply(Ticker, FUN = getwsj.quotes)
    }

The Ticker.Data output is:

    > Ticker.Data
    [[1]]
    [1] "ANZ has Debt to Equity Ratio of 357.41"

The output from mytemp <- summary(symData) has the following:

         Length Class      Mode
    NULL 12     data.frame list
    NULL  2     data.frame list
    ...

I tried various ways of doing it when I call the function, and all I ever get is the last symbol's data. I have searched for hours trying to find an answer but so far, no luck. I need to walk away for a few hours. Any information would be most helpful. Regards, Stephen

Stephen
  • I'm not really sure what the desired output is here. But in general you are going to be much better off putting everything into a list rather than creating a bunch of different variables in your global environment. Your for+lapply code seems a bit odd. Typically you skip the `for` loop and just do `Ticker.Data <- lapply(TickerList, FUN = getwsj.quotes)` to obtain your list of results. Your function just needs to return all the data you want to store (rather than just `print()` it). – MrFlick Feb 22 '19 at 04:06
  • The print is only for seeing the output and has no value (more of a debug, if you like). When I do the lapply, the data is just too much if I return everything I want. E.g. symData for 500 stocks is full of rubbish (mostly NULL, with no names associated with the data because of what is supplied from WSJ). I want to break the data down into manageable objects. So I would like an object called ANZ that has all the ANZ information that WSJ provides. If my test for, say, Debt to Equity passes 100 stocks, I want to look at those 100 objects in manageable chunks, not a big unwieldy file full of NULLs. – Stephen Feb 22 '19 at 04:55

1 Answer

Edited: I changed my answer based on the suggestion by @MrFlick. It solved another problem.

    library(httr)
    library(XML)
    library(data.table)
    getwsj.quotes <- function(Symbol) 
    {
        MyUrl <- sprintf("https://quotes.wsj.com/AU/XASX/%s/FINANCIALS", Symbol)
        Symbol.Data <- GET(MyUrl)   
        x <- content(Symbol.Data, as = 'text')
        wsj.tables <- sub('cr_dataTable cr_sub_capital', '\\1', x)
        SymData <- readHTMLTable(wsj.tables)
        return(SymData)       
    }
    TickerList <- c("AMC", "ANZ", "BHP", "BXB", "CBA", "COL", "CSL", "IAG", "MQG", "NAB", "RIO", "S32", "SCG", "SUN", "TCL", "TLS", "WBC", "WES", "WOW", "WPL")
    SymbolDataList <- lapply(TickerList, FUN = getwsj.quotes)
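
One way to get the "object called ANZ" behaviour asked for in the question without creating global variables is to name the list elements after their tickers and look them up by name. The sketch below is only an illustration: `fake.quotes` is a hypothetical stand-in for `getwsj.quotes` so that it runs without network access, and it assumes that some of the returned tables are NULL, as described in the comments.

```r
# Hypothetical stand-in for getwsj.quotes(): returns a list of tables,
# some of which are assumed to be NULL (not taken from real WSJ output).
fake.quotes <- function(Symbol) {
    list(
        NULL,
        data.frame(Item = "Total Debt", Value = 1),
        NULL,
        data.frame(Item = "Symbol", Value = Symbol, stringsAsFactors = FALSE)
    )
}

TickerList <- c("AMC", "ANZ")

# Fetch everything into one list, then name each element after its ticker.
SymbolDataList <- lapply(TickerList, FUN = fake.quotes)
names(SymbolDataList) <- TickerList

# Drop the NULL tables inside each ticker's result.
SymbolDataList <- lapply(SymbolDataList, function(tables) {
    Filter(Negate(is.null), tables)
})

# Look up one ticker by name; no global variable called ANZ is needed.
anz <- SymbolDataList[["ANZ"]]
length(anz)
```

With the real function, replacing `fake.quotes` by `getwsj.quotes` should give the same structure, and `SymbolDataList[["ANZ"]]` can then be inspected on its own.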

Thanks again.

Stephen
  • Using assign is not recommended: https://stackoverflow.com/questions/17559390/why-is-using-assign-bad. It's going to be much more difficult to use all those objects later. The more R-like way to do things is to use a list. – MrFlick Feb 22 '19 at 14:50
  • I can see where you are coming from. I created a list with all the elements and it solved one of the other more complex problems. Each stock may not have the same amount of data returned from WSJ. As an example: `> summary(symbolDataList) Length Class Mode [1,] 37 -none- list [2,] 36 -none- list [3,] 39 -none- list` #3 is a bank and it does not have Total Debt available on this page. Most of the imported stuff falls into the three lengths. I can check the length of each list element and decide whether to find Total Debt somewhere else. Thanks for your help. – Stephen Feb 23 '19 at 00:21
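
The length check described in the last comment could be sketched roughly as follows. The tickers `AAA`, `BBB`, `CCC` and the placeholder list are hypothetical; only the three lengths (37, 36, 39) come from the comment, and which length corresponds to which page layout would need to be checked against the real data.

```r
# Hypothetical stand-in for SymbolDataList: three tickers whose results
# have 37, 36 and 39 elements, mirroring the summary() output quoted above.
SymbolDataList <- list(
    AAA = vector("list", 37),
    BBB = vector("list", 36),
    CCC = vector("list", 39)
)

# Number of tables returned for each ticker.
n.tables <- lengths(SymbolDataList)

# Group the ticker names by how many tables they returned.
by.length <- split(names(SymbolDataList), n.tables)

# Tickers with 39 tables, which could then be handled separately.
by.length[["39"]]
```

Grouping by length keeps everything in one list while still letting each group of tickers be processed with its own rule.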