3

I am looking for a way to get the holding list of an ETF via a web service such as yahoo finance. So far, YQL has not yielded the desired results.

As an example ZUB.TO is an ETF that has holdings. here is a list of the holdings by querying the yahoo.finance.quotes we do not get the proper information. The result.

Is there another table somewhere that would contain the holdings?

giannis christofakis
  • 8,201
  • 4
  • 54
  • 65
  • Can you clarify how the results shown at pastebin differ from what you're seeking? This will save people from wading through the JSON return value. – JoelC Oct 03 '14 at 20:06
  • Of course! I am seeking something like {holdings=[{name="C" volume="8.1%"},{name="WFC" volume="8%"},{name="MTB" volume="7.78%"},{name="JPM" volume="7.73%"}...]} the second link in my original post will show a top 10 holdings table, that is what I want to get. – Matthew Khouzam Oct 04 '14 at 00:29

1 Answers1

1

Perhaps downloading from Yahoo Finance is not working and/or may not work.

Instead how about using the various APIs the ETF providers already have for downloading the Excel or CSV files of the holdings?

Use the "append_df_to_excel" file as file to import, and then use the code below to make Excel file for all the 11 Sector SPDRs provided by SSgA (State Street global Advisors).

Personally I use this for doing breadth analysis.

import pandas as pd
import append_to_excel 
# https://stackoverflow.com/questions/20219254/how-to-write-to-an-existing-excel-file-without-overwriting-data-using-pandas

##############################################################################
# Author: Salil Gangal
# Posted on: 08-JUL-2018
# Forum: Stack Overflow
##############################################################################

output_file = 'C:\my_python\SPDR_Holdings.xlsx'
base_url = "http://www.sectorspdr.com/sectorspdr/IDCO.Client.Spdrs.Holdings/Export/ExportExcel?symbol="

data = {                    
    'Ticker' :      [ 'XLC','XLY','XLP','XLE','XLF','XLV','XLI','XLB','XLRE','XLK','XLU' ]          
,   'Name' :    [ 'Communication Services','Consumer Discretionary','Consumer Staples','Energy','Financials','Health Care','Industrials','Materials','Real Estate','Technology','Utilities' ]           
}                   

spdr_df = pd.DataFrame(data)     

print(spdr_df)

for i, row in spdr_df.iterrows():
    url =  base_url + row['Ticker']
    df_url = pd.read_excel(url)
    header = df_url.iloc[0]
    holdings_df = df_url[1:]
    holdings_df.set_axis(header, axis='columns', inplace=True)
    print("\n\n", row['Ticker'] , "\n")
    print(holdings_df)
    append_df_to_excel(output_file, holdings_df, sheet_name= row['Ticker'], index=False)

Image of Excel file generated for SPDRs

Salil Gangal
  • 11
  • 1
  • 4
  • 1
    For those that find this in 2022... replace the section of `base_url` value of "IDCO.Client.Spdrs.Holdings" to "IDCO.Client.Spdrs.Portfolio" – Alec Feb 25 '22 at 14:16