2

Updated post for csv file structure. csv file has the following structure:

Symbol
AAL
AAON
AAPL
ABCB
Tickers does not have quotation marks.

I have a problem with my code in Python. I am trying to download historical ESG data from Yahoo Finance using yfinance and NASDAQ tickers from .csv file. To download ESG data I am using a code below:

import pandas as pd
import yfinance as yf
import time
from random import randint
import yesg
import requests

# Read in your symbols
nasdaq = pd.read_csv('/path/to/file.csv')

# Endpoint(As far as I am concerned endpoint allows to download historical data from Yahoo)
url = "https://query2.finance.yahoo.com/v1/finance/esgChart"

# List of dataframes
dataframes = []

for symbol in nasdaq["ticker_code"]:
    response = requests.get(url, params={"symbol": symbol})
    if response.ok:
        df = pd.DataFrame(response.json()["esgChart"]["result"][0]["symbolSeries"]
        df["symbol"] = symbol
        dataframes.append(df)

df = pd.concat(dataframes)
df["timestamp"] = pd.to_datetime(df["timestamp"], unit="s")

But there is an invalid synthax error in df["symbol"] = symbol. I can't find out, what could be the reason for the error. By the way, everything is okay with a file path in a row, I just wrote here a sample of a path.

jrpars1
  • 35
  • 4
  • 2
    You haven't closed the brackets in the previous line, so the compiler thinks that you're still inside the brackets of *`pd.DataFrame(....)`* – Driftr95 Apr 07 '23 at 10:13
  • Welcome to Stack Overflow. The problem is caused by a typo, so it is not suitable for a Stack Overflow question. However, people commonly wonder why the syntax error is reported this way, so I have given you a link to the canonical duplicate which explains the problem. The short version is, you will always have to check backwards for missing brackets. Modern text editors and IDEs should help you with this. – Karl Knechtel Apr 07 '23 at 10:39

1 Answers1

1

You forgot to close ) to pd.DataFram(...) but note you have to use headers as parameter of requests else your request will be forbidden (HTTP 403).

url = 'https://query2.finance.yahoo.com/v1/finance/esgChart'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0'}

# List of dataframes
dfs = {}

for symbol in nasdaq['ticker_code']:
    response = requests.get(url, headers=headers, params={'symbol': symbol})
    data = response.json()
    if response.ok:
        # skip symbol if has no ESG data
        try:
            df = pd.DataFrame(data['esgChart']['result'][0]['symbolSeries'])
            dfs[symbol] = df
        except KeyError:
            pass            

df = pd.concat(dfs, names=['symbol']).reset_index(level='symbol')
df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')

Output:

>>> df
   symbol  timestamp  esgScore  governanceScore  environmentScore  socialScore
0    AAPL 2014-09-01     61.00            62.00             74.00        45.00
1    AAPL 2014-10-01     60.00            62.00             74.00        45.00
2    AAPL 2014-11-01     61.00            62.00             74.00        45.00
3    AAPL 2014-12-01     61.00            62.00             74.00        45.00
4    AAPL 2015-01-01     61.00            62.00             74.00        45.00
..    ...        ...       ...              ...               ...          ...
91  GOOGL 2022-04-01       NaN              NaN               NaN          NaN
92  GOOGL 2022-05-01     24.32            11.54              1.66        11.12
93  GOOGL 2022-06-01       NaN              NaN               NaN          NaN
94  GOOGL 2022-07-01       NaN              NaN               NaN          NaN
95  GOOGL 2022-08-01     24.14            11.39              1.66        11.10

[384 rows x 6 columns]
Corralien
  • 109,409
  • 8
  • 28
  • 52
  • We've been over this. Please don't answer questions that boil down to typos, and please don't indulge explaining additional problems in the code. This is **not a discussion forum**, so we don't do either of those things here - the question is not suitable for building a Q&A library, and if it were, an answer like this would still not be suitable. – Karl Knechtel Apr 07 '23 at 10:38
  • @KarlKnechtel. I added the "forgot to close" after my initial answer. I believed the problem was the missing headers and the 403 error if you don't use it............. I also changed the dataframe format. – Corralien Apr 07 '23 at 10:39
  • @Corralien ok, I got it. I was trying to use your code, but unfortunately in output I have ESG scores for each letter from the same line with a ticker in csv file. For example, when I write 0 (AAL company in csv) in a 'ticker_code', then in output I have ESG scores for 'A' and 'L' separately. – jrpars1 Apr 07 '23 at 16:16
  • 1
    Can you try my code with `nasdaq = pd.DataFrame({'ticker_code': ['AAPL', 'MSFT', 'AMZN', 'GOOGL']})` please? Does it work? – Corralien Apr 07 '23 at 20:17
  • @Corralien yes, it works! Thanks a lot! But I would like to ask if I can use your code without headers? And is it possible to use a dataframe from csv file with tickers? – jrpars1 Apr 08 '23 at 10:18
  • Is it possible for your to share your current csv file or update your post with the first 5 lines. Open it with a text editor, not excel, I need to show the structure not the display. – Corralien Apr 08 '23 at 15:25
  • 1
    If your header is `Symbol` in nasdaq csv file replace, `ticker_code` by `Symbol` or skip header from csv file and use `nasdaq = pd.read_csv('file.csv', skiprows=1, header=None, names=['ticker_code'])` – Corralien Apr 11 '23 at 08:27
  • @Corralien thanks! Could you tell please how to skip stocks from a tickers list which do not have any info in Yahoo Finance website? Some of the tickers from a list do not have ESG data, so I have an error "symbolSeries" in output. – jrpars1 Apr 15 '23 at 10:59
  • 1
    I updated my answer, just added a try/except block to catch this error. – Corralien Apr 16 '23 at 03:42
  • @Corralien everything is great! Again, thank you for taking the time to do this. – jrpars1 Apr 17 '23 at 09:24