1

I want to download excel file from this link via python https://www.tfex.co.th/tfex/historicalTrading.html?locale=en_US&symbol=S50Z21&decorator=excel&series=&page=4&locale=en_US&locale=en_US&periodView=A

Here is my code:

url = 'https://www.tfex.co.th/tfex/historicalTrading.html?locale=en_US&symbol=S50Z21&decorator=excel&series=&page=4&locale=en_US&periodView=A'

resp = requests.get(url)
with open('file.xls','wb') as f:
    f.write(resp.content)

But the file.xls is instead a html text file. file.xls looks like this.1

I've tried add headers

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
resp = requests.get(url, headers=headers)

But it didn't help. Thank you in advance.

1 Answers1

0

Edit: Found a way using pandas.

import pandas as pd

url = r'https://www.tfex.co.th/tfex/historicalTrading.html?locale=en_US&symbol=S50Z21&decorator=excel&series=&page=4&locale=en_US&periodView=A'

# read into HTML tables
tables = pd.read_html(url)
# merge HTML tables
merged = pd.concat(tables)
# Write tables to excel file
merged.to_excel("output.xlsx")

Hope this helps :)

Ignore below, this was before edit:

I know this is still problematic depending on your downstream application. The code below does seem to still download it into a HTML format, but this format can be opened in excel regardless.

import requests
url = r'https://www.tfex.co.th/tfex/historicalTrading.html?locale=en_US&symbol=S50Z21&decorator=excel&series=&page=4&locale=en_US&periodView=A'

r = requests.get(url, allow_redirects=False)
excel_url = r.url
open('out.xls', 'wb').write(r.content)

When I open this in excel I get a warning, and click okay.

screenshot of file

Olaf
  • 146
  • 1
  • 8