1

What I'm trying to do:

I'm using Python to populate data to an existing Excel file.

What works:

My code below is successful in exporting the table to Excel file "Futures.xls".

What doesn't work:

The code below extracts table from website and exports to Futures excel file ( 100 Rows )The subsequent code re-opens Futures file and appends to Futures1 excel file ( 200 Rows ). However, if I continue to the run the subsequent code multiple times I cannot get more than 200 rows like 300 rows and so on as an when I run the code. Can someone tell me the issue?

My code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
import os

url = "https://quotes.ino.com/exchanges/contracts.html?r=NYMEX_NG"
res = requests.get(url)
soup = BeautifulSoup(res.text, 'lxml')

Markets = []
Contracts =[]
Opens =[]
Highs =[]
Lows =[]
Lasts=[]
Changes=[]
Pcts=[]

data_rows = soup.findAll('tr')[3:]

for td in data_rows[:100]:
    Market = td.findAll ('td')[0].text
    Markets.append(Market)
    Contract = td.findAll('td')[1].text
    Contracts.append(Contract)
    Open = td.findAll('td')[2].text
    Opens.append(Open)
    High = td.findAll('td')[3].text
    Highs.append(High)
    Low = td.findAll('td')[4].text
    Lows.append(Low)
    Last = td.findAll('td')[5].text
    Lasts.append(Last)
    Change = td.findAll('td')[6].text
    Changes.append(Change)
    Pct = td.findAll('td')[7].text
    Pcts.append(Pct)
    Time = td.findAll('td')[8].text

    df = pd.DataFrame({'Contracts' :Contracts, 'Markets':Market,'Open':Opens, 'High':Highs, 'Low':Lows,'Last':Lasts,'Pct':Pcts})

    out_path = "C:\Sid\Futures.xls"
    writer = pd.ExcelWriter(out_path , engine='xlsxwriter')
    df.to_excel(writer,'Sheet2',index=False)
    writer.save()

    from openpyxl import load_workbook
    from openpyxl.utils.dataframe import dataframe_to_rows

    wb = load_workbook("C:\Sid\Futures.xlsx")
    ws = wb['Sheet2']

    from openpyxl import load_workbook
    from openpyxl.utils.dataframe import dataframe_to_rows

    wb = load_workbook("C:\Sid\Futures.xlsx")
    ws = wb['Sheet2']

    for row in dataframe_to_rows(df, header=None, index = True):
    ws.append(row)
    wb.save('C:\Sid\Futures1.xlsx')

Additional:

Also, what code do I need so that my python runs automatically when my website updates? Prices change every 15 minutes.

  • 1
    What's stopping you from appending to the sheet? At worst, can you read the existing file in as a dataframe and append to it? – Jacques Kvam Aug 27 '18 at 07:15
  • I see no effort in the provided code to do what you're looking for. When you talk about "what doesn't work" that usually suggests that you've tried to do it and are getting an error. In this case it seems like you want us to write what doesn't work for you? – miken32 Aug 29 '18 at 16:40
  • I have added the section of the code that appends to existing file but its not working. Can someone please help? – Siddharth Kulkarni Sep 03 '18 at 12:43
  • In answer to your last question: you need to run a cron job. See [this answer](https://stackoverflow.com/questions/11774925/how-to-run-a-python-file-using-cron-jobs) for help. – Josh Friedlander Sep 03 '18 at 12:48

0 Answers0