6

I'm working on this function that scrapes a website for fantasy football information and writes it to an Excel file. Ultimately, I want to have information for each week on a separate sheet in the Excel workbook.

The code as posted below works perfectly until I want to write it to the Excel workbook. The workbook ends up having just week 17 data. It seems that the pandas ExcelWriter overwrites the sheet every time when I really want it to add a sheet every time.

I couldn't find anything online about creating sheets with a loop in the pandas ExcelWriter, so I'm not entirely sure if it can be done the way I want it.

import bs4 as bs
import urllib.request
import pandas as pd
from pandas import ExcelWriter    

for week in range(1,18):
    #IGNORE MOST OF THIS STUFF BELOW BECAUSE IT WORKS AS IS
    source = urllib.request.urlopen('http://fftoday.com/stats/playerstats.php?Season=2015&GameWeek='+str(week)+'&PosID=10&LeagueID=1').read()
    soup = bs.BeautifulSoup(source, 'lxml')

    table = soup.find('table', width='100%', border='0', cellpadding='2', cellspacing='1')
    table_rows = table.find_all('tr')

    player_data = {}

    for tr in table_rows:
       td = tr.find_all('td')
       row = [i.text for i in td]
       if row != ['\xa0 ', 'Passing', 'Rushing', 'Fantasy'] and row != ['Player\nSort First: \n\n\n\xa0\xa0\n\t\tLast: \n\n\n', 'Team\n\n\n\n', 'G\n\n\n\n', 'Comp\n\n', 'Att\n\n', 'Yard\n\n', 'TD\n\n', 'INT\n\n', 'Att\n\n', 'Yard\n\n', 'TD\n\n', 'FPts\n\n\n\n', 'FPts/G\n\n\n\n']:
           names = str(row[0]).encode('utf-8')
           names = str(names)[:-1].split()[1:]
           names[0:] = [' '.join(names[0:])]
           row[0] = names[0]
           player_data[str(row[0])] = row[1:]
    df_qb = pd.DataFrame.from_dict(player_data)
    df_qb = df_qb.transpose()
    df_qb.columns = ['Team', 'Games', 'Completions', 'Att', 'Yards',
                'TD', 'INT', 'Rush Att', 'Rush Yards', 'Rush TD',
                'Fantasy Points', 'Fantasy Points per Game']

    #EVERY THING WORKS JUST FINE UNTIL RIGHT HERE
    writer = ExcelWriter('Weekly Fantasy Football Data.xlsx')
    df_qb.to_excel(writer, 'Week ' + str(week))
    writer.save()
Bhargav Rao
  • 50,140
  • 28
  • 121
  • 140
mynameisgooch
  • 95
  • 1
  • 1
  • 8

2 Answers2

7

All you need to do is move the writer decleration outside of the loop

e.g.

a = numpy.arange(5)
w = ExcelWriter('e.xlsx')
for i in a:
    df = pd.DataFrame({'a':np.random.randint(1,100,10)})
    df.to_excel(w, sheet_name='sheet '  + str(i))
    w.save()
Ajeet Ganga
  • 8,353
  • 10
  • 56
  • 79
  • 3
    Thanks for getting me on the right track! I moved the writer declaration out of the loop, but that didn't quite work. I was able to get it working by moving the save() declaration out of the loop as well. – mynameisgooch Dec 21 '16 at 01:06
  • Awesome. But I would suggest to save more periodically, perhaps within loop and close outside of loop. – Ajeet Ganga Dec 21 '16 at 06:26
5
import numpy
import pandas as pd
from pandas import ExcelWriter

a = numpy.arange(5)
w = ExcelWriter('e.xlsx')
df_list = []
for i in a:
    df = pd.DataFrame({'a': numpy.random.randint(1, 100, 10)})
    df_list.append(df)
for i, df in enumerate(df_list):
    df.to_excel(w, sheet_name=f'sheet{i}')
w.save()