
I would like to scrape some Wikipedia pages for information, but I am running into a problem with csv.writerow. I may be using it wrong, which could be the issue, but I only get the error for a particular sequence of values; other cases work fine.

I have tried different Wikipedia pages and they work fine; it only seems to fail when the value is '01'. The imports are included below.

import csv

import requests
from bs4 import BeautifulSoup

csv_file = open('wiki_scrape.csv', 'w')
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['Title'])  # header row

months = ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']

years = ['2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018']

for i in years:
    for j in months:
        # Picture of the day template page for the first day of each month
        source = requests.get(f'https://en.m.wikipedia.org/wiki/Template:POTD/{i}-{j}-01').text
        soup = BeautifulSoup(source, 'lxml')
        title = soup.body.b.text  # first bolded text in the page body
        csv_writer.writerow([title])  # this is where the error is raised
csv_file.close()

I get a UnicodeEncodeError:

'charmap' codec can't encode character '\u0101' in position 8: character maps to <undefined>
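The error can be reproduced without any scraping at all. This is a minimal sketch, assuming a Windows setup where open() with no encoding argument falls back to the cp1252 ('charmap') codec; the title here is just a stand-in string containing U+0101 (ā), not the actual scraped value:

import csv

title = 'M\u0101ori'  # stand-in title containing U+0101 ('ā')

csv_file = open('repro.csv', 'w')  # no encoding argument, as in the code above
csv_writer = csv.writer(csv_file)
csv_writer.writerow([title])  # raises UnicodeEncodeError: 'charmap' codec can't encode character '\u0101' ...
csv_file.close()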

I was hoping to get a nice dataframe. I removed some of the other values to simplify the problem.

1 Answer


I used this to solve it:

csv_file = open('wiki_scrape.csv', 'w', encoding='utf-8')

But I don't know why it works.
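A likely explanation, as a sketch: without an encoding argument, open() uses the platform's default codec (cp1252, reported as 'charmap', on many Windows installs), and that codec has no mapping for U+0101 (ā), while UTF-8 can encode any Unicode character:

ch = '\u0101'  # the character from the traceback, 'ā'

print(ch.encode('utf-8'))  # b'\xc4\x81' -- UTF-8 can represent it
try:
    ch.encode('cp1252')  # the 'charmap' codec that open() falls back to here
except UnicodeEncodeError as e:
    print(e)  # 'charmap' codec can't encode character '\u0101' in position 0: character maps to <undefined>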