I used one of the methods described in "Python write to CSV line by line" to try to write every line of my output to a .csv file. The script now generates the CSV, but instead of all the rows of my data I see a single row, repeated 4 times, and nothing else.

Can anyone see what the issue is here?

from bs4 import BeautifulSoup
import requests
import csv

headers = {'User-Agent': 'Mozilla/5.0'}

for i in range(1, 300):
    url = "xxx?page=%s" % i

    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, "html.parser")
    items = soup.find_all('div', class_='product-block__info')
    for item in items:
        product = item.find('span', class_='short_desc').text
        stock = item.find('span', class_='count_product_stock hidden').text
        brand = item.find('h4', class_='brand').text
        price = item.find('span', class_='selling_price').text

        # create a list of all the fields    
        sheets = [brand, product, stock, price]

        print(sheets)

        with open('csvfile.csv','wt') as file:
            for l in sheets:
                file.writelines(sheets)
                file.write('\n')
Ben P
  • Add a print in your for loop that writes lines and you'll figure it out. Also realize the file will be truncated each time you open it. – Mark Tolonen Sep 21 '17 at 14:45
  • I've printed out the lines, but it's still not clear to me what's happening to get the result in the .csv? – Ben P Sep 21 '17 at 15:43
  • `sheets` is a single line. `for l in sheets:` iterates over the items in the line, but `l` is never used. `file.writelines` is incorrect for a single line. `file.write('\n')` isn't needed. `csv` will manage the lines. Didn't you wonder why you aren't getting any commas in your csv? – Mark Tolonen Sep 21 '17 at 16:05
  • Each time you `open`, you will erase the previous content of the file as well. – Mark Tolonen Sep 21 '17 at 16:06
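
The points in the comments above can be reproduced without any scraping. The following is a minimal sketch, assuming two hypothetical rows in place of the scraped data and a temporary file path instead of the real one; the write loop is the one from the question.

```python
import os
import tempfile

# Hypothetical sample rows standing in for the scraped data.
rows = [
    ["BrandA", "Widget", "5", "9.99"],
    ["BrandB", "Gadget", "2", "19.99"],
]

# Temporary location, so the demonstration does not touch a real file.
path = os.path.join(tempfile.mkdtemp(), "csvfile.csv")

for sheets in rows:
    # Mode 'wt' truncates the file, so each pass discards the previous row.
    with open(path, "wt") as file:
        for l in sheets:              # four fields, so four iterations
            file.writelines(sheets)   # writes all four fields with no separator
            file.write("\n")

with open(path) as file:
    content = file.read()

print(content)
```

Only the last row survives, it appears once per field (four times), and the fields run together without commas, which matches the symptom described in the question.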

1 Answer

You probably want something more like the following. It is untested, because the example provided can't be run as is:

from bs4 import BeautifulSoup
import requests
import csv

headers = {'User-Agent': 'Mozilla/5.0'}

# Open the file once.  See the csv documentation for the correct way to open
# a file for use with csv.writer.  If you plan to open the .csv with
# Excel, the utf-8-sig encoding will allow non-ASCII to work correctly.
with open('csvfile.csv','w', encoding='utf-8-sig', newline='') as f:
    file = csv.writer(f)  # actually use the CSV module.

    for i in range(1, 300):
        url = "xxx?page=%s" % i

        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.text, "html.parser")
        items = soup.find_all('div', class_='product-block__info')
        for item in items:
            product = item.find('span', class_='short_desc').text
            stock = item.find('span', class_='count_product_stock hidden').text
            brand = item.find('h4', class_='brand').text
            price = item.find('span', class_='selling_price').text

            # create a list of all the fields    
            sheets = [brand, product, stock, price]

            # write a single line.
            file.writerow(sheets)

Here's a tested example that will open in Excel. I threw in a non-ASCII character and a comma in the data to demonstrate the csv module's ability to handle it:

#coding:utf8
import csv

with open('csvfile.csv','w', encoding='utf-8-sig', newline='') as f:
    file = csv.writer(f)
    file.writerow('BRAND PRODUCT STOCK PRICE'.split())
    for i in range(1,11):
        sheets = ['brand{}'.format(i),'pröduct{}'.format(i),'st,ock{}'.format(i),'price{}'.format(i)]
        file.writerow(sheets)

Output:

BRAND,PRODUCT,STOCK,PRICE
brand1,pröduct1,"st,ock1",price1
brand2,pröduct2,"st,ock2",price2
brand3,pröduct3,"st,ock3",price3
brand4,pröduct4,"st,ock4",price4
brand5,pröduct5,"st,ock5",price5
brand6,pröduct6,"st,ock6",price6
brand7,pröduct7,"st,ock7",price7
brand8,pröduct8,"st,ock8",price8
brand9,pröduct9,"st,ock9",price9
brand10,pröduct10,"st,ock10",price10
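
One way to check the quoting and encoding without Excel is to read the file back with `csv.reader`. This is only a sketch: the temporary path and the names `original`, `restored` and `raw` are illustrative and not part of the code above.

```python
import csv
import os
import tempfile

# Temporary location, so the check does not overwrite a real csvfile.csv.
path = os.path.join(tempfile.mkdtemp(), 'csvfile.csv')
original = ['brand1', 'pröduct1', 'st,ock1', 'price1']

with open(path, 'w', encoding='utf-8-sig', newline='') as f:
    csv.writer(f).writerow(original)

# Reading with utf-8-sig strips the byte order mark again.
with open(path, encoding='utf-8-sig', newline='') as f:
    restored = next(csv.reader(f))

# The raw bytes show the BOM that lets Excel detect UTF-8.
with open(path, 'rb') as f:
    raw = f.read()

print(restored == original)
print(raw[:3])
```

The field containing a comma is quoted on disk and comes back as a single value, and the non-ASCII character survives the round trip.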

In Excel:

[Image: the CSV file opened in Excel]

Mark Tolonen
  • Is there a way to print a confirmation message once all the lines have been written to the file? – Ben P Sep 22 '17 at 09:01
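
For the confirmation message asked about in the last comment, one possible approach is to print after the `with` block, since the file is closed by the time execution reaches that point. A minimal sketch, assuming hypothetical rows and a temporary path; the `written` counter and `message` string are illustrative additions.

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for the scraped fields.
rows = [['brand{}'.format(i), 'product{}'.format(i)] for i in range(1, 4)]

# Temporary location, so the sketch does not overwrite a real csvfile.csv.
path = os.path.join(tempfile.mkdtemp(), 'csvfile.csv')

written = 0
with open(path, 'w', encoding='utf-8-sig', newline='') as f:
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)
        written += 1  # count rows as they are written

# Reaching this line means the with block exited and the file was closed.
message = 'Wrote {} rows to {}'.format(written, path)
print(message)
```

If an exception is raised inside the `with` block, the message is never printed, so seeing it indicates that every `writerow` call completed.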