I have written a Python script to scrape some products. So far I am able to get everything I need, but I am stuck on how to save these values to a CSV file.

Here is my Python code:

import requests
from bs4 import BeautifulSoup

mainPage = requests.get("http://mega.pk/mobiles/")
soup = BeautifulSoup(mainPage.content, "html5lib")

for link in soup.select('ul.asidemenu_h1 a[href*="http://www.mega.pk/mobiles-"]'):
    #link.get_text()

    urls = [link.get('href')]

    for url in urls:
        brandPage = requests.get(url)
        soup = BeautifulSoup(brandPage.content, "html5lib")

        for productPage in soup.select('div.lap_thu_box div.image a[href*="http://www.mega.pk/mobiles_products/"]'):
            productUrls = [productPage.get('href')]

            for productUrl in productUrls:
                productPage = requests.get(productUrl)
                soup = BeautifulSoup(productPage.content, "html5lib")

                for productName in soup.select("div.col-md-8.col-sd-8.col-xs-8 div.padding-10 div h2 span"):
                    print (productName.get_text())

                for productDesc in soup.select("div#main1 div div div div.row div div p"):
                    print (productDesc.get_text())

                for productPrice in soup.select("div#main1 div div div div.row div div div div.price-n-action div span.desc-price"):
                    print (productPrice.get_text())

Please also tell me how to improve my code; I am pretty new to Python. Before I wrote this script I used Scrapy, which was very fast (it took 1 minute to scrape everything), but I want to use plain Python 3 (where this script takes at least 5-7 minutes). Time is not the main issue, though. Saving the elements to CSV is what matters.

Mansoor Akram
  • Not sure how your code is relevant. You just want to print to a file, and there are many references for that, no? http://stackoverflow.com/questions/6159900/correct-way-to-write-line-to-file-in-python – ergonaut Sep 29 '15 at 17:26
  • I included the code just so everybody can understand what I am doing. I know the complete code is not strictly relevant. – Mansoor Akram Sep 29 '15 at 17:27

2 Answers

Use the csv library.

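import csv

# newline='' lets the csv module control line endings itself.
with open('desired-filename-here.csv', 'w', newline='') as csvfile:
    # The default delimiter (comma) and quotechar (double quote) give a standard CSV file.
    csvwriter = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL)
    # One row per product: name, description, price.
    row_info = [productName.get_text(), productDesc.get_text(),
                productPrice.get_text()]
    csvwriter.writerow(row_info)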
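
In the scraper from the question, the three values come out of separate loops, so one way to apply this is to collect one row per product and write them all through a single writer that is opened once. A minimal sketch, assuming each product yields exactly one name, description, and price; the `write_products` helper, the file name, and the sample rows are hypothetical stand-ins for the scraped values:

```python
import csv


def write_products(path, rows):
    """Write a header line plus one (name, description, price) row per product."""
    # newline='' is recommended by the csv docs so the writer controls line endings.
    with open(path, "w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["name", "description", "price"])
        writer.writerows(rows)


# Hypothetical sample data; in the scraper these would be the get_text() results.
sample_rows = [
    ("Phone A", "5-inch display, 16 GB", "Rs 20,000"),
    ("Phone B", 'Says "hello"', "Rs 35,500"),
]
write_products("products.csv", sample_rows)
```

Fields that contain commas or double quotes are quoted automatically, so the file can be read back with `csv.reader` without extra handling.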
Anarosa PM

As an alternative to the csv module, you could install pandas:

import pandas as pd
myData = {}

# ... populate the dictionary ...

output = pd.DataFrame(myData)
output.to_csv('output_data.csv')

Granted, there are existing solutions that don't require you to install anything, but I certainly find pandas convenient, and it's particularly handy for organizing data quickly :^)
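
For the no-install route mentioned above, the standard library's `csv.DictWriter` accepts dictionaries directly. A small sketch, assuming one dict per product; the field names, the `products` list, and the output file name are made up for illustration:

```python
import csv

# Hypothetical records standing in for scraped products.
products = [
    {"name": "Phone A", "price": "Rs 20,000"},
    {"name": "Phone B", "price": "Rs 35,500"},
]

with open("output_data_stdlib.csv", "w", newline="", encoding="utf-8") as csvfile:
    # fieldnames fixes the column order and the header line.
    writer = csv.DictWriter(csvfile, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(products)  # one CSV line per dict
```

Unlike the pandas version, this writes no index column, and each row is a dict rather than each column being a list.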

Kanga_Roo