
I'm scraping the reviews on this website: https://fr.trustpilot.com/review/jardiland.com

I'm using this script:

import scrapy

from ..items import JardilandItem


class AmazonSpiderSpider(scrapy.Spider):
    name = 'amazon'
    page_number = 2
    start_urls = ['https://fr.trustpilot.com/review/jardiland.com']

    def parse(self, response):
        items = JardilandItem()

        # extract() returns a list with the text of every review on the page
        comm = response.css('.review-content__text::text').extract()

        items['comm'] = comm
        #items['note'] = note

        yield items

        # Build the URL of the next page of reviews
        next_page = 'https://fr.trustpilot.com/review/jardiland.com?page=' + str(self.page_number)
        if self.page_number <= 9:  # any page limit works; 9 is used here
            self.page_number += 1
            yield response.follow(next_page, callback=self.parse)

And it gives me this: output

So far so good, but when I open this file with Excel, it gives me this: output2

What's wrong? Even when I copy and paste the content, I get the same result. I don't understand.

Any ideas?

Thanks.

P.S.: Ignore the name "amazon"; I reused an older script.

EDIT: To obtain my CSV file, I run "scrapy crawl amazon -o items.csv" in my terminal.
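
To check whether the problem is in the file itself or only in how Excel imports it, the CSV can be parsed outside Excel. This is a minimal sketch, not a diagnosis: `count_rows` is a hypothetical helper, and the sample string is made-up data that only assumes the export has a single `comm` column whose value may contain commas and line breaks.

```python
import csv
import io


def count_rows(csv_text):
    """Parse CSV text and return (header, number_of_data_rows)."""
    # newline="" leaves line endings untouched so the csv module can
    # handle line breaks inside quoted fields itself.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    rows = list(reader)
    return rows[0], len(rows) - 1


# Hypothetical sample (not real output): one exported item whose 'comm'
# field contains a comma, an accent, and an embedded line break.
sample = 'comm\r\n"Très bon magasin,\nlivraison rapide"\r\n'

header, n = count_rows(sample)
print(header, n)
```

If a parser like this reports the expected columns and row count for items.csv (read with `open("items.csv", newline="", encoding="utf-8")`), the file is probably well-formed and the difference would come from how Excel interprets it.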
