Python Selenium store data to specific column in CSV?

Question

I have two prints whose output I want to write to a single CSV file, one into Column A and the other into Column B.

My problem is that when I print both (first and second print) at the end, I get only a single element repeated multiple times, I guess because it's not inside a loop.

print((text), (link[0:-9]))

Result :

LMFCIIC PWFERT-BK
LMFCIIC PMFEP-BK
LMFCIIC LMF8CC-BL
LMFCIIC PMFEP-GY
LMFCIIC ASPCP-NV
LMFCIIC LWBASK-PK
LMFCIIC LWBATA-PK
LMFCIIC LWBATOP-PK
LMFCIIC LMF8CC-RD

My first print looks like this, and I want to write it to Column A:

PWFERT-BK
PMFEP-BK
LMF8CC-BL
PMFEP-GY
ASPCP-NV
LWBASK-PK
LWBATA-PK
LWBATOP-PK
LMF8CC-RD

My second print looks like this, and I want to write it to Column B:

LMFCIIC


LWBASK
LWBATA
LWBATOP
LMFCIIC

Here is my full code :

from bs4 import BeautifulSoup
from selenium import webdriver
import html5lib
import time
import requests

driver_path = '/usr/local/bin/chromedriver 2'
driver = webdriver.Chrome(driver_path)
driver.implicitly_wait(10)

driver.get('https://www.tenniswarehouse-europe.com/zzz/producttracker_bl.html?ccode=SWIMG030')
try:
    iframe = driver.find_elements_by_tag_name('iframe')
    for i in range(0, len(iframe)):
            f = driver.find_elements_by_tag_name('iframe')[i]
            driver.switch_to.frame(i)
            #  your work to extract link
            text = driver.find_element_by_tag_name('body').text
            text = text.replace("Code: ","")
            text = text.replace("No Copy Images to TW Server","")
            print(text)
            driver.switch_to_default_content()
finally:
    driver.quit()

resp = requests.get('https://www.tenniswarehouse-europe.com/zzz/producttracker_bl.html?ccode=SWIMG030')
soup = BeautifulSoup(resp.text,"lxml")
for frame in soup.findAll('img'):
    link = (frame['src'])
    link = link.split('=')[1] 
    print ((link[0:-9]))

I used www.example.com because the link is not accessible out of my network

You can use the pandas library; it has a method, df.to_csv("filename.csv"), that saves the data to CSV – ANISH TIWARI, Aug 15 '18 at 15:06
Don't `print` to CSV. Use the [`csv` package](https://docs.python.org/3/library/csv.html) instead of reinventing the wheel. – zvone, Aug 15 '18 at 15:09
@zvone - would you mind telling me where I should change my code so I can save time and not reinvent the wheel? – Andie31, Aug 15 '18 at 15:18
@Andie31 please fix the indentation problem in the last 4 lines of your code – Nihal, Aug 16 '18 at 10:31
@Nihal I updated the code. There are two print commands in the code, on Line 21 and Line 31 - I just don't know how to put everything in a loop :( – Andie31, Aug 16 '18 at 10:40
`resp = requests.get('https://www.example.com') soup = BeautifulSoup(resp.text,"lxml") for frame in soup.findAll('img'): link = (frame['src']) link = link.split('=')[1] print ((link[0:-9]))` tell me the exact purpose of this part of the code – Nihal, Aug 16 '18 at 10:41
I need a part of some URLs that are not inside the frame; they are just in the plain HTML source. That is why I decided not to use the selenium driver for them :( does that make sense? – Andie31, Aug 16 '18 at 10:44
So can I handle both the iframe and the regular source in one loop? Can you help me with that please? – Andie31, Aug 16 '18 at 10:46

Nihal · Accepted Answer · 2018-08-16T13:31:07.227

When you call driver.switch_to.frame(i), you switch into the iframe's HTML document. As with a normal HTML page, you can access the elements inside it as well.

From your previous question, the iframe looked like this:

<body>
<a href="http://www.test2.com" target="_blank">
<img src="https://img2.test2.com/LWBAD-1.jpg"></a>
<br/>Code: LWBAD

You can easily get the image URL with:

img_src = driver.find_element_by_tag_name('img').get_attribute('src')

and store that in the CSV file.

Code:

from selenium import webdriver
import csv

driver_path = '/usr/local/bin/chromedriver 2'
driver = webdriver.Chrome(driver_path)
driver.implicitly_wait(10)

driver.get('https://www.example.com')

try:
    # iframes and images are paired by index, so the page must contain
    # the same number of each
    iframes = driver.find_elements_by_tag_name('iframe')
    images = driver.find_elements_by_tag_name('img')
    with open('file_name.csv', 'w', newline='') as csvfile:
        field_names = ['text', 'src']
        writer = csv.DictWriter(csvfile, fieldnames=field_names)
        writer.writeheader()  # header row: text,src
        for i in range(len(iframes)):
            # read the image URL while still in the main document
            img_src = images[i].get_attribute('src')

            # do the src splitting here
            img_src = img_src.split('=')[1]

            driver.switch_to.frame(i)
            text = driver.find_element_by_tag_name('body').text
            text = text.replace("Code: ", "")
            text = text.replace("No Copy Images to TW Server", "")
            print(text)
            writer.writerow({'text': text, 'src': img_src})

            # return to the main document before the next iteration
            driver.switch_to.default_content()
finally:
    # close the browser even if something above raises
    driver.quit()
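
The column-writing part of the code above can be tried without a browser. This is a minimal sketch, assuming the two values have already been collected into lists of equal length. The names `write_two_columns`, `codes`, and `links` are hypothetical, and the sample values are copied from the first lines of the two prints in the question.

```python
import csv


def write_two_columns(path, column_a, column_b):
    """Write two equally long lists side by side as columns A and B."""
    with open(path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["text", "src"])  # header row
        # zip pairs the i-th item of each list into one CSV row
        for a, b in zip(column_a, column_b):
            writer.writerow([a, b])


# hypothetical sample data standing in for the scraped values
codes = ["PWFERT-BK", "PMFEP-BK", "LMF8CC-BL"]
links = ["LMFCIIC", "", ""]
write_two_columns("file_name.csv", codes, links)
```

Each call to `writer.writerow` produces one line in the file, so the first list item of a row lands in Column A and the second in Column B when the file is opened in a spreadsheet.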

edited Aug 16 '18 at 13:31

answered Aug 16 '18 at 10:50

Nihal


If you want a further explanation of the csv writer, you can refer to my previous answer at [this link](https://stackoverflow.com/a/51568748/7053679) – Nihal Aug 16 '18 at 10:52
trying to piece all together with the first part of code, having issues :( `AttributeError: module 'csv' has no attribute 'DictWriter'` on line 15 Do u mind checking it ? – Andie31 Aug 16 '18 at 11:10
use `import csv` – Nihal Aug 16 '18 at 11:10
i did :( no luck – Andie31 Aug 16 '18 at 11:11
DictWriter is a class in the csv library. – Nihal Aug 16 '18 at 11:12
i hope you don't have any file name like csv.py in your project – Nihal Aug 16 '18 at 11:13
can u update the whole code please ? maybe I'm not formating it correctly. – Andie31 Aug 16 '18 at 11:13
hehe, no I don't :) I saw that post already :) looked for a solution :) – Andie31 Aug 16 '18 at 11:14
[https://docs.python.org/3/library/csv.html](https://docs.python.org/3/library/csv.html) – Nihal Aug 16 '18 at 11:20
I checked that official page too already. Let me update my code now, maybe it will make more sense to you. give me a second – Andie31 Aug 16 '18 at 11:21
check the code please, I get that error at line 15 :( I'm sure I did something wrong though – Andie31 Aug 16 '18 at 11:27
you forgot to include `iframe = driver.find_elements_by_tag_name('iframe')` – Nihal Aug 16 '18 at 11:28
I was sure it was on me ! Ok, you are super close, Got one more error at the very last line : `TypeError: a bytes-like object is required, not 'str'` Line 33 writer.writerows(link) – Andie31 Aug 16 '18 at 11:32
do u mind if we chat ? would be much easier...I can't create a chat room :( – Andie31 Aug 16 '18 at 11:36
I think it's a misunderstanding in here...those urls, are not a part of the iframe – Andie31 Aug 16 '18 at 11:48
so. where are they? – Nihal Aug 16 '18 at 12:11
they are inside a local page main page...don't know how to explain...they are not embedded into an `<iframe>`...they are part of a table called `<table id="tabledata">` `<tbody> <tr><td><img src="../../images/505023824.webp"/></td></tr></tbody></table>` – Andie31 Aug 16 '18 at 12:27
Not able to post anything on the chat, reason...`You must have 20 reputation on Stack Overflow to talk here` – Andie31 Aug 16 '18 at 12:29
can you join chat.stackoverflow? – Nihal Aug 16 '18 at 12:29
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/178137/discussion-between-nihal-and-andie31). – Nihal Aug 16 '18 at 12:34
The solution only works under one condition: the page must contain the same number of iframe and img tags. – Nihal Aug 16 '18 at 13:41
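
The condition in the last comment matters because pairing by index raises an IndexError, or silently misaligns rows, when the two counts differ. One hedged way to at least keep every value is `itertools.zip_longest`, which pads the shorter list. This sketch uses hypothetical names (`pair_columns`, `texts`, `srcs`) and sample values from the question; it does not repair misalignment, it only avoids dropping rows.

```python
from itertools import zip_longest


def pair_columns(texts, srcs, fill=""):
    """Pair two lists row by row, padding the shorter one with `fill`."""
    return [list(row) for row in zip_longest(texts, srcs, fillvalue=fill)]


# hypothetical sample data: three codes but only one link
rows = pair_columns(["PWFERT-BK", "PMFEP-BK", "LMF8CC-BL"], ["LMFCIIC"])
```

The resulting list of rows can be passed directly to `csv.writer(...).writerows(rows)`.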
