
This script opens each URL from a CSV file, then scrolls to the bottom of the page. I'm trying to grab the company URLs off the page, but I can't get it to work. If I use a single standalone URL instead of pulling it from the CSV, the results print to PowerShell, but I still can't get them written to a CSV.

Here are a couple of the URLs I'm working with:

https://www.facebook.com/search/pages/?q=Los%20Angeles%20remodeling
https://www.facebook.com/search/pages/?q=Boston%20remodeling

My thought was that it needs a loop within a loop, or maybe an if/elif. I'm not really sure at this point. Any and all suggestions would be appreciated.

import time
from selenium import webdriver
from bs4 import BeautifulSoup as bs
import csv
import requests
from selenium.webdriver.support.ui import WebDriverWait


driver = webdriver.Chrome()
elems = driver.find_elements_by_class_name('_32mo')


chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications" : 2}
chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(chrome_options=chrome_options)


driver.get('https://www.facebook.com')
username = driver.find_element_by_id("email")
password = driver.find_element_by_id("pass")
username.send_keys("*****")
password.send_keys("******")
driver.find_element_by_id('loginbutton').click()
time.sleep(2)



with open('fb_urls.csv') as f_input, open('fb_profile_urls.csv', 'w', newline=)  as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)
    for url in csv_input:
        driver.get(url[0])
        time.sleep(5)
        lenOfPage = driver.execute_script("window.scrollTo(0, document.body.scrollHeight);var lenOfPage=document.body.scrollHeight;return lenOfPage;")
        match=False
        while(match==False):
            lastCount = lenOfPage
            time.sleep(1)
            lenOfPage = driver.execute_script("window.scrollTo(0, document.body.scrollHeight);var lenOfPage=document.body.scrollHeight;return lenOfPage;")
            if lastCount==lenOfPage:
                match=True
                for elem in elems:
                    csv_output.(driver.find_elements_by_tag_name('href'))
Craig
RobK
  • I'm not sure I understand your question, so a wild guess: `open('fb_profile_urls.csv', 'w', newline=)` will wipe the file every time you run this script – roganjosh Sep 26 '18 at 18:24
  • see, that's what i thought as well. But, through research it's what i found in a solution on SO. – RobK Sep 26 '18 at 18:36
  • No, it _definitely_ wipes the file each time you run your script – roganjosh Sep 26 '18 at 18:41

1 Answer


Instead of opening the file in write mode, `open('file', 'w')`, open it in append mode, `open('file', 'a')`.

Found in "How to add lines to existing file using python".
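
To make the difference between the two modes concrete, here is a small sketch using the `csv` module on a scratch file. The helper names `write_rows` and `read_rows` and the file name `demo.csv` are made up for illustration. Note also that `newline=` with nothing after it, as written in the question, is a syntax error; the `csv` documentation recommends `newline=''`.

```python
import csv
import os
import tempfile


def write_rows(path, rows, mode):
    # newline="" is what the csv docs recommend for files handed to csv.writer.
    with open(path, mode, newline="") as f:
        csv.writer(f).writerows(rows)


def read_rows(path):
    with open(path, newline="") as f:
        return list(csv.reader(f))


# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

write_rows(path, [["first run"]], "w")
write_rows(path, [["second run"]], "w")  # "w" truncates, so "first run" is gone
after_w = read_rows(path)

write_rows(path, [["third run"]], "a")   # "a" keeps existing rows and adds to the end
after_a = read_rows(path)

print(after_w)
print(after_a)
```

Append mode only changes what happens to rows already in the file. If nothing is ever passed to the writer, the output stays empty in either mode.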

G. Anderson
  • this is almost comical that i just took it at face value that it wouldn't. i know better. even after doing that it's not working. It's just a blank sheet after running – RobK Sep 26 '18 at 18:47
  • Silly question, did you repopulate the file with more data after you did the `'w'` and before you did the `'a'`? When I run `fname = 'testfile1.txt' with open(fname,'w') as f: f.write('Some Junk')` the file contains `Some Junk`. Then I run `with open(fname,'a') as f: f.writelines('\nmorejunk')` and it contains `Some Junk morejunk` – G. Anderson Sep 26 '18 at 19:38
  • i understand what you're saying. And if i do what you just did, it works fine. It has to be something to do with the loop. I'm not quite sure if it's where i name 'elems' or where i'm writing a 'for' loop at the end to have it print to file. Does that make sense? – RobK Sep 26 '18 at 19:52
  • Where it says `csv_output.(driver...` it doesn't look like you're calling the `writerow()` function, thus not giving it anything to write – G. Anderson Sep 26 '18 at 20:21
  • tried that...and it doesn't work either. T.T i've been racking my brain for the last few hours on how to get this to work. i think i'm about ready to throw in the towel. – RobK Sep 26 '18 at 20:43
  • When in doubt, check your I/O...can you print `url` and print `elem` in your loops to verify that you're getting what you think you're getting? – G. Anderson Sep 26 '18 at 20:53
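
Pulling the comments above together, here is a minimal sketch of how the loop could be restructured. It is not a tested fix for Facebook itself. It assumes the `_32mo` class from the question sits on the link elements and that their `href` attribute holds the company URL. The function names `scroll_to_bottom`, `collect_hrefs` and `scrape_to_csv` are hypothetical. Two things differ from the question's code: the elements are looked up after each page has loaded and scrolled (in the question `elems` is collected from a fresh browser before any page is opened, so it is most likely an empty list), and each link is passed to `writerow()`. The `print` call follows the suggestion to check what each loop is actually seeing.

```python
import csv
import time

# Scrolls to the bottom and reports the current page height.
SCROLL_JS = (
    "window.scrollTo(0, document.body.scrollHeight);"
    "return document.body.scrollHeight;"
)


def scroll_to_bottom(driver, pause=1.0):
    """Keep scrolling until the page height stops growing."""
    last_height = driver.execute_script(SCROLL_JS)
    while True:
        time.sleep(pause)
        new_height = driver.execute_script(SCROLL_JS)
        if new_height == last_height:
            return
        last_height = new_height


def collect_hrefs(driver, class_name="_32mo"):
    """Return non-empty href values of elements with the given class."""
    # "class name" is the locator string behind selenium's By.CLASS_NAME.
    # Assumption: the class is on the <a> elements themselves.
    elems = driver.find_elements("class name", class_name)
    hrefs = [elem.get_attribute("href") for elem in elems]
    return [href for href in hrefs if href]


def scrape_to_csv(driver, in_path, out_path, pause=1.0):
    """Visit each URL in in_path and write (search URL, found href) rows."""
    with open(in_path, newline="") as f_in, \
            open(out_path, "w", newline="") as f_out:
        writer = csv.writer(f_out)
        for row in csv.reader(f_in):
            if not row:
                continue  # skip blank lines in the input file
            driver.get(row[0])
            scroll_to_bottom(driver, pause)
            hrefs = collect_hrefs(driver)  # looked up only after scrolling
            print(row[0], len(hrefs))      # sanity check of what was found
            for href in hrefs:
                writer.writerow([row[0], href])
```

With the browser created and logged in as in the question, this would be called as `scrape_to_csv(driver, 'fb_urls.csv', 'fb_profile_urls.csv')`. If the printed count is 0 for every URL, the class name or the login step is the next thing to check rather than the CSV code.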