I am using the code below to scrape a friend list from a Facebook UID, and I am getting this error:

  File "C:\Users\Tn\PycharmProjects\untitled\test\1.py", line 15, in friend_uid_list
    soup = from_uid(uid)
  File "C:\Users\Tn\PycharmProjects\untitled\test\1.py", line 11, in from_uid
    driver.get('https://www.facebook.com/' + uid + '/friends')
NameError: name 'driver' is not defined

Can you show me how to fix it? Thank you very much! My code is below:

import multiprocessing
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

def from_uid(uid):
    driver.get('https://www.facebook.com/' + uid + '/friends')
    return BeautifulSoup(driver.page_source, "html5lib")

def friend_uid_list(uid):
    soup = from_uid(uid)
    friends = soup.find_all("div", class_="fsl fwb fcb")
    target = open('C:/friend_uid_list.txt', 'a')
    for href in friends:
        href = href.find('a')
        try:
            target.write(href + "\n")
        except:
            pass
    target.close()

if __name__ == '__main__':

    driver = webdriver.Firefox()
    driver.get("https://www.facebook.com/")
    driver.find_element_by_css_selector("#email").send_keys("myemail@gmail.com")
    driver.find_element_by_css_selector("#pass").send_keys("mypass")
    driver.find_element_by_css_selector("#u_0_m").click()

    pool = multiprocessing.Pool(3)
    pool.map(friend_uid_list, [100004159542140,100004159542140,100004159542140])
user3373322

1 Answer

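The reason is simple: the pool starts new worker processes, and they cannot see variables created in the main process. On Windows each worker re-imports your script, and the code under if __name__ == '__main__': does not run there, so driver is never defined in the workers.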

There are several solutions:

  1. Pass the variables you need as arguments. This is not possible here, though, because driver is not picklable.

  2. Create a new driver for each process.

  3. Use multi-threading instead of multi-processing. However, I'm not sure whether Selenium works this way; you'll have to test it yourself.

laike9m
  • Why is solution #1 not possible? Can I change the code to pool.map(friend_uid_list, [(100004159542140,driver),(100004159542140,driver),(100004159542140,driver)])? – user3373322 Aug 06 '16 at 10:55