
I'm getting the error: AttributeError: 'str' object has no attribute 'find_all'

I followed the post below, but it does not help. I get the error only when the line print(a['title']) is included. I tried encode("utf-8"), but it does not solve the problem.

UnicodeEncodeError: 'charmap' codec can't encode characters

Code is below. It started working today without any changes to it! I do have duplicate code below doing the find_all; it was there before too, so I don't know which one worked.

import requests # pip install requests
from bs4 import BeautifulSoup # pip install beautifulsoup4

import pandas as pd # pip install pandas
import time
import io

def sc_data():

    URL = "https://www.website.com"
    page = requests.get(URL)  # the page was never fetched; this line was missing
    #soup = BeautifulSoup(page.text, "html.parser").encode("utf-8")  # .encode() returns bytes, not a soup
    soup = BeautifulSoup(page.text, "html.parser")

    jobs = []
    for div in soup.find_all('div', attrs={'class':'row'}):
        for a in div.find_all('a', attrs={'data-tn-element':'jobTitle'}):
            print(a['title'])

    # duplicate of the loop above; the return used to sit inside the loop,
    # so at most one title was ever returned
    jobs = []
    for div in soup.find_all(name='div', attrs={'class':'row'}):
        for a in div.find_all(name='a', attrs={'data-tn-element':'jobTitle'}):
            jobs.append(a["title"])
    return jobs

def main():
    print(sc_data())

main()

I am doing basic web scraping. It alternates between a codec error on the character '\u2013' and the error above.
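For what it's worth, the '\u2013' failure happens when print() targets a Windows console whose 'charmap' codec cannot represent the en dash; a minimal stand-alone sketch of a common workaround (the title string here is made up):

```python
title = "Developer \u2013 Remote"  # en dash, the character the 'charmap' codec rejects

# Replace anything the target codec cannot represent instead of raising
safe = title.encode("ascii", errors="replace").decode("ascii")
print(safe)  # → Developer ? Remote

# On Python 3.7+, an alternative is to switch stdout to UTF-8:
#   import sys
#   sys.stdout.reconfigure(encoding="utf-8")
```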

  • The error in the title is probably due to attempting to treat a string as a BeautifulSoup object - you may have better luck making your first query more specific, rather than iterating twice. Additionally, consider using Python 3 if possible, as it has tremendously better unicode support! – ti7 Jan 27 '19 at 09:42
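ti7's suggestion of one more specific query instead of two nested loops can be sketched like this; the inline HTML and job titles are made up to stand in for the real page:

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for page.text from the real site
html = """
<div class="row">
  <a data-tn-element="jobTitle" title="Data Analyst">Data Analyst</a>
</div>
<div class="row">
  <a data-tn-element="jobTitle" title="Web Developer">Web Developer</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# One CSS selector replaces the two nested find_all loops
jobs = [a["title"] for a in soup.select('div.row a[data-tn-element="jobTitle"]')]
print(jobs)  # → ['Data Analyst', 'Web Developer']
```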

1 Answer


Your question lacks some details and is poorly documented.

It seems you are using a Windows machine for development.

You can follow the suggestions below, which may solve your problem; otherwise, please document more details about your code.

  1. Encode to UTF-8 while fetching from the remote server.
  2. Decode from UTF-8 while loading.
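A minimal sketch of these two steps with plain bytes (with requests, page.content is the raw bytes while page.text is already decoded for you); the sample string is made up:

```python
# Step 1: encode to UTF-8 while fetching/saving
text = "jobTitle \u2013 with an en dash"  # made-up sample data
raw = text.encode("utf-8")           # bytes, safe to write to disk

# Step 2: decode from UTF-8 while loading
loaded = raw.decode("utf-8")
assert loaded == text                # round-trip is lossless
print(loaded)
```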
– Debendra