1

I have been working on a small project, a web-crawler template. I'm having an issue in PyCharm where I get the warning "Unresolved attribute reference 'domain' for class 'Scraper'":

from abc import abstractmethod

import requests
import tldextract


class Scraper:
    scrapers = {}

    def __init_subclass__(scraper_class):
        Scraper.scrapers[scraper_class.domain] = scraper_class # Unresolved attribute reference 'domain' for class 'Scraper'

    @classmethod
    def for_url(cls, url):
        k = tldextract.extract(url)
        # Returns a scraper instance, e.g. <scraper.SydsvenskanScraper object at 0x000001E94F135850>,
        # whose scrape() gives e.g. "Scraped BBC News<!DOCTYPE html><html". Which type annotation fits here?
        return cls.scrapers[k.registered_domain](url)

    @abstractmethod
    def scrape(self):
        pass


class BBCScraper(Scraper):
    domain = 'bbc.co.uk'

    def __init__(self, url):
        self.url = url

    def scrape(self):
        rep: requests.Response = requests.get(self.url)
        return "Scraped BBC News" + rep.text[:20]  # first 20 characters of the HTML


class SydsvenskanScraper(Scraper):
    domain = 'sydsvenskan.se'

    def __init__(self, url):
        self.url = url

    def scrape(self):
        rep: requests.Response = requests.get(self.url)
        return "Scraped Sydsvenskan News" + rep.text[:20]  # first 20 characters of the HTML


if __name__ == "__main__":
    URLS = ['https://www.sydsvenskan.se/', 'https://www.bbc.co.uk/']
    for url in URLS:
        get_product = Scraper.for_url(url)
        r = get_product.scrape()
        print(r)

Of course I could ignore it since the code works, but I don't like ignoring warnings: I believe PyCharm is smart, so I would rather resolve the warning than suppress it. What is the reason it warns me about this?

PythonNewbie
  • Note for here and the other https://stackoverflow.com/questions/67669212/how-to-call-correct-class-from-url-domain : You may think about [accepting an answer](https://stackoverflow.com/help/someone-answers) to reward the most useful answer – azro May 24 '21 at 17:46
  • @azro Sorry! I hadn't accepted the answer even though I thought I had, and I was the one who gave the +1, haha! Just accepted ❤️ – PythonNewbie May 24 '21 at 17:49

3 Answers

3

There are a few ways to remove this warning, at different levels:

  • Assign a default value:
class Scraper:
    scrapers = {}
    domain = None  # Or a sensible value if one exists

  • In addition, or alternatively, you can annotate the type:
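from typing import ClassVar

class Scraper:
    # The registry maps a domain to a Scraper subclass (the class itself, not an instance).
    # Built-in generics such as dict[...] and type[...] need Python 3.9+; on 3.8 use typing.Dict and typing.Type.
    scrapers: ClassVar[dict[str, type['Scraper']]] = {}
    domain: ClassVar[str]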

Note that ClassVar is required because otherwise type checkers assume that these are instance attributes.
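
As a minimal sketch of how the annotated version fits together with `__init_subclass__`: the `ExampleScraper` class and the `example.com` domain are made up for illustration, and the network call is left out so the snippet runs offline. It assumes `from __future__ import annotations` is acceptable in your project; without it, `Scraper` inside the annotation has to be written as a string.

```python
from __future__ import annotations  # annotations are not evaluated, so this also runs on Python 3.8

from typing import ClassVar


class Scraper:
    # Registry mapping a registered domain to the Scraper subclass that handles it.
    scrapers: ClassVar[dict[str, type[Scraper]]] = {}
    # Declared but not assigned here: every subclass is expected to set it.
    domain: ClassVar[str]

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Scraper.scrapers[cls.domain] = cls

    def __init__(self, url: str) -> None:
        self.url = url


class ExampleScraper(Scraper):  # hypothetical subclass, for illustration only
    domain = "example.com"


# Look up the class by domain and instantiate it, as for_url() does in the question.
scraper = Scraper.scrapers["example.com"]("https://example.com/")
print(type(scraper).__name__, scraper.url)
```

Because `domain` is only annotated on the base class, `Scraper` itself still has no `domain` value at runtime; the annotation just tells PyCharm that subclasses provide one.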

MegaIng
  • Oh so basically the "fix" is to declare the attribute before the init subclass? – PythonNewbie May 24 '21 at 17:50
  • I do get an error when using `scrapers: ClassVar[dict[str, Scraper]] = {}` >>> `Unresolved reference 'Scraper' ` – PythonNewbie May 24 '21 at 18:03
  • @ProtractorNewbie Sorry. You need to either add `from __future__ import annotations` or put `Scraper` in a string (as in my edited answer) – MegaIng May 24 '21 at 18:05
  • Is there a reason why to use one over the other? Just curious – PythonNewbie May 24 '21 at 18:24
  • @ProtractorNewbie `from __future__ import annotations` will become the default behavior in, I think, Python 3.11? It also means that you don't have to put lots of annotations in strings, i.e. less effort. – MegaIng May 24 '21 at 19:14
  • Oh, I'm using 3.8 myself, maybe that's why I need to import the annotations? Unless it's my PyCharm that screams at me – PythonNewbie May 24 '21 at 19:14
  • 1
    @ProtractorNewbie Python 3.11 is not out yet. The next version to come out (3.10) will arrive in a few months. `from __future__ import annotations` is a magic import that prevents type annotations (i.e. most stuff after a `:`) from being evaluated. That means you can 'reference' stuff even if it doesn't exist at that moment. Here is a more complete explanation: https://stackoverflow.com/questions/61544854/from-future-import-annotations – MegaIng May 24 '21 at 19:17
3

To ignore it, put

# noinspection PyUnresolvedReferences

on the line above the line causing the warning.
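
For the class in the question, that placement would look roughly like the sketch below. It is trimmed to the relevant method, and `DemoScraper` is a made-up subclass used only to show that the code still runs. The comment silences PyCharm's inspection for the statement that follows it and has no effect at runtime.

```python
class Scraper:
    scrapers = {}

    def __init_subclass__(scraper_class):
        # noinspection PyUnresolvedReferences
        Scraper.scrapers[scraper_class.domain] = scraper_class


class DemoScraper(Scraper):  # hypothetical subclass, for illustration only
    domain = "example.com"


print(Scraper.scrapers)
```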

Joshua Wolff
0

Just tell your Scraper class that this attribute exists:

class Scraper:
    scrapers = {}
    domain: str

    def __init_subclass__(scraper_class):
        Scraper.scrapers[scraper_class.domain] = scraper_class
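
One thing worth knowing about this approach: a bare annotation such as `domain: str` is only a declaration for tools like PyCharm and does not create the attribute at runtime. A small sketch to illustrate (the `DemoScraper` subclass is made up):

```python
class Scraper:
    scrapers = {}
    domain: str  # annotation only; no value is assigned

    def __init_subclass__(scraper_class):
        Scraper.scrapers[scraper_class.domain] = scraper_class


class DemoScraper(Scraper):  # hypothetical subclass, for illustration only
    domain = "example.com"


print("domain" in Scraper.__annotations__)  # the annotation is recorded on the class
print(hasattr(Scraper, "domain"))           # but the base class has no such attribute
print(DemoScraper.domain)                   # the subclass supplies the actual value
```

A subclass that forgets to define `domain` would therefore still fail with an `AttributeError` when the class is created, since `__init_subclass__` looks the attribute up immediately.
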
cezar
azro
  • 2
    Honestly, this is a bad answer: spelling mistakes, doesn't fully explain the problem and doesn't use type hints correctly. – MegaIng May 24 '21 at 17:47