
I have a Python class that takes a URL as a parameter and launches a crawler on a news website.

Once the object has been created, it is stored in an Elasticsearch cluster.

I want to write a method that takes the Elasticsearch document as input and creates an object from it.

class NewsArticle():

    def __init__(self, url):
        self.url = url
        # Launch a crawler and fill in the other fields like author, date, etc.

    @classmethod
    def from_elasticsearch(cls, elasticsearch_article):
        document = elasticsearch_article['_source']
        obj = cls(document['url'])
        obj.url = document['url']
        obj.author = document['author']
        # ... (remaining fields elided)
        return obj

The problem is, when I'm calling...

# response is my document from elasticsearch
res = NewsArticle.from_elasticsearch(response)

...`__init__` is called and launches my crawler. Is there any way to avoid launching the crawler, or to avoid calling `__init__` at all?

Mathieu Rodic
mel
  • So you want to create an object without initializing it? – Steven Summers Feb 13 '17 at 10:39
  • Maybe you should not have that crawler stuff in your `__init__`. – khelwood Feb 13 '17 at 10:41
  • @StevenSummers I would like to know if I can have two different constructors. – mel Feb 13 '17 at 10:43
  • No, Python doesn't support method overloading. It does, however, let you provide optional arguments, so you can pass a flag (`bool`) to select other behavior. Or use other methods to set values. Or, as khelwood mentioned, restructure your code so the crawler runs only when you explicitly call it. – Steven Summers Feb 13 '17 at 10:52

1 Answer


How about a simple `if` and a `crawl` parameter that defaults to `True`:

class NewsArticle():

    def __init__(self, url, crawl=True):
        self.url = url
        if crawl:
            # Launch a crawler and fill in the other fields like author, date, etc.
            pass

    @classmethod
    def from_elasticsearch(cls, elasticsearch_article):
        document = elasticsearch_article['_source']
        # crawl=False skips the crawler; __init__ already sets obj.url
        obj = cls(document['url'], crawl=False)
        obj.author = document['author']
        return obj
Mike Müller
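On the literal question of skipping `__init__`: Python can allocate an instance without running `__init__` by calling `cls.__new__(cls)`. The following is only a minimal sketch, assuming the document carries the `url` and `author` fields from the question. The `crawled` attribute and the sample `response` are hypothetical and exist only to show that `__init__` was bypassed.

```python
class NewsArticle:

    def __init__(self, url):
        self.url = url
        # Hypothetical stand-in for the real crawler, for illustration only.
        self.crawled = True

    @classmethod
    def from_elasticsearch(cls, elasticsearch_article):
        document = elasticsearch_article['_source']
        # __new__ allocates the instance without calling __init__,
        # so the crawler stand-in above never runs.
        obj = cls.__new__(cls)
        obj.url = document['url']
        obj.author = document['author']
        return obj


# Hypothetical document shaped like an Elasticsearch hit.
response = {'_source': {'url': 'https://example.com/a', 'author': 'Jane Doe'}}
article = NewsArticle.from_elasticsearch(response)
```

The trade-off is that every attribute `__init__` would normally set has to be assigned by hand, so the `crawl=False` flag is usually easier to maintain.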
  • I did what you suggested, and I also moved my crawler out of `__init__` into a class method called `from_crawled`. The `if` statement calls `from_crawled` or `from_elasticsearch` depending on the argument passed to `__init__`. Is that a good architecture? Should I use `@classmethod`? – mel Feb 13 '17 at 11:02
  • Sounds good to me. The use of a class method also looks reasonable. – Mike Müller Feb 13 '17 at 11:12
  • Which do you recommend: using `__init__` only to define my attributes and using class methods to create my objects, or putting an `if` statement in `__init__` and passing the creation method and the Elasticsearch response as parameters? – mel Feb 13 '17 at 11:16
  • I would make it possible to create a fully working instance without needing a class method. So an `if` in `__init__` seems fine. – Mike Müller Feb 13 '17 at 13:52
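For comparison, the first option mel describes (an `__init__` that only stores attributes, with class methods acting as the constructors) could look roughly like the sketch below. The `_crawl` helper, its return value, and the `date` field handling are assumptions made for illustration; the real crawler is not shown anywhere in the thread.

```python
class NewsArticle:

    def __init__(self, url, author=None, date=None):
        # __init__ only stores attributes; it never starts the crawler.
        self.url = url
        self.author = author
        self.date = date

    @staticmethod
    def _crawl(url):
        # Hypothetical stand-in for the real crawler.
        return {'author': 'unknown', 'date': None}

    @classmethod
    def from_crawled(cls, url):
        # Run the crawler, then build the object from its results.
        fields = cls._crawl(url)
        return cls(url, author=fields['author'], date=fields['date'])

    @classmethod
    def from_elasticsearch(cls, elasticsearch_article):
        # Build the object from a stored document; no crawling involved.
        document = elasticsearch_article['_source']
        return cls(document['url'],
                   author=document.get('author'),
                   date=document.get('date'))
```

With this layout, `NewsArticle.from_crawled(url)` and `NewsArticle.from_elasticsearch(response)` make the source of the data explicit at the call site, at the cost of `NewsArticle(url)` alone no longer producing a fully populated article.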