I'm trying to add multiprocessing to my Flask app, which parses RSS/Atom feeds, and I've run into a new error.

Here's my very rudimentary-looking code, the parsing function:

def parallelParse(obj1,obj2, app):
    parsed = []
    d = feedparser.parse(obj1)
    modified = time.mktime(d.feed.get('modified_parsed', None))

    if modified != obj2.feedModified:
        obj2.feedModified = modified

        with app.app_context():
            db.session.commit()

and the view function:

@feeds.route('/')
def home():
    #Query the feed URL and Modified records and store them in a list of tuples

    FEEDS = db.session.query(Feed).all()

    try:
        p = multiprocessing.Process(target=parallelParse, args=((item.feedURL,item) for item in FEEDS), kwargs={'app' : current_app._get_current_object()})
        p.start()
        p.join()
    except AttributeError as e:
        print(e)

    feed_entries = FeedEntryTest.query.order_by(FeedEntryTest.TestEntryTime.desc()).all()
    return render_template("main.html", feed_entries = feed_entries)

Every time I run this, however, I get the following error:

Can't pickle local object 'SQLAlchemy.init_app.<locals>.shutdown_session'
127.0.0.1 - - [03/Apr/2018 12:50:02] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [03/Apr/2018 12:50:02] "GET /static/css/bootstrap.min.css HTTP/1.1" 200 -
127.0.0.1 - - [03/Apr/2018 12:50:02] "GET /static/css/custom.css HTTP/1.1" 200 -
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\users\imran-pc\appdata\local\programs\python\python36\Lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "c:\users\imran-pc\appdata\local\programs\python\python36\Lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

After some research I learned that only certain types of objects are 'picklable'. So I tried converting my query result into a dict as shown here, but I got another error while doing so.
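
To make the 'picklable' point concrete, here is a minimal sketch of what the error message is complaining about. The names `make_app_like` and `is_picklable` are made up for illustration and are not part of Flask or SQLAlchemy; the nested function only imitates the `shutdown_session` closure named in the error. Wrapping such an object in a dict does not help, because pickle still has to serialize everything inside the dict.

```python
import pickle


def make_app_like():
    """Hypothetical stand-in for an app object that holds a locally defined function."""
    def shutdown_session():  # defined inside another function, so pickle cannot locate it by name
        pass
    return {"teardown": shutdown_session}


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type for local objects differs between Python versions.
        return False


# Plain data such as a URL string and an integer id pickles without trouble.
plain_args = ("http://example.com/rss", 1)
```

Here `is_picklable(plain_args)` is true, while `is_picklable(make_app_like())` is false even though the outer object is just a dict.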

Am I approaching the problem the wrong way? What else should I try to solve this problem?

Imran Said
  • You cannot pickle your app context. It does not matter whether you put it in a dictionary or not. You should redesign your program so that you do not need the context in your subprocesses. Offload to processes only those operations that do not need to interface with your application context. – Hannu Apr 03 '18 at 11:21
  • It seems that was the case after all: after removing the call that commits the DB changes, which is what required the app context in the first place, the error appears to be gone. Thank you for the help! Back to the drawing board for me, I guess. – Imran Said Apr 03 '18 at 13:06
  • If you only need database connections in your subprocesses, start using SQLAlchemy and open a new connection to the database in each subprocess. This is the correct way of doing it instead of trying to share a common handle between processes or threads. – Hannu Apr 03 '18 at 14:43
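
The suggestion in the comments (pass nothing unpicklable, open a new connection inside each subprocess) could look roughly like the sketch below. To keep it self-contained it uses the standard-library `sqlite3` module as a stand-in for SQLAlchemy. The table layout and the names `create_demo_db`, `update_feed_modified` and `refresh_in_subprocesses` are hypothetical and do not come from the original app. Only plain values (a path, an id, a timestamp) cross the process boundary, so nothing like the app or a model instance has to be pickled.

```python
import multiprocessing
import sqlite3


def create_demo_db(db_path):
    """Create a tiny hypothetical 'feed' table with one row (illustration only)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE feed (id INTEGER PRIMARY KEY, feed_url TEXT, feed_modified REAL)"
    )
    conn.execute(
        "INSERT INTO feed (id, feed_url, feed_modified) "
        "VALUES (1, 'http://example.com/rss', NULL)"
    )
    conn.commit()
    conn.close()


def update_feed_modified(db_path, feed_id, modified):
    """Worker: open a fresh connection in this process, update one row, commit.

    Returns the number of rows changed (0 if the stored value was already current).
    """
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "UPDATE feed SET feed_modified = ? "
            "WHERE id = ? AND feed_modified IS NOT ?",
            (modified, feed_id, modified),
        )
        conn.commit()
        return cur.rowcount
    finally:
        conn.close()


def refresh_in_subprocesses(db_path, updates):
    """Start one process per (feed_id, modified) pair and wait for all of them."""
    procs = [
        multiprocessing.Process(
            target=update_feed_modified, args=(db_path, feed_id, modified)
        )
        for feed_id, modified in updates
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]
```

On Windows, where child processes are spawned rather than forked, `refresh_in_subprocesses` should be called from code reached through an `if __name__ == "__main__":` guard, and the worker has to be a module-level function as it is here.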
