I have an assignment for which I have to build an API (I use FastAPI) that accepts POST requests and stores the raw data in a Mongo database. This data should then be transformed and stored in a different table. The assignment says it "would be nice" if the API and the transformation were separated (they propose some messaging service between the two components, but I have no idea what that is).

I did some research and came across Python's subprocess module, but before I go on with that, I would like to check whether my assumptions are actually correct. Is it possible to call subprocess.run from within the API code, passing the body of the POST request as an argument to the transformation script? If yes, what would the general syntax be? If there is a better way of doing what I want to do, I am open to anything.

Right now, the transformation reads the raw data from the Mongo collection, but I don't know how to trigger the transform script every time new raw data is stored in the database.

Thanks a lot in advance!

tripleee
Moritz
  • You can pass a command-line argument or provide an `input=` parameter to `run` or `communicate`. All of this is well explained in the `subprocess` documentation. – tripleee Mar 21 '22 at 08:58
  • If the assignment proposes a messaging service, why not look it up and use one? – Jiří Baum Mar 21 '22 at 09:09
  • Your textbook or course materials will probably have a description and examples of what's expected; or you can start with something like https://en.wikipedia.org/wiki/Message-oriented_middleware - but better to check the course materials, to match expectations – Jiří Baum Mar 21 '22 at 09:12

1 Answer

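Your approach using subprocess is a valid solution for this problem, and it works as long as everything runs within a single instance of your code, i.e. the API and the transformation script live on the same machine.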
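As a rough sketch of the subprocess route, the request body can be serialized to JSON and handed to the child process on stdin via the `input=` parameter of `subprocess.run`. The inline `TRANSFORM_SCRIPT` and the `run_transform` helper are hypothetical stand-ins for your own transformation script, and the upper-casing is only a placeholder for the real transformation.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the transformation script. It reads one JSON
# document from stdin, upper-cases string values, and writes JSON to stdout.
TRANSFORM_SCRIPT = """
import json, sys
record = json.load(sys.stdin)
out = {k: v.upper() if isinstance(v, str) else v for k, v in record.items()}
json.dump(out, sys.stdout)
"""


def run_transform(body: dict) -> dict:
    """Pass the POST body to a child Python process and return its output."""
    result = subprocess.run(
        # With a real script file this would be [sys.executable, "transform.py"]
        # (file name assumed for illustration).
        [sys.executable, "-c", TRANSFORM_SCRIPT],
        input=json.dumps(body),  # sent to the child's stdin
        capture_output=True,     # collect stdout and stderr
        text=True,               # str instead of bytes
        check=True,              # raise CalledProcessError on non-zero exit
    )
    return json.loads(result.stdout)
```

Inside a FastAPI handler you would call something like `run_transform(body)` after storing the raw document. Note that `subprocess.run` blocks until the child exits, so the request waits for the transformation unless you move the call into a background task.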

A solution more commonly used in production is Redis.

In case you are not familiar with it, Redis is a very fast in-memory key-value store (a NoSQL database) whose publish/subscribe feature can be used as a simple message broker between the API and the transformation.

You can run Redis on your local system (it officially supports Linux and macOS; on Windows you will need Docker or WSL) and use a Redis client library to access it.

Creating a task would look like this

    import json
    import redis
    redis_conn = redis.Redis()  # assumes a Redis server on localhost:6379
    redis_conn.publish('new-task', json.dumps({"foo": "bar"}))

And subscribing to that channel would look like this

    pubsub = redis_conn.pubsub()
    pubsub.subscribe("new-task")
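To show how the two snippets above could fit together, here is a minimal sketch of a publisher and a worker loop built on the redis-py pub/sub calls. The helper names `publish_task`, `decode_task`, and `run_worker` are made up for illustration, and `redis_conn` is assumed to be a connected `redis.Redis()` client.

```python
import json

CHANNEL = "new-task"  # channel name used in the snippets above


def publish_task(redis_conn, payload: dict) -> int:
    """API side: publish one task as JSON.

    Returns the number of subscribers that received the message.
    """
    return redis_conn.publish(CHANNEL, json.dumps(payload))


def decode_task(message: dict):
    """Return the JSON payload of a pub/sub message, or None otherwise.

    listen() also yields control messages (e.g. the 'subscribe'
    confirmation), which carry no task data and are skipped here.
    """
    if message.get("type") != "message":
        return None
    return json.loads(message["data"])


def run_worker(redis_conn, transform) -> None:
    """Worker side: block forever and call transform(payload) per task."""
    pubsub = redis_conn.pubsub()
    pubsub.subscribe(CHANNEL)
    for message in pubsub.listen():
        payload = decode_task(message)
        if payload is not None:
            transform(payload)
```

The worker would run as its own process, for example `run_worker(redis.Redis(), my_transform)` with your own transformation function, while the API calls `publish_task` after storing the raw data. Keep in mind that Redis pub/sub is fire-and-forget: a message published while no worker is subscribed is simply dropped, so the worker has to be running before the API starts publishing.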

You can check the Redis documentation for more details.

Sayanc2000