Data Fusion - Issue with http post plugin

Question

I am trying to make an HTTP call using Data Fusion.

Source - GCS - csv file
Sink - HTTP POST

The API expects the file as part of the HTTP request.

When this is executed, I get the below error in the API logs.

Required request part 'file' is not present

How can this be achieved?

Did you follow any tutorial? How did you enable this plugin? Could you provide all the steps you took so the issue can be replicated? Are you getting any error? Are you using the GCP CLI? — PjoterS, Sep 08 '21 at 10:42
Using Data Fusion. Deployed the HTTP plugin from the Hub: https://cloud.google.com/data-fusion/plugins — aruna j, Sep 08 '21 at 16:15
And how are you using it? Are you getting any error? Can you provide some examples to replicate? — PjoterS, Sep 10 '21 at 10:51
Could you please provide more details on the steps you have followed? I'd like to replicate this in my environment. Is this the full error message? — PjoterS, Sep 16 '21 at 17:09
I have a CSV file in a GCS bucket and I am trying to push the file to an HTTP endpoint URL. The endpoint accepts only a file as part of the request. — aruna j, Sep 17 '21 at 07:31
What do you mean by a file as part of the request? Have you tried to configure it only via the UI? — PjoterS, Sep 20 '21 at 17:34
The endpoint expects the file to be sent in the request, just like uploading a file to SFTP. — aruna j, Sep 21 '21 at 05:12

score 0 · Answer 1 · edited Oct 01 '21 at 20:23

You can do this with the HTTP plugin, but the endpoint will receive only text: the file content arrives inside the HTTP body, not as an attached file.
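The error in the question, Required request part 'file' is not present, usually indicates that the API expects a multipart/form-data request containing a part named file, which is different from plain text in the body. The sketch below only illustrates what such a request body looks like. It uses the Python standard library, and the function name build_multipart_body, the file name data.csv, the boundary value and the sample rows are all hypothetical. It is not something the HTTP plugin is known to produce.

```python
import uuid


def build_multipart_body(field_name, filename, content, boundary=None):
    """Return (body, content_type) for a single-file multipart/form-data request.

    Hypothetical helper for illustration only.
    """
    boundary = boundary or uuid.uuid4().hex
    # Part header: the "name" attribute is what the API looks up as the request part
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n"
        "\r\n"
    ).encode("utf-8")
    # Closing delimiter that ends the multipart body
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


# Assumed sample data, not the CSV from the question
csv_bytes = b"id,name\n1,alice\n2,bob\n"
body, content_type = build_multipart_body("file", "data.csv", csv_bytes, boundary="demo")
```

A request whose body is only csv_bytes, with no part headers, carries the same data but has no part named file, which would explain the error the API logs.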

In Data Fusion, create a pipeline (you have already done this, but you can try a newer version).

GCS Configuration:

For an endpoint, I’ve created a VM in Google Compute Engine.

HTTP Plugin Configuration:

Before running the sink, install some kind of HTTP service on the endpoint, for example Tornado Web Server:

$ sudo apt install python
$ sudo apt install python-pip
$ pip install tornado

Create a script like the one below to observe the HTTP requests:

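#!/usr/bin/env python

import pprint

import tornado.ioloop
import tornado.web


class MyDumpHandler(tornado.web.RequestHandler):
    """Dump every incoming POST request to stdout."""

    def post(self):
        # Request line, headers and connection metadata
        pprint.pprint(self.request)
        # Raw request body as sent by the Data Fusion HTTP sink
        pprint.pprint(self.request.body)


if __name__ == "__main__":
    # Route every path to the dump handler and listen on port 8080
    tornado.web.Application([(r"/.*", MyDumpHandler)]).listen(8080)
    tornado.ioloop.IOLoop.current().start()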

Run this script with python echo.py or python3 echo.py, depending on which Python version is installed on the VM hosting the web server.
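Before running the pipeline, it can help to confirm that the dump server is reachable by sending it a small POST by hand. The following is a minimal sketch using only the standard library. The URL http://localhost:8080/, the helper name make_csv_post and the sample rows are assumptions for illustration, so adjust the host to the VM's address.

```python
import urllib.request


def make_csv_post(url, csv_bytes):
    """Build a POST request carrying raw CSV text in the body (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=csv_bytes,
        headers={"Content-Type": "text/csv"},
        method="POST",
    )


# Assumed address of the dump server and assumed sample rows
req = make_csv_post("http://localhost:8080/", b"id,name\n1,alice\n2,bob\n")

# Uncomment to actually send the request once the server is running:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

If the server is running, the request and its body should be printed in the terminal where echo.py was started.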

Below Response:

The CSV file contains only 2 rows for test purposes:

Why run the HTTP server before running the sink? — Zahid Khan, Sep 27 '22 at 04:24
