
I am trying to make an HTTP call using Data Fusion.

  1. Source - GCS - csv file
  2. Sink - HTTP POST

The API expects the file as part of the HTTP request.


When this is executed, I get the below error in the API logs.

Required request part 'file' is not present

How can this be achieved?

aruna j
  • Did you follow any tutorial? How did you enable this plugin? Could you provide all the steps you took so the issue can be replicated? Are you getting any error? Are you using the GCP CLI? – PjoterS Sep 08 '21 at 10:42
  • Using Data Fusion. Deployed the HTTP plugin from the Hub: https://cloud.google.com/data-fusion/plugins – aruna j Sep 08 '21 at 16:15
  • And how are you using it? Are you getting any error? Can you provide some examples to replicate? – PjoterS Sep 10 '21 at 10:51
  • Question has been edited for better understanding. – aruna j Sep 15 '21 at 09:41
  • Could you please provide more details about the steps you followed? I'd like to replicate this in my environment. Is this the full error message? – PjoterS Sep 16 '21 at 17:09
  • I have a CSV file in a GCS bucket and I am trying to push the file to an HTTP endpoint URL. The endpoint accepts only a file as part of the request. – aruna j Sep 17 '21 at 07:31
  • What do you mean by a file as part of the request? Have you tried to configure it only via the UI? – PjoterS Sep 20 '21 at 17:34
  • The endpoint URL expects the file to be sent in the request, just like uploading a file to SFTP. – aruna j Sep 21 '21 at 05:12

1 Answer


You can do this with the HTTP plugin, but the endpoint will only receive the file content as text inside the HTTP request body.
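The error in the question ("Required request part 'file' is not present") suggests that the API expects a multipart/form-data request with a part named file, which is not the same as raw text in the body. As a rough illustration of that difference, and not a description of what the plugin sends, the sketch below builds such a body with the Python standard library and lists the part names a server would find in it. The helpers build_multipart and part_names and the filename data.csv are hypothetical names chosen for this example.

```python
import uuid
from email.parser import BytesParser
from email.policy import default


def build_multipart(field_name, filename, content, boundary=None):
    """Build a multipart/form-data body holding a single file part.

    Returns (content_type_header_value, body_bytes).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return f"multipart/form-data; boundary={boundary}", head + content + tail


def part_names(content_type, body):
    """Return the form field names found in a request body.

    A body that is not multipart (for example plain CSV text) has no
    named parts, so the result is an empty list.
    """
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = BytesParser(policy=default).parsebytes(raw)
    if not msg.is_multipart():
        return []
    return [
        part.get_param("name", header="content-disposition")
        for part in msg.iter_parts()
    ]


csv_bytes = b"id,name\n1,alice\n2,bob\n"

# Multipart request: the server can find a part called "file".
ctype, body = build_multipart("file", "data.csv", csv_bytes)
print(part_names(ctype, body))

# Raw text body: there is no part called "file" at all.
print(part_names("text/plain", csv_bytes))
```

If the endpoint really does require a multipart part, a raw-body POST will keep producing the same error regardless of the body content.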

In Data Fusion, create a pipeline (you have already done this, but you can try a newer version).

GCS Configuration:

For an endpoint, I’ve created a VM in Google Compute Engine.

HTTP Plugin Configuration:

Before running the sink, install some kind of HTTP service on the VM, for example the Tornado web server:

$ sudo apt install python3
$ sudo apt install python3-pip
$ pip3 install tornado

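Create a script like the one below to observe the incoming HTTP requests:

#!/usr/bin/env python
"""Minimal Tornado server that prints every incoming POST request."""

import pprint

import tornado.ioloop
import tornado.web


class MyDumpHandler(tornado.web.RequestHandler):
    def post(self):
        # Print the request object's summary, then the raw request body.
        pprint.pprint(self.request)
        pprint.pprint(self.request.body)


if __name__ == "__main__":
    # Route every path to the dump handler and listen on port 8080.
    tornado.web.Application([(r"/.*", MyDumpHandler)]).listen(8080)
    tornado.ioloop.IOLoop.current().start()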

Save the script as echo.py and run it with python echo.py or python3 echo.py, depending on which interpreter is installed on the VM hosting the web server.
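Before running the pipeline, it can help to confirm that the web server is reachable and prints what it receives. The following is a minimal sketch using only the Python standard library; post_text is a hypothetical helper name, and the text/csv content type is an assumption made for this test, not something the plugin is known to send.

```python
import urllib.request


def post_text(url, text, content_type="text/csv"):
    """POST text as the raw request body and return (status, response_body)."""
    request = urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )
    # urlopen raises urllib.error.URLError if the server cannot be reached.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, response.read()
```

For example, calling post_text("http://localhost:8080/", "id,name\n1,alice\n") on the VM itself (port 8080 matches the script above) should make the server print the request and the body. From another machine, replace localhost with the VM's address and make sure the firewall allows the port.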

Response received by the web server:

The CSV file contains only two rows, for testing purposes:

Kara
PjoterS