1

Is there a way in NiFi to run a Python script that imports modules from a different folder, has its requirements specified in a Pipfile, and takes arguments?

In short, how do I use NiFi to execute a Python script that normally runs in my virtual environment?

My end goal is to pick up a file using GetFile and post it to an API. I have tried the ExecuteProcess and ExecuteStreamCommand processors.

yawwml
  • I am not sure I follow the question -- NiFi has the capability to pick up a file, read the content, and post the content to an API natively. Why is Python necessary for this task? – Andy Apr 22 '20 at 02:53
  • Are you looking for the ExecuteScript processor? https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-scripting-nar/1.5.0/org.apache.nifi.processors.script.ExecuteScript/index.html – maxime G Apr 22 '20 at 09:07
  • Hi @Andy, yes, but I have a bunch of processing left to do after picking up a file, and at the moment it is in Python. – yawwml Apr 22 '20 at 13:43

2 Answers

2

To perform follow-on processing on the flowfile using Python, you can use the ExecuteStreamCommand or ExecuteScript/InvokeScriptedProcessor processors.

The ExecuteStreamCommand processor runs an external command, such as python3 my_python_script.py -arg1 string -arg2 213, which can wrap custom Python code. It passes the existing flowfile content to the command on STDIN and captures the command's STDOUT as the new flowfile content. Set the processor's Command Path property to your python executable, and set Command Arguments to the script and its CLI arguments, which can include flowfile attributes via NiFi Expression Language. See this answer for an example.
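As a rough illustration of the STDIN/STDOUT contract described above, here is a minimal sketch of what my_python_script.py could look like. The -arg1 and -arg2 options mirror the example command line; the transform function (prefix each line, keep the first N lines) is purely hypothetical and stands in for whatever processing the real script does. As far as I know, ExecuteStreamCommand splits Command Arguments on its Argument Delimiter (";" by default), so in the processor the arguments would be entered as my_python_script.py;-arg1;string;-arg2;213.

```python
import argparse
import sys


def parse_args(argv):
    # -arg1 / -arg2 mirror the example command line; their meaning here is made up.
    parser = argparse.ArgumentParser(description="Transform flowfile content")
    parser.add_argument("-arg1", type=str, default="", help="prefix added to each line")
    parser.add_argument("-arg2", type=int, default=0, help="max lines to keep (0 = all)")
    return parser.parse_args(argv)


def transform(text, prefix, limit):
    # Hypothetical processing step: keep the first `limit` lines and prefix each one.
    lines = text.splitlines()
    if limit > 0:
        lines = lines[:limit]
    return "".join(f"{prefix}{line}\n" for line in lines)


def main():
    args = parse_args(sys.argv[1:])
    # NiFi pipes the incoming flowfile content to STDIN; skip the read when the
    # script is started by hand from a terminal so it does not block.
    data = "" if sys.stdin.isatty() else sys.stdin.read()
    # Whatever is written to STDOUT becomes the content of the outgoing flowfile.
    sys.stdout.write(transform(data, args.arg1, args.arg2))


if __name__ == "__main__":
    main()
```

Outside NiFi the same script can be tried with a shell pipe, for example by piping a small text file into it, which makes it easy to debug before wiring it into a flow.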

The ExecuteScript processor runs Jython code in the same JVM as NiFi. Jython is Python, but limited to Python 2.7 compatibility, without access to native libraries, and with some other restrictions due to JSR-223. You can process the flowfile attributes and content directly with Python code. See this answer or this answer for more details.

Andy
  • This only answers half of the question. If the python script uses a conda virtual environment, is there anything different that needs to happen? If you are using python's venv, is there anything else you have to do other than point the 'Command Path' at the venv version of python? – David Jun 14 '22 at 23:41
0

Working case.

  • I have a venv environment, previously created through my code editor, in the C:\python_nifi_env\.venv folder.
  • I have a Python file named doing_pandas_thing.py, which imports the pandas library, in the same C:\python_nifi_env directory.
  • In the NiFi UI I created an "ExecuteStreamCommand" processor, where
    • Command Path: c:\python_nifi_env\.venv\Scripts\python.exe
    • Command Arguments: C:\python_nifi_env\doing_pandas_thing.py

And that's all.

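# doing_pandas_thing.py contents

import numpy as np
import pandas as pd

# Draw 1000 samples from a normal distribution (mean 0, std 0.1).
df = pd.DataFrame(np.random.normal(0, 0.1, 1000), columns=['number'])
# The output folder must already exist; to_csv does not create it.
df.to_csv("C:\\python_nifi_env\\output\\result.csv")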
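
To confirm that NiFi really launches the script with the venv interpreter, and to import modules that live in a different folder, something like the following sketch could be placed at the top of the script. The helper names (in_virtualenv, add_module_dir, report) are made up for illustration. The check relies on sys.prefix differing from sys.base_prefix inside a venv; a conda environment is not detected this way, so sys.executable is the more general clue there.

```python
import os
import sys


def in_virtualenv():
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def add_module_dir(path):
    # Put a folder of importable modules at the front of sys.path (only once).
    path = os.path.abspath(path)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


def report(stream=sys.stderr):
    # Write diagnostics to STDERR so they do not end up in the flowfile
    # content, which ExecuteStreamCommand takes from STDOUT.
    stream.write(f"python executable: {sys.executable}\n")
    stream.write(f"running in venv: {in_virtualenv()}\n")
```

A script would call add_module_dir with the folder that holds the shared modules before importing them, and call report() once at startup while debugging the flow.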
Eric Aya
Andrey Morozov