I am planning to write Python code in the ExecuteScript processor. As this is my first time, I am finding it difficult to get started.

Basically, I want to read a flowfile (CSV), do some manipulation, and write the result back to a flowfile.

Is there a way to write the code beforehand, say in Jupyter, and then replicate it in the processor?

Also, is there any syntax documentation for writing the code?

ExecuteStreamCommand:

import org.apache.commons.io.IOUtils
import java.io
import csv 

# Get flowFile Session
flowFile = session.get()

# Read the CSV content from stdin
readFile = csv.reader(sys.stdin)
for row in readFile:
    new_value = row[0]
if (flowFile != None):
    flowFile = session.putAttribute(flowFile, "from_python_string", "python string example")
    flowFile = session.putAttribute(flowFile, "from_python_number", str(new_value))

session.transfer(flowFile, REL_SUCCESS)
session.commit()

Command Arguments: C:\Users\Desktop\samp1.py
Command Path: C:\Users\AppData\Local\Programs\Python\Python37-32\python

When I execute it, it throws an error on the import statement saying that no module was found.

Thanks in advance.

natarajan k
  • Nice post to start with: https://community.hortonworks.com/articles/75032/executescript-cookbook-part-1.html – daggett Jun 03 '19 at 09:08
  • I am using ExecuteStreamCommand and facing the error described in my updated question. Please help me. – natarajan k Jun 03 '19 at 10:03
  • If you are going to use ExecuteStreamCommand, then you don't need to work with flowfiles. You will get the flowfile content on stdin, and you have to write the transformed content to stdout. That way you'll be able to debug your Python script without NiFi. – daggett Jun 03 '19 at 10:13
  • Thanks @daggett. My Python script is working with stdin and stdout. Is there any way to read attribute values and set a new attribute from the Python script in the ExecuteStreamCommand processor? – natarajan k Jun 05 '19 at 05:23

1 Answer

Matt Burgess wrote a script tester tool which can accept a Jython script and test it. Not quite the interactive environment you're looking for, but probably as close as exists out of the box.

The code you write for ExecuteScript and for ExecuteStreamCommand will be very different. The core logic may be the same, but the way your code reads and writes flowfile attributes and content will differ: ExecuteStreamCommand runs Python as an external process outside of the NiFi runtime, so the script has no awareness of NiFi-specific features such as the session or the flowfile. See this answer for more details on how to write for ExecuteStreamCommand and this answer for ExecuteScript.
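As a rough illustration of the ExecuteStreamCommand style, here is a minimal sketch of a script that uses only the standard library and touches no NiFi classes. It assumes the flowfile content arrives on stdin as CSV and that whatever is written to stdout becomes the outgoing content. The names `transform_rows` and `process` are hypothetical, and the upper-casing step is only a placeholder for the real manipulation.

```python
import csv
import sys


def transform_rows(rows):
    """Placeholder manipulation (an assumption): upper-case the first column."""
    for row in rows:
        if row:
            row[0] = row[0].upper()
        yield row


def process(in_stream, out_stream):
    """Read CSV rows from in_stream, transform them, write CSV to out_stream."""
    reader = csv.reader(in_stream)
    writer = csv.writer(out_stream, lineterminator="\n")
    for row in transform_rows(reader):
        writer.writerow(row)


if __name__ == "__main__":
    # Under ExecuteStreamCommand, stdin carries the incoming flowfile content
    # and stdout is captured as the outgoing content.
    process(sys.stdin, sys.stdout)
```

Because the script depends only on stdin and stdout, it can be developed and debugged outside NiFi first, for example in Jupyter by calling `process` with `io.StringIO` objects, or from a shell by redirecting a sample CSV file into it.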

Andy
  • Thanks @Andy. My Python script in the ExecuteStreamCommand processor is working with stdin and stdout. Is there any way to read attribute values and update or add an attribute? – natarajan k Jun 05 '19 at 05:24
  • You can pass attributes in as arguments, but there is currently no easy way to update an existing attribute with output. – Andy Jun 05 '19 at 05:41
  • Thanks @Andy. Final question: can we add new attributes? – natarajan k Jun 05 '19 at 05:51
  • You can use an `UpdateAttribute` processor or another processor such as `EvaluateJsonPath` to update attributes later in the flow. – Andy Jun 05 '19 at 05:52
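
To make the "pass attributes in as arguments" suggestion concrete, here is a hedged sketch. It assumes the Command Arguments property is set to something like `C:\Users\Desktop\samp1.py;${filename}`, so that NiFi evaluates the expression and hands the attribute value to the script as an extra argument (check the processor's argument delimiter setting in your version). The function name `append_column` and the choice to append the value as a new CSV column are illustrative assumptions, not part of the original thread.

```python
import csv
import sys


def append_column(in_stream, out_stream, value):
    """Copy CSV rows from in_stream to out_stream, appending value to each row."""
    reader = csv.reader(in_stream)
    writer = csv.writer(out_stream, lineterminator="\n")
    for row in reader:
        writer.writerow(row + [value])


if __name__ == "__main__":
    # Assumption: sys.argv[1] holds an attribute value passed in through
    # Command Arguments; fall back to an empty string when it is missing.
    attribute_value = sys.argv[1] if len(sys.argv) > 1 else ""
    append_column(sys.stdin, sys.stdout, attribute_value)
```

Since the script can only emit content, anything it computes has to travel in the output; a later processor in the flow can then turn part of that content into an attribute.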