Using Pentaho PDI 6, with:
A) a CSV Input step reading a 4-row .csv file (from IBM) with ASCII file encoding (Preview rows works fine)
connected to
B) a CPython Script Executor step, installable from Tools -> Marketplace. It assumes Python, pandas and NumPy are installed. Script settings:
Configure, Input Frames: (previous step), df
Python Script, Manual Python Script: df.replace(to_replace= "\[|\]|'|\"", value='', regex=True, inplace=True)
Output Fields, Output Fields: (column names, string type)
Running this throws:
2016/07/25 10:45:21 - CPython Script Executor.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Unexpected error
2016/07/25 10:45:21 - CPython Script Executor.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : java.lang.NullPointerException
2016/07/25 10:45:21 - CPython Script Executor.0 - at org.pentaho.python.PythonSession.rowsToPythonDataFrame(PythonSession.java:389)
2016/07/25 10:45:21 - CPython Script Executor.0 - at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.rowsToPyDataFrame(CPythonScriptExecutor.java:458)
2016/07/25 10:45:21 - CPython Script Executor.0 - at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.processBatch(CPythonScriptExecutor.java:276)
2016/07/25 10:45:21 - CPython Script Executor.0 - at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.processRow(CPythonScriptExecutor.java:243)
2016/07/25 10:45:21 - CPython Script Executor.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2016/07/25 10:45:21 - CPython Script Executor.0 - at java.lang.Thread.run(Unknown Source)
2016/07/25 10:45:21 - CPython Script Executor.0 - Finished processing (I=0, O=0, R=3, W=0, U=0, E=1)
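My earlier debugging suggested that processRow might not be able to determine the row metadata type, but the stack trace does not say so directly: the NullPointerException is raised in PythonSession.rowsToPythonDataFrame, while the incoming rows are being converted to a data frame.
Question: What is the proper way to set up a scripting step that reads in a .csv file without throwing a NullPointerException?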
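Since the trace fails in the row-to-data-frame conversion rather than in my script, I also checked the regex on its own outside PDI. This is only a minimal sketch using the standard `re` module instead of pandas; the helper name `strip_brackets_and_quotes` and the sample values are made up for illustration and are not from the IBM .csv:

```python
import re

# Same pattern as in the Manual Python Script: matches [, ], ' and "
PATTERN = r"\[|\]|'|\""


def strip_brackets_and_quotes(value):
    """Remove square brackets and quote characters from a string."""
    return re.sub(PATTERN, "", value)


# Hypothetical sample values, only to exercise the pattern
samples = ["['abc']", '"def"', "plain"]
cleaned = [strip_brackets_and_quotes(s) for s in samples]
print(cleaned)
```

The pattern behaves as intended here, so the script itself does not look like the cause.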
EDIT - The error also reproduces with the original source materials. See: Mark Hall, CPython Scripting, and the example .zip file.
EDIT 1 - Running python in the Command Prompt gives:
C:\Users\*****>python
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
I am not running Anaconda (too heavyweight), and my version of Python is .1 ahead, which might be affecting things. I would hope the plugin is agnostic to the Python version, though, unless the Python binary interface (ABI) changed or something similar.
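To confirm that this interpreter can actually see pandas and NumPy, something like the following can be run from the same Command Prompt. It is a rough sketch: `check_environment` is my own helper name, and I am assuming (not certain) that the plugin launches the same `python` that is found on the PATH:

```python
import importlib.util
import sys


def check_environment(modules=("pandas", "numpy")):
    """Report the interpreter in use and whether each module can be found."""
    report = {
        "python": sys.version.split()[0],   # e.g. the version string shown above
        "executable": sys.executable,       # path of the interpreter being run
    }
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        report[name] = importlib.util.find_spec(name) is not None
    return report


print(check_environment())
```

If either module shows up as False, the step's assumption that Python, pandas and NumPy are installed does not hold for this interpreter.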
EDIT 2 - I can't attach the Kettle file but the example files from Mark Hall above reproduce the same issue I encountered.