Questions tagged [pdi]

PDI (Pentaho Data Integration), also known as Kettle, provides extraction, transformation, and loading (ETL) capabilities.

PDI (Pentaho Data Integration), formerly known as Kettle, is a data integration project. It delivers powerful Extraction, Transformation, and Loading (ETL) capabilities, using a groundbreaking, metadata-driven approach.

440 questions
6
votes
4 answers

How to reorder the columns with a fixed-column CSV input in Pentaho

Scenario: I have created a transformation to load data into a table from a CSV file, and the CSV file has the following columns: Customer_Id, Company_Id, Employee_Name. But the user may give an input file with the columns in a random order, as…
yuvi
  • 564
  • 5
  • 12
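The scenario in the question above comes down to mapping columns by header name rather than by position. The following is a minimal Python sketch of that idea, not PDI configuration. It assumes the input file has a header row; the function name `read_in_fixed_order` and the sample values are made up for illustration.

```python
import csv
import io

# Target column order, taken from the question.
TARGET_ORDER = ["Customer_Id", "Company_Id", "Employee_Name"]


def read_in_fixed_order(text, order=TARGET_ORDER):
    """Return rows as lists in `order`, whatever the header order of the input is."""
    reader = csv.DictReader(io.StringIO(text))
    return [[row[name] for name in order] for row in reader]


# Hypothetical input with the columns shuffled.
sample = "Employee_Name,Customer_Id,Company_Id\nAlice,1,10\nBob,2,20\n"
rows = read_in_fixed_order(sample)
print(rows)
```

Because each value is looked up by its header, the output order stays fixed even when the input order changes.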
6
votes
3 answers

Newly inserted or updated row count in pentaho data integration

I am new to Pentaho Data Integration; I need to integrate one database to another location as an ETL job. I want to count the number of inserts/updates during the ETL job and insert that count into another table. Can anyone help me with this?
Sreejith
  • 587
  • 1
  • 9
  • 18
5
votes
2 answers

"Table exists" step in Pentaho Kettle

I want to use the "Table exists" step to check whether a certain table exists and, if not, create one. The transformation I have created (in order to copy data from the input database into the output database): Table Input -----> Table exists ----> Table output The…
Hello lad
  • 17,344
  • 46
  • 127
  • 200
4
votes
1 answer

Pentaho DI - JSON Nested File Output

I have a requirement where I need to fetch records from multiple tables. The primary table has a one-to-many relationship to the other tables. My data source is an Oracle DB, which has the specified tables. One is called Student, the other one is…
4
votes
1 answer

Remove special characters using Pentaho - Replace in String

I wanted to remove special characters like ! @ # $ % ^ * _ = + | \ } { [ ] : ; < > ? / from a string field. I used the "Replace in String" step and enabled the "use RegEx" option. However, I do not know the right syntax that I will put in "Search" to…
M. Loyyy
  • 45
  • 1
  • 1
  • 4
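For the question above, one common approach is a regular-expression character class that lists every character to drop. Below is a minimal Python sketch of that idea, assuming the goal is to delete exactly the characters listed in the question; the helper name `strip_special` and the sample string are hypothetical. The same character class should carry over to Java-style regex, which PDI is built on, but that has not been checked against the "Replace in String" step itself.

```python
import re

# Character class covering the characters listed in the question:
# ! @ # $ % ^ * _ = + | \ } { [ ] : ; < > ? /
# The backslash and the square brackets are escaped inside the class.
SPECIAL = re.compile(r"[!@#$%^*_=+|\\}{\[\]:;<>?/]")


def strip_special(value):
    """Remove every listed special character from `value`."""
    return SPECIAL.sub("", value)


# Hypothetical sample value.
cleaned = strip_special("a!b@c#d$ [x] {y} 1/2\\3")
print(cleaned)
```

Spaces, letters, and digits are left untouched because they are not part of the class.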
4
votes
1 answer

Pentaho DI can't connect to AWS Redshift - Amazon Error 100021

Referring to Pentaho's documentation, we should be using RedshiftJDBC4.jar instead of version 4.1. I have downloaded the driver and placed it in the lib/ directory. I relaunched spoon.sh and noticed it is no longer complaining about not being able to find the…
4
votes
1 answer

Pentaho ETL : Database Join vs Table Input

I need to write a database table's data to a text file with some transformation. There are two steps available to retrieve the data from the table, namely Table input and Database join. I don't see much difference between them except the "outer join?"…
Jeet
  • 1,006
  • 1
  • 14
  • 25
4
votes
2 answers

Get rows from result step and Get Variables usage in Pentaho Data Integration

Can anyone provide an example of both the Get Variables and Get rows from result steps in Pentaho Data Integration? I have a job with two transformations. The first transformation takes sample input and generates sample output, and at the end I have copy rows to…
syed tabrez
  • 59
  • 1
  • 5
4
votes
2 answers

How to remove column in Pentaho Data Integration?

I am using PDI/Kettle. I know it is possible to add new columns by specifying them in fields. Is it possible to remove deprecated input columns coming from the previous step in Modified Javascript Step with Spoon?
Hello lad
  • 17,344
  • 46
  • 127
  • 200
3
votes
1 answer

How to concatenate a string array field to show as one string if it's an array

Following are two documents in my collection: { "_id": { "$oid": "5f48e358d43721376c397f54" }, "heading": "this is heading", "tags": ["tag1","tag2","tag3"], "categories": ["projA", "projectA2"], "content": ["This",…
ghengalala
  • 109
  • 9
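The transformation asked about above can be pictured as: join the value if it is an array, otherwise pass it through. The following is a minimal Python sketch of that logic, not a PDI or MongoDB solution. The helper name `as_single_string` and the `", "` separator are assumptions, and the sample document uses only the fields that are fully visible in the excerpt.

```python
def as_single_string(value, sep=", "):
    """Join list values into one string; return other values as a string unchanged."""
    if isinstance(value, list):
        return sep.join(str(item) for item in value)
    return str(value)


# Fields copied from the excerpt; the truncated "content" field is left out.
doc = {
    "heading": "this is heading",
    "tags": ["tag1", "tag2", "tag3"],
    "categories": ["projA", "projectA2"],
}

flat = {key: as_single_string(value) for key, value in doc.items()}
print(flat)
```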
3
votes
1 answer

PDI Kettle - How to specify ObjectId for query match in MongoDB Output

Using PDI Kettle MongoDB Output, I am trying to update a MongoDB document by querying the _id (ObjectId) field. If I pass the _id variable as a String to the MongoDB Output step, the final query that gets created looks like Modifier update…
Mahesh
  • 123
  • 2
  • 2
  • 7
3
votes
1 answer

Pentaho Execute SQL Statements variable conversion to null

I am using PDI to delete and insert some data from a DB. I have the following issue. I create two variables called START_DATE and END_DATE that are used to select the data that will be deleted from my DB. I am able to get them and run my…
Diego Serrano
  • 846
  • 2
  • 15
  • 34
3
votes
1 answer

Using variable names for a database connection in Pentaho Kettle

I am working on PDI Kettle. Can we define a variable and use it in a database connection name, so that if I need to change the connections in multiple transformations in the future, I would just change the variable value in the kettle properties file?
Akn
  • 35
  • 6
3
votes
1 answer

Excel writer date format error

I have source data (.csv) with a Date column in the format "dd/mm/yyyy", and when I try to output this date column into Excel writer, it gives me an error. Also, the Excel writer step doesn't have the same format built-in in the content…
Deepesh
  • 820
  • 1
  • 14
  • 32
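As background to the question above, a date string in day/month/year order has to be parsed with a mask that matches it before it can be written out as a real date. Below is a minimal Python sketch of that parsing step only; the sample value `25/12/2020` is made up. Note that in Java-style date masks, which PDI is understood to use, lowercase `mm` stands for minutes and `MM` for months, so the equivalent mask there would likely be `dd/MM/yyyy`; that is an assumption about the cause and is not confirmed by the excerpt.

```python
from datetime import datetime

# Hypothetical sample value in day/month/year order.
raw = "25/12/2020"

# %d = day of month, %m = month, %Y = four-digit year.
parsed = datetime.strptime(raw, "%d/%m/%Y")

# Re-emit in an unambiguous ISO form to confirm the fields were read correctly.
iso = parsed.strftime("%Y-%m-%d")
print(iso)
```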
3
votes
2 answers

How to use a trust store with Pentaho Data Integration / REST Client?

I'm using Pentaho Data Integration (Kettle). My goal is to consume an existing REST API with HTTPS. To achieve this, I use the REST Client provided by PDI. On my local environment, I'm able to consume this API. However, once I push it on the…
nan0
  • 31
  • 1
  • 1
  • 7