Questions tagged [kettle]

Kettle is the code name for Pentaho Data Integration (PDI) Community Edition. It is an open-source, GUI-based ETL (Extraction, Transformation, and Loading) tool that uses a metadata-driven approach.

https://help.pentaho.com/Documentation/8.2/Products/Data_Integration

1387 questions
22
votes
1 answer

Using Pentaho Kettle, how do I load multiple tables from a single table while keeping referential integrity?

I need to load data from a single file with 100,000+ records into multiple tables on MySQL while maintaining the relationships defined in the file/tables, meaning the relationships already match. The solution should work on the latest version of MySQL,…
blunders
  • 3,619
  • 10
  • 43
  • 65
20
votes
2 answers

Using Pentaho Kettle, how do I automatically retry REST requests which fail due to connection hiccups?

How can we make Pentaho retry REST requests on connection errors? We have a Pentaho BI system which, among numerous data sources, is querying a particular REST API for over 20k query variations each run. Predictably, on most runs a few of these…
ms-tg
  • 2,688
  • 23
  • 18
15
votes
4 answers

Kettle / Pentaho Data Integration - unable to create a Database Connection (XulException: java.lang.reflect.InvocationTargetException)

Having finally got kettle to start and not hang, I still cannot use it to much avail, as when I try to create a new Database Connection (after creating a new Transformation) I get this error: org.pentaho.ui.xul.XulException:…
Blew my stack
  • 261
  • 1
  • 4
  • 9
15
votes
2 answers

Pass DB connection parameters to a Kettle (a.k.a. PDI) Table Input step dynamically from Excel

I have a requirement such that whenever I run my Kettle job, the database connection parameters must be taken dynamically from an Excel source on each run. Say I have an Excel file with column names: HostName, Username, Database, Password. I want to…
Ritesh
  • 237
  • 1
  • 4
  • 13
14
votes
3 answers

Pentaho kettle: how to set up tests for transformations/jobs?

I've been using Pentaho Kettle for quite a while, and previously the transformations and jobs I've made (using Spoon) have been quite simple: load from a DB, rename, etc., then input the data to another DB. But now I've been doing transformations that do a bit…
hannesh
  • 502
  • 10
  • 15
11
votes
5 answers

Maven Dependency for PDI (Pentaho Kettle) JAR files

I have written Java code to execute my transformations and jobs, and I have manually added all the JAR files present in the data-integration/lib folder to my classpath, and everything is working fine. Now I want to mavenize my project and am looking for…
sun_dare
  • 1,146
  • 2
  • 13
  • 33
11
votes
2 answers

Rhino ETL opinions vs Kettle and SSIS

I am considering a tool for an ETL solution that has high daily demand and requires heavy business logic processing. I've tried Kettle and SSIS so far, and also want to test Rhino ETL. I don't care for the visual flow structure of both Kettle…
Pedro
  • 11,514
  • 5
  • 27
  • 40
9
votes
2 answers

Use JSON Input step to process uneven data

I'm trying to process the following with a JSON Input step: {"address":[ {"AddressId":"1_1","Street":"A Street"}, {"AddressId":"1_101","Street":"Another Street"}, {"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"}, …
rsilva4
  • 1,915
  • 1
  • 23
  • 39
9
votes
1 answer

Pentaho Kettle Java 11 Roadmap

Currently Pentaho Kettle (v.9.1) officially supports only Java 8. This is a problem for us, since we are maintaining a plugin that needs Java 11 because an essential library requires it. Does anyone have details on the roadmap for the…
9
votes
4 answers

Duplicating a job in Pentaho Data Integration for different connections

I've generated a job via the Copy Tables wizard in the Spoon UI that copies some tables from an Oracle database source to a SQL Server one, and made some changes to the job as well. Now I want to duplicate the same job (same tables and same…
mounaim
  • 1,132
  • 7
  • 29
  • 56
8
votes
2 answers

What is the difference between a Job and a Transformation?

When creating new objects in Spoon there are two possibilities: Job and Transformation. They've got a different set of possible components (although with some level of overlap) and the XML that is generated looks very similar. What's the difference…
nelsonda
  • 1,170
  • 1
  • 10
  • 21
8
votes
3 answers

Pentaho: How to dynamically add Field (= Column) to OutputRow?

I would like to dynamically add fields (or new columns) to the resulting output row in Kettle. After spending hours reading through forum posts and the not-so-well-done scripting documentation, I wondered if Stack Overflow would be of any help.
chris polzer
  • 3,219
  • 3
  • 28
  • 44
7
votes
2 answers

Pentaho-kettle: Need to create ETL Jobs dynamically based on user input

In my application, users can specify the format of their files. Based on user input, we dynamically create an SSIS package. http://lakshmik.blogspot.com/2005/05...eate-ssis.html The dynamically created SSIS package is used for processing the user's files. We…
Arnav
  • 1,008
  • 1
  • 10
  • 12
7
votes
5 answers

Unable to start blueprint container for bundle pdi-dataservice-server-plugin due to unresolved dependencies

I am using a Windows batch file to call a Pentaho Data Integration job. Intermittently, the job hangs indefinitely. The error message in the Pentaho logs is as below: 06:43:37,951 ERROR [BlueprintContainerImpl] Unable to start blueprint container…
Sarang Manjrekar
  • 1,839
  • 5
  • 31
  • 61
7
votes
2 answers

Pentaho Data Integration: Error Handling

I'm building out an ETL process with Pentaho Data Integration (CE) and I'm trying to operationalize my Transformations and Jobs so that they'll be able to be monitored. Specifically, I want to be able to catch any errors and then send them to an…
jonnysamps
  • 1,067
  • 1
  • 14
  • 20