0

Pentaho ETL. , how do I specify schema name in DB connection? there is no place to type schema names. I need it as I am running ETL against multiple schemas located on one db. enter image description here

I need to collect data from multiple schemas in one DB, and I need to know how to specify them in DB connector so later I can replace them with variables so I can change connections dynamically.

Andrey
  • 1,629
  • 13
  • 37
  • 65
  • It's not part of the connection settings since you can use the same connection to the same database to access all schemas within that database. – Bergi Jul 06 '22 at 00:40
  • I need to collect data from multiple schemas in one DB, and I need to know how to specify them in DB connector so later I can replace it with variables so I can change connections dynamically. – Andrey Jul 06 '22 at 00:50
  • You don't specify them in the connector. You specify the schema in the query itself. – Bergi Jul 06 '22 at 00:58
  • I cant as my objects inside of procedures don't do reference to schema. for example, I have 3 schema, I need to run it only on schema 1 , I cant rewrite all my code it's 100+ K. lines of legacy code usually I am setting working schema. I want to use the same technique here, and also be able to pass schema name into connection so it will run only on specific schema – Andrey Jul 06 '22 at 01:20
  • Using JDBC [appears to allow specifying a `currentSchema` in a connection string](https://stackoverflow.com/q/4168689/1048572), but I have no idea how to put such options into pentaho. The alternative is running `SET search_path = …` before your queries, or altering the `search_path` for the user that you use for connecting. – Bergi Jul 06 '22 at 02:42
  • I am having the same issue with SET search_path = … as for some reason, it cannot assign value to variable from select result with Copy rows to result (table imput(select all needed schemas from DB) ->Copy rows to result - > and on next transformation script it must assign to loval varibles but. it doesnt ) – Andrey Jul 06 '22 at 03:35

1 Answers1

1

Using the currentSchema variable added to the connection string suggested by @Bergi, you can add this option in the Options tab of the database connection:

Options tab in Database Connection

But I don't know if you are going to be able to use variables afterwards to change this dynamically. In my version (8.2) of PDI you can't, maybe in more modern versions you are going to be able, or they have added metadata injection capabilities to the DB connection (although I doubt that).

You might be able to create different connections for each schema and pass the connection name using a variable or using metadata injection, that could be another solution.

Ana GH
  • 1,397
  • 1
  • 9
  • 19
  • not really what i need, as i have multiple schemas and i cannot hardcode it, i need to pass it from the select. – Andrey Jul 08 '22 at 04:16
  • Can't you use metadata injection passing the query and the name of the DB Connection as metadata? – Ana GH Jul 08 '22 at 10:55