3

I have a requirement to query a column in Hive and based on the output I want to query it in MySQL DB.

The flow is something like below:

  1. Query Hive table, get the list of IDs
  2. Use the list of ID's to query MySQL table. The query would be like select a,b,c from x where id in (list from hive)

How do I do this?

ydaetskcoR
  • 53,225
  • 8
  • 158
  • 177
Rakesh Shrama
  • 249
  • 1
  • 5
  • 16

1 Answers1

2

You just need to make your query using a tHiveInput component to retrieve the Id's and then you've got two options, the easy way or the hard (and better) way:

For the easy way you can then run from the tHiveInput component to a tFlowToIterate and then iterate out to a tMySqlInput component. Your query in the tMySqlInput component would then look something like:

"SELECT a, b, c
FROM x
WHERE id = '" + ((Integer)globalMap.get("row1.id")) + "'"

row1 here represents the row running into your tFlowToIterate. You can also get this variable by hitting ctrl+space and selecting the tFlowToIterate_x.id option.

You could then connect a tBufferOutput component to your tMySqlInput component to collect all of the iterations and then read it back in with a tBufferInput component for further processing.

This is obviously a simple concatenation that is then not the best way for efficiency and of course open to SQL injection. But it's a rough and ready way.

Your other (harder but potentially better) option is to use a parameterised query using a tMySqlRow component (not currently supported in the tMySqlInput component as of Talend 5.4).

This answer goes into some detail about parameterised queries in Talend as part of a solution to another problem (upserting efficiently). As mentioned in that answer, there's more detail on parameterised queries on Gabriele Baldassarre's blog.

Community
  • 1
  • 1
ydaetskcoR
  • 53,225
  • 8
  • 158
  • 177
  • Thank you Sir. But I found a better way, we can use something like group_concat to get the inlist value as a single value. Assign the value to a parameter and then use the context parameter in the mysql query. :) – Rakesh Shrama Aug 06 '14 at 18:32