I'm trying to pass in an array to a Hive UDF via collect_set:
SELECT ..., collect_set(...) FROM ...;
My Hive UDF should take this array and append the first letter of each array element to an output string:
public class MyUDF extends UDF {
    public String evaluate(String[] array) {
        String output = "";
        // Check for valid argument
        if (array == null) return output;
        try {
            // Add first character of every array element to output string
            for (int i = 0; i < array.length; i++) {
                output += array[i].charAt(0);
                // If there is another array element after this one, append a comma delimiter
                if (i + 1 < array.length) output += ",";
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
            System.exit(1);
        }
        return output;
    }
}
But this is the error I get when I try to run it:
ADD JAR ./list_builder.jar;
CREATE TEMPORARY FUNCTION build_list as 'MyCustomUDF.MyUDF';
SELECT ..., build_list(collect_set(description)) FROM ...;
...
FAILED: SemanticException [Error 10014]: Line 142:21 Wrong arguments 'description': No matching method for class MyCustomUDF.MyUDF with (array<string>). Possible choices: _FUNC_(struct<>)
I've tried changing String[] to ArrayList and List, but I'm still hitting the same error.
Note: The output of collect_set is something like ["L-ADD", "P-OAN", "P-OAH"], so I'm expecting an output from my UDF like L,P,P.
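To make the intended behaviour concrete, here is a minimal plain-Java sketch of the transformation I'm after, with no Hive classes involved. The class name FirstLetters and the method firstLetters are made up for illustration, and skipping null or empty elements is my own assumption. This only shows the expected input and output; it says nothing about which parameter type Hive will accept for array<string>.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class FirstLetters {
    // Hypothetical helper (not part of Hive): joins the first character
    // of each element with commas.
    public static String firstLetters(List<String> items) {
        if (items == null) return "";
        StringJoiner joined = new StringJoiner(",");
        for (String item : items) {
            // Assumption: skip null or empty elements rather than failing.
            if (item == null || item.isEmpty()) continue;
            joined.add(String.valueOf(item.charAt(0)));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("L-ADD", "P-OAN", "P-OAH");
        System.out.println(firstLetters(sample)); // prints L,P,P
    }
}
```

Using a StringJoiner avoids the "is there another element" check that the loop in my UDF does by hand.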
Any ideas?
Thanks.