How can I read a Hive resource that was added with ADD FILE from within a UDF?
For example:
hive> add file /users/temp/key.jks;
Is it possible to read this file from a UDF written in Java? What path should the UDF use to open it?
Thanks, David
Once a resource is added to a session with the ADD command, Hive queries can refer to it by name (in map/reduce/transform clauses), and the resource is available locally at execution time across the entire Hadoop cluster. Hive uses Hadoop's Distributed Cache to distribute the added resources to all machines in the cluster at query execution time. See the Hive documentation: HiveResources
Hive also has the in_file(string str, string filename) function, which returns true if the string str appears as an entire line in filename. You can use the in_file source code as an example: GenericUDFInFile.java
A few methods from the source code:
private BufferedReader getReaderFor(String filePath) throws HiveException {
    try {
        Path fullFilePath = FileSystems.getDefault().getPath(filePath);
        Path fileName = fullFilePath.getFileName();
        // Look for the bare file name in the current working directory first.
        if (Files.exists(fileName)) {
            return Files.newBufferedReader(fileName, Charset.defaultCharset());
        }
        // Otherwise fall back to the path exactly as it was given.
        else if (Files.exists(fullFilePath)) {
            return Files.newBufferedReader(fullFilePath, Charset.defaultCharset());
        }
        else {
            throw new HiveException("Could not find \"" + fileName + "\" or \"" + fullFilePath + "\" in IN_FILE() UDF.");
        }
    }
    catch (IOException exception) {
        throw new HiveException(exception);
    }
}
private void loadFromFile(String filePath) throws HiveException {
    // "set" is a field of the UDF class; it holds every line of the file.
    set = new HashSet<String>();
    BufferedReader reader = getReaderFor(filePath);
    try {
        String line;
        while ((line = reader.readLine()) != null) {
            set.add(line);
        }
    } catch (Exception e) {
        throw new HiveException(e);
    }
    finally {
        // Close the reader whether or not reading succeeded.
        IOUtils.closeStream(reader);
    }
}
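Applied to the key.jks case from the question, the same lookup order could be sketched as below. This is only an illustration that uses the JDK and no Hive classes: the AddedResourceLocator class and its resolve and open methods are hypothetical names, not part of Hive, and the sketch assumes (as in_file does) that the added file is visible either under its bare name in the working directory or under the original path.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not a Hive API): mirrors the lookup order of in_file.
final class AddedResourceLocator {
    private AddedResourceLocator() {}

    // Returns the first existing candidate: the bare file name relative to the
    // current working directory, then the path exactly as it was given.
    static Path resolve(String addedPath) throws FileNotFoundException {
        Path fullPath = Paths.get(addedPath);
        Path fileName = fullPath.getFileName();
        if (fileName != null && Files.exists(fileName)) {
            return fileName;
        }
        if (Files.exists(fullPath)) {
            return fullPath;
        }
        throw new FileNotFoundException(
            "Could not find \"" + fileName + "\" or \"" + fullPath + "\"");
    }

    // Opens the resolved file as a binary stream; the caller must close it.
    static InputStream open(String addedPath) throws IOException {
        return Files.newInputStream(resolve(addedPath));
    }
}
```

Inside a UDF, something like AddedResourceLocator.open("/users/temp/key.jks") could then be called once during initialization, and the resulting stream passed to whatever needs the bytes (for a .jks file, probably java.security.KeyStore.load). A binary stream is used here rather than a BufferedReader because a keystore is not a text file.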