I want to import data using Sqoop, but I don't want to use shell commands. How can I do this with the Java API? The Sqoop version is 1.4.6, and I am using Scala with SBT. Also, which dependencies do I need?
1 Answer
I needed to use Sqoop to import data from MySQL into Hive using Scala inside a Cloudera CDH 5.7 cluster, so I started by following this answer.
The problem was that it did not pick up the right configuration when it was executed on the server.
Running Sqoop manually looked something like this:
sqoop import --hive-import --connect "jdbc:mysql://host/db" \
--username "username" --password "password" --table "viewName" \
--hive-table "outputTable" -m 1 --check-column "dateColumnName" \
--last-value "lastMinDate" --incremental append
So in the end I chose to run it as an external process using Scala's sys.process.ProcessBuilder. Running it this way does not require any SBT dependency. The runner ended up being implemented like this:
import scala.sys.process._

// Note: `log` is assumed to be a logger already in scope
// (one exposing debug/info/warning/error methods).
def executeSqoop(connectionString: String, username: String, password: String,
                 viewName: String, outputTable: String,
                 dateColumnName: String, lastMinDate: String): Unit = {
  // Log every line the process writes to stdout and stderr respectively.
  // Sqoop writes its own log output to stderr, so route it by level.
  val sqoopLogger = ProcessLogger(
    normalLine => log.debug(normalLine),
    errorLine => errorLine match {
      case line if line.contains("ERROR") => log.error(line)
      case line if line.contains("WARN")  => log.warning(line)
      case line if line.contains("INFO")  => log.info(line)
      case line                           => log.debug(line)
    }
  )
  // Build the Sqoop command; every parameter and value must be a separate String in the Seq
  val command = Seq("sqoop", "import", "--hive-import",
    "--connect", connectionString,
    "--username", username,
    "--password", password,
    "--table", viewName,
    "--hive-table", outputTable,
    "-m", "1",
    "--check-column", dateColumnName,
    "--last-value", lastMinDate,
    "--incremental", "append")
  // `!` blocks until the process exits and returns its exit code
  val result = command ! sqoopLogger
  if (result != 0) {
    log.error("The Sqoop process did not finish successfully")
  } else {
    log.info("The Sqoop process finished successfully")
  }
}
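If you want to try this approach without a cluster or a logger in scope, the same idea can be sketched in a self-contained form. This is only a minimal sketch, not part of my original runner: the object name SqoopCommandSketch and the helpers buildImportArgs and run are hypothetical names chosen for illustration. It separates building the argument list from running the process, and collects stdout and stderr into buffers instead of logging them.

```scala
import scala.collection.mutable.ListBuffer
import scala.sys.process._

// Hypothetical helper object, for illustration only.
object SqoopCommandSketch {

  // Builds the same incremental Hive import command as above.
  // Each flag and each value is its own element, so no shell quoting is involved.
  def buildImportArgs(connectionString: String, username: String, password: String,
                      table: String, hiveTable: String,
                      checkColumn: String, lastValue: String): Seq[String] =
    Seq("sqoop", "import", "--hive-import",
      "--connect", connectionString,
      "--username", username,
      "--password", password,
      "--table", table,
      "--hive-table", hiveTable,
      "-m", "1",
      "--check-column", checkColumn,
      "--last-value", lastValue,
      "--incremental", "append")

  // Runs any command as an external process and returns
  // (exit code, stdout lines, stderr lines).
  // Throws an IOException if the executable cannot be found.
  def run(command: Seq[String]): (Int, Seq[String], Seq[String]) = {
    val out = ListBuffer.empty[String]
    val err = ListBuffer.empty[String]
    val logger = ProcessLogger(
      (line: String) => { out += line; () },
      (line: String) => { err += line; () }
    )
    val exitCode = command ! logger // blocks until the process exits
    (exitCode, out.toList, err.toList)
  }
}
```

Because run accepts any Seq[String], it can be exercised with a harmless command such as Seq("echo", "hello") on a Unix-like machine before pointing it at SqoopCommandSketch.buildImportArgs(...) on the cluster.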

Camilo Sampedro