Questions tagged [sqoop]

Sqoop is an open source connectivity framework that facilitates data transfer between relational database management systems (RDBMS) and HDFS. Sqoop uses MapReduce programs to import and export data, so imports and exports run in parallel.

You can use Sqoop to import data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS), transform the data in Hadoop MapReduce, and then export the data back into an RDBMS.
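The import, transform, export round trip described above can be sketched as a pair of commands. This is a minimal dry-run sketch, not a definitive recipe: the JDBC URL, user name, password-file path, table names, and HDFS directories are all hypothetical placeholders, and the `run` helper is an illustrative wrapper that only prints each command unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Hypothetical connection details (assumptions, not from the original text).
JDBC_URL="jdbc:mysql://db.example.com:3306/sales"
DB_USER="etl_user"
PASSWORD_FILE="/user/etl/.db.password"

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: import the (hypothetical) orders table from the RDBMS into HDFS.
import_orders() {
  run sqoop import \
    --connect "$JDBC_URL" \
    --username "$DB_USER" \
    --password-file "$PASSWORD_FILE" \
    --table orders \
    --target-dir /data/orders \
    --num-mappers 4
}

# Step 2 (not shown): a MapReduce job reads /data/orders and writes
# its results to /data/order_summary.

# Step 3: export the transformed files back into an existing RDBMS table.
export_summary() {
  run sqoop export \
    --connect "$JDBC_URL" \
    --username "$DB_USER" \
    --password-file "$PASSWORD_FILE" \
    --table order_summary \
    --export-dir /data/order_summary \
    --num-mappers 4
}

import_orders
export_summary
```

Here `--num-mappers` sets the number of parallel map tasks, and `sqoop export` expects the target table to already exist in the database.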

Available Sqoop commands:

  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import mainframe datasets to HDFS
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  version            Display version information
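
A few of the commands listed above are handy for inspecting a database before importing from it. The sketch below is illustrative only: the JDBC URL, user name, and `orders` table are hypothetical, `-P` prompts for the password on the console, and the `run` helper prints each command instead of executing it unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Hypothetical connection details (assumptions for illustration).
JDBC_URL="jdbc:mysql://db.example.com:3306/sales"
DB_USER="etl_user"

# Print the command by default; set DRY_RUN=0 to actually execute it.
# Note: the dry-run output does not re-quote arguments that contain spaces.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# List the tables visible in the database named in the JDBC URL.
list_tables() {
  run sqoop list-tables --connect "$JDBC_URL" --username "$DB_USER" -P
}

# Run an ad hoc SQL statement and display the results on the console.
preview_rows() {
  run sqoop eval --connect "$JDBC_URL" --username "$DB_USER" -P \
    --query "SELECT * FROM orders LIMIT 5"
}

list_tables
preview_rows
```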

Sqoop has been a Top-Level Apache project since March of 2012.

2610 questions
27
votes
6 answers

How to copy data from one HDFS to another HDFS?

I have two HDFS setups and want to copy (not migrate or move) some tables from HDFS1 to HDFS2. How can I copy data from one HDFS to another? Is it possible via Sqoop or another command-line tool?
sharp
  • 2,140
  • 9
  • 43
  • 80
22
votes
5 answers

How to use Sqoop in Java Program?

I know how to use Sqoop from the command line, but I don't know how to call a Sqoop command from a Java program. Can anyone share some example code?
Pradeep Bhadani
  • 4,435
  • 6
  • 29
  • 48
22
votes
7 answers

What's the difference between Flume and Sqoop?

Both Flume and Sqoop are meant for data movement, so what is the difference between them? Under what conditions should I use Flume or Sqoop?
Cacheing
  • 3,431
  • 20
  • 46
  • 65
20
votes
6 answers

Sqoop import having SQL query with where clause

sqoop import --connect jdbc:teradata://192.168.xx.xx/DBS_PORT=1025,DATABASE=ds_tbl_db --driver com.teradata.jdbc.TeraDriver --username dbc --password dbc --query 'select * from reason where id>20' --hive-import --hive-table reason_hive…
Dev
  • 13,492
  • 19
  • 81
  • 174
16
votes
7 answers

Sqoop Import --password-file function not working properly in sqoop 1.4.4

I am using hadoop-1.2.1 and Sqoop 1.4.4. I am trying to run the following command: sqoop import --connect jdbc:mysql://IP:3306/database_name --table clients --target-dir /data/clients --username root --password-file /sqoop.password -m…
Kanav Narula
  • 337
  • 1
  • 2
  • 11
15
votes
5 answers

what are the following commands in sqoop?

Can anyone tell me what --split-by and the boundary query are used for in Sqoop? sqoop import --connect jdbc:mysql://localhost/my --username user --password 1234 --query 'select * from table where id=5 AND $CONDITIONS' --split-by table.id --target-dir…
NJ_315
  • 1,863
  • 7
  • 22
  • 30
14
votes
2 answers

PostgreSQL - FATAL: Ident authentication failed for user

I've created a simple table called employees in the Postgres database mytestdb. I would like to import this table into HDFS. bin/sqoop import --connect 'jdbc:postgresql://127.0.0.1/mytestdb' --username user -P --table employees --target-dir…
usr
  • 782
  • 1
  • 7
  • 25
14
votes
7 answers

Apache Spark-SQL vs Sqoop benchmarking while transferring data from RDBMS to hdfs

I am working on a use case where I have to transfer data from an RDBMS to HDFS. We benchmarked this case with Sqoop and found that we can transfer around 20 GB of data in 6-7 minutes, whereas when I try the same with Spark…
Amitabh Ranjan
  • 1,500
  • 3
  • 23
  • 39
12
votes
1 answer

Import data from HDFS to HBase (cdh3u2)

I have installed Hadoop and HBase cdh3u2. In Hadoop I have a file at the path /home/file.txt. It has data like one,1 two,2 three,3. I want to import this file into HBase. The first field should be parsed as a String, and the 2nd field parsed as…
Nageswaran
  • 7,481
  • 14
  • 55
  • 74
11
votes
6 answers

Difference between --warehouse-dir and --target-dir commands in sqoop

I could not understand the difference between the --warehouse-dir and --target-dir options in Sqoop. It would help if someone could explain with small examples. Thanks
sree
  • 1,870
  • 1
  • 21
  • 36
11
votes
2 answers

How do I access HBase table in Hive & vice-versa?

As a developer, I've created an HBase table for our project by importing data from an existing MySQL table using a Sqoop job. The problem is that our data analyst team is familiar with MySQL syntax, which means they can query a Hive table easily. For them, I need to…
Abhishek
  • 6,912
  • 14
  • 59
  • 85
11
votes
8 answers

Sqoop Incremental Import

I need advice on Sqoop incremental imports. Say I have a customer with Policy 1 on Day 1, and I imported those records into HDFS on Day 1 and I see them in part files. On Day 2, the same customer adds Policy 2 and after the incremental import sqoop run,…
user3501743
  • 141
  • 1
  • 1
  • 7
10
votes
1 answer

Sqoop - Import Job failed

I am trying to import a table of 32 million records from SQL Server to Hive via Sqoop. The connection to SQL Server is successful, but the MapReduce job does not execute successfully. It gives the following error: 18/07/19 04:00:11 INFO client.RMProxy:…
user1584253
  • 975
  • 2
  • 18
  • 55
10
votes
5 answers

Sqoop import without primary key in RDBMS

Can I import data from an RDBMS table that has no primary key into Hive using Sqoop? If so, can you please give the sqoop import command? I tried a general sqoop import command, but it failed.
KM Prak
  • 127
  • 1
  • 1
  • 6
10
votes
4 answers

SQOOP SQLSERVER Failed to load driver " appropriate connection manager is not being set"

I downloaded sqljdbc4.jar. I'm invoking sqoop like so from the folder (where the jar is stored): sqoop list-tables --driver com.microsoft.jdbc.sqlserver.SQLServerDriver --connect jdbc:sqlserver://localhost:1433;user=me;password=myPassword;…
hba
  • 7,406
  • 10
  • 63
  • 105