10

I know there is a known issue with delimiters when saving a table to csv (or other text file) in Hive. So I'm wondering if you guys can help me get around that.

I have an existing table (Table A) and I would like to save it in csv format to hdfs. From reading other responses I believe I would have to first create an external table (but I'm not sure how the whole thing would look).

Can anyone help?

Laura
  • Personally, I'd just use `insert overwrite directory` and have a `\001` delimited file. But if you *must* have a csv, I think you're best off piping the query to a local .csv and then `hadoop fs -put`-ing it into hdfs, which is described here http://stackoverflow.com/questions/18129581/how-do-i-output-the-results-of-a-hiveql-query-to-csv – o-90 May 14 '15 at 03:14
  • I want to avoid saving such a large file to the local directory if possible. My issue is that I then need to load the file into R and to do this you normally specify a delimiter, not sure how this would work (or if it would work) with an invisible character. – Laura May 14 '15 at 14:10
  • I load data from hdfs to spark and specify the delimiter as `"\001"`. I've never had (a reason) to import a file from HDFS to R but I'd assume it can take any delimiter as long as it is specified correctly. – o-90 May 15 '15 at 03:02
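A minimal sketch of the `insert overwrite directory` route mentioned in the comments above (the output path and table name are placeholders, not from the question). With no ROW FORMAT clause, Hive writes text files delimited by the default `\001` (Ctrl-A) character straight into HDFS:

    -- writes \001-delimited text files for Table A directly into an HDFS directory
    INSERT OVERWRITE DIRECTORY '/user/laura/table_a_export'
    SELECT * FROM table_a;

The result is one or more files (named something like 000000_0) under that directory, which R or Spark can read as long as the \001 separator is passed explicitly.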

4 Answers

12

Try this in the Hive shell:

INSERT OVERWRITE LOCAL DIRECTORY '/path/to/hive/csv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM hivetablename;

Change '/path/to/hive/csv' to the local directory where you want the CSV files written, and hivetablename to the Hive table you want to export in CSV format.

Rajesh N
  • 3
    I think that she wants it saved on hdfs, not on the local fs. – Kim Moritz Dec 01 '17 at 07:08
  • 2
    @KimMoritz In that case, excluding `LOCAL` and specifying the HDFS path should do the trick. – Andrea Jul 13 '18 at 09:19
  • I got this error: `Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create local target directory /export` – didi Apr 14 '21 at 07:31
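Picking up on Andrea's comment, a sketch of the same statement without `LOCAL` so the comma-delimited files land directly in HDFS (the path is a placeholder; a ROW FORMAT clause on `INSERT OVERWRITE DIRECTORY` is assumed to be available, which requires Hive 0.11 or later):

    INSERT OVERWRITE DIRECTORY '/user/laura/hive_csv'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    SELECT * FROM hivetablename;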
4

This three-step process worked fine for me:

  1. in HIVE, create a new temporary table stored as textfile

    CREATE TABLE temp_table( id INT, name STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE;
    
  2. also in HIVE, write the source data into the temporary table

    INSERT OVERWRITE TABLE temp_table
    SELECT id, name FROM source_table;
    
  3. From the command line, copy the Hive table's files from their location in HDFS to your local file system - inspection should reveal the files as CSV (rename accordingly if desired)

    hdfs dfs -copyToLocal /apps/hive/warehouse/temp_table/* /tmp/local_dir/
    

If you run the command SHOW CREATE TABLE temp_table in HIVE, it will tell you the exact location of the table in HDFS, e.g.

| LOCATION                                                        |
|   'hdfs://hadoop_cluster/apps/hive/warehouse/temp_table'        |
Hendrik F
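One note on step 3: since the question asks for the file in HDFS, the temp table's comma-delimited files are already in HDFS after step 2, so instead of copying to the local file system you could copy them to another HDFS directory (a sketch; the target path is a placeholder):

    hdfs dfs -mkdir -p /user/laura/table_a_csv
    hdfs dfs -cp /apps/hive/warehouse/temp_table/* /user/laura/table_a_csv/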
1

For an external table in Hive, you can follow the steps below:

  1. Create an external table in Hive

    CREATE EXTERNAL TABLE external_table( number INT, name STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/user/hive/external/mytable/';

  2. Load the data file from the local file system into the table's HDFS location

    hadoop fs -put /home/user1/Desktop/filename.csv /user/hive/external/mytable/

The above two steps can solve your problem.

Farooque
  • where am I inputting the existing table that I have created? I see this as creating a blank table and putting it in the local directory. Can you explain more? – Laura May 14 '15 at 18:26
  • 2
    The `LOCATION '/user/hive/external/mytable/'` value is an HDFS directory, NOT a local directory. When moving the file, i.e. `hadoop fs -put /home/user1/Desktop/filename.csv /user/hive/external/mytable/`, it goes from the local directory to the HDFS location directory – Farooque May 14 '15 at 23:28
  • My table is already stored in HDFS and can be accessed with a /user/hive... path – Laura May 18 '15 at 16:20
  • If you can access the CSV data with a SELECT query and open the file correctly with `hadoop fs -cat 'file-path'`, then just move the CSV file to whatever location you want. – Farooque May 18 '15 at 16:33
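On Laura's follow-up about the existing table: once the external table above exists, it can also be filled straight from the table that is already in Hive instead of `put`-ting a local CSV (a sketch; `source_table_a` is a placeholder for the existing table, and the selected columns must match the external table's schema):

    INSERT OVERWRITE TABLE external_table
    SELECT number, name FROM source_table_a;

The comma-delimited files then appear under the table's LOCATION, /user/hive/external/mytable/, in HDFS.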
0

Three steps you may follow:

  1. Create a temporary table

    DROP TABLE IF EXISTS Test;
    CREATE TABLE Test
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    AS SELECT * FROM ABC;  -- or any whole query statement

  2. Make a CSV from the created Hive table

    hive -e 'set hive.cli.print.header=true; select * from Test' | sed 's/[\t]/,/g' > /home/hadoop/Mycsv.csv

  3. Put the created CSV file into S3 or whatever location you want

    hdfs dfs -put -f /home/hadoop/Mycsv.csv 's3://bucketName/'

Hari_pb