I'm using Hive to process CSV files. I've stored the CSV files in HDFS and want to create tables from them.
I use the following commands:
create external table if not exists csv_table (dummy STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs://localhost:9000/user/hive'
TBLPROPERTIES ("skip.header.line.count"="1");
LOAD DATA INPATH '/CsvData/csv_table.csv' OVERWRITE INTO TABLE csv_table;
So the file under /CsvData will be moved into /user/hive. That makes sense.
But what if I want to create another table?
create external table if not exists csv_table2 (dummy STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs://localhost:9000/user/hive'
TBLPROPERTIES ("skip.header.line.count"="1");
LOAD DATA INPATH '/CsvData/csv_table2.csv' OVERWRITE INTO TABLE csv_table2;
This raises an exception, apparently because the directory is not empty:
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Directory hdfs://localhost:9000/user/hive could not be cleaned up.
I find this hard to understand. Does it mean I can store only one file under one directory? To store multiple files, do I have to create one directory for every file?
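To make the one-directory-per-file idea concrete, here is a minimal sketch of the layout I have in mind. The subdirectories /user/hive/csv_table and /user/hive/csv_table2 are hypothetical names I made up for illustration; everything else is copied from the commands above. As far as I understand, OVERWRITE clears the target table's location before moving the new file in, so giving each table its own location should keep the two loads from touching each other, but I have not confirmed that this is the intended approach.

```sql
-- Sketch only: each external table points at its own subdirectory.
-- The two subdirectory names below are assumptions, not existing paths.
CREATE EXTERNAL TABLE IF NOT EXISTS csv_table (dummy STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs://localhost:9000/user/hive/csv_table'   -- hypothetical per-table directory
TBLPROPERTIES ("skip.header.line.count"="1");

CREATE EXTERNAL TABLE IF NOT EXISTS csv_table2 (dummy STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs://localhost:9000/user/hive/csv_table2'  -- hypothetical per-table directory
TBLPROPERTIES ("skip.header.line.count"="1");

-- Each load moves its file into the directory of its own table only.
LOAD DATA INPATH '/CsvData/csv_table.csv' OVERWRITE INTO TABLE csv_table;
LOAD DATA INPATH '/CsvData/csv_table2.csv' OVERWRITE INTO TABLE csv_table2;
```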
Is it possible to store all the files together?