
According to the documentation, when we create an EXTERNAL table in Hive and then DROP the table, the metadata is removed, but the data that was loaded into the HDFS directory /user/hive/warehouse/<table_name> still exists. Is that correct?

I have two questions:

1. How do you clean up the files in /user/hive/warehouse/<table_name>?
2. When I recreated the table with files of the same name but different data, the Hive warehouse files did not get updated. Should they be? (I ask because I am not sure whether this is a set-up issue or expected behavior.)

E B

2 Answers


Hive doesn't store (manage) any data files for EXTERNAL tables in the warehouse directory. It only stores the metadata for these tables in the Metastore.

This is the main difference between Hive internal (managed) and external tables: an internal table owns its data, while an external table only knows where the data lives.
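To make the difference concrete, here is a minimal sketch; the table names and the /user/data/demo location are made up for illustration, and it assumes the default warehouse location:

hive -e "CREATE EXTERNAL TABLE demo_ext (id INT) LOCATION '/user/data/demo';"
hive -e "DROP TABLE demo_ext;"
hadoop fs -ls /user/data/demo                      # data files are still there

hive -e "CREATE TABLE demo_managed (id INT);"
hive -e "DROP TABLE demo_managed;"
hadoop fs -ls /user/hive/warehouse/demo_managed    # gone along with the table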


Sergey Khudyakov

To delete EXTERNAL table data, you have to delete it manually from the HDFS location; Hive only deletes the metadata in this case. To delete the HDFS files, you can simply use the rm command:

hadoop fs -rm /location_of_data

and use -rm -R if you want to delete recursively.
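Putting it together, dropping the table and then cleaning up its data might look like the sketch below; my_external_table and /location_of_data are placeholders, and -skipTrash bypasses the HDFS trash, so use it with care:

hive -e "DROP TABLE my_external_table;"        # removes only the metadata
hadoop fs -rm -R /location_of_data             # removes the data files (moved to .Trash)
hadoop fs -rm -R -skipTrash /location_of_data  # or delete permanently, bypassing .Trash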

Kenzzz