I copied some files from one directory to another using
hadoop distcp -Dmapreduce.job.queuename=adhoc /user/comverse/data/$CURRENT_DATE_NO_DASH_*/*rcr.gz /apps/hive/warehouse/arstel.db/fair_usage/fct_evkuzmin04/file_rcr/
I stopped the script before it finished, and it left behind a lot of .distcp.tmp.attempt files
as well as files that had finished moving in the dst directory.
Now I want to clean the dst directory. After running
hadoop fs -rm -skipTrash /apps/hive/warehouse/arstel.db/fair_usage/fct_evkuzmin04/file_mta/*
most of the files were deleted, but some remained (at least that's what HUE shows). The strange thing is, every time I run hadoop fs -rm -skipTrash,
the number of remaining files reported by HUE changes, sometimes up and sometimes down.
I tried
hadoop fs -ls /apps/hive/warehouse/arstel.db/fair_usage/fct_evkuzmin04/file_mta/
and saw that some of the files that should have been deleted were still there. Then I ran
hadoop fs -rm -skipTrash /apps/hive/warehouse/arstel.db/fair_usage/fct_evkuzmin04/file_mta/*
a dozen more times, and there were always more files to delete (there still are). What is happening?
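As a sanity check, counting directly from HDFS is more reliable than the HUE view, which may cache or paginate. Something like this (a sketch, assuming the same path as above):

```shell
# Summarize the directory: prints DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME.
hadoop fs -count /apps/hive/warehouse/arstel.db/fair_usage/fct_evkuzmin04/file_mta/

# Or count entries from a plain listing (the first line of -ls is a "Found N items" header).
hadoop fs -ls /apps/hive/warehouse/arstel.db/fair_usage/fct_evkuzmin04/file_mta/ | wc -l
```

Running this a few times in a row would have shown the file count actually climbing, independent of HUE.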
ALSO
Each time I refresh the page in HUE, the number of files grows. HALP.
EDIT
It seems that stopping distcp on the command line doesn't actually kill the underlying MapReduce job, which kept writing new files to the destination. That was the reason.
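In case anyone else hits this: the leftover job has to be killed on the cluster itself. Roughly like this, on a YARN cluster (the application id below is a made-up placeholder; use the one shown by the list command):

```shell
# List running applications to find the orphaned distcp job.
yarn application -list -appStates RUNNING

# Kill it by application id (placeholder id, substitute your own).
yarn application -kill application_1234567890123_0042
```

Once the job is gone, the delete commands behave normally because nothing is recreating files behind them.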