I am copying all files from one Hadoop cluster to another using distcp. The first attempt copied all the data, but on the second backup run I get a DuplicateFileException ("Records would cause duplicates"). See the log stack below for details.
I tried:

```
./bin/hadoop distcp -update hdfs://XXXXXXXXX:8020/* hdfs://XXXXXXXXX:9000/
bin/hadoop distcp -p -log -i -overwrite hdfs://XXXXXXXXX:8020/* hdfs://XXXXXXXXX:9000/
```
```
ERROR tools.DistCp: Duplicate files in input path:
org.apache.hadoop.tools.CopyListing$DuplicateFileException: File hdfs://192.168.1.22:8020/original/10000 Sales Records and hdfs://192.168.1.22:8020/sample/10000 Sales Records would cause duplicates. Aborting
	at org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:160)
	at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:91)
	at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
	at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:84)
	at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:382)
	at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:181)
```
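My guess is that the `/*` glob makes distcp treat `/original` and `/sample` as separate source paths, and with `-update`/`-overwrite` the contents of each source directory are copied straight into the target root, so the two identically named `10000 Sales Records` files map to the same destination path. The workaround I am considering is a sketch only (assuming it is acceptable for the target to mirror the `/original` and `/sample` subdirectories, and the hostnames are placeholders): drop the glob and copy the source root as a single path.

```shell
# Single source path: relative paths (original/..., sample/...) are
# preserved under the target, so the identically named files should no
# longer collide. XXXXXXXXX hostnames are placeholders.
./bin/hadoop distcp -update hdfs://XXXXXXXXX:8020/ hdfs://XXXXXXXXX:9000/
```

Is that the right way to run incremental backups with distcp, or is there a flag that handles this case?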