175

How do I copy a file from HDFS to the local file system? There is no physical location of a file under the file, not even a directory. How can I move the files to my local machine for further validation? I tried through WinSCP.

Surya

9 Answers

287
  1. bin/hadoop fs -get /hdfs/source/path /localfs/destination/path
  2. bin/hadoop fs -copyToLocal /hdfs/source/path /localfs/destination/path
  3. Point your web browser to the HDFS Web UI (namenode_machine:50070), browse to the file you intend to copy, scroll down the page, and click on "Download the file".
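
For example, assuming a hypothetical HDFS file /user/hadoop/output/results.txt and /tmp as the local destination, options 1 and 2 would look like this (the two commands are equivalent):

bin/hadoop fs -get /user/hadoop/output/results.txt /tmp/
bin/hadoop fs -copyToLocal /user/hadoop/output/results.txt /tmp/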
Tariq
  • Perfect, Tariq, I got it. **There is no physical location of a file under the file, not even a directory.** With bin/hadoop dfs -ls /use/hadoop/myfolder I can view the file; from that I got the info **To inspect the file, you can copy it from HDFS to the local file system**, so I thought I could move them with WinSCP. – Surya Jul 24 '13 at 15:25
  • Once again I need to mention, Tariq: thanks a lot for contributing your time and knowledge. You helped a lot; this gives a lot of confidence to a newbie like me. – Surya Jul 24 '13 at 15:27
  • I see. You can actually use the hdfs cat command if you wish to see the file's content, or open the file in the web UI. This will save you from downloading the file to your local fs. You are welcome. And if you are 100% satisfied with the answers to your questions, you can mark them as accepted so that others can benefit from it, not just for this one, but in general. – Tariq Jul 24 '13 at 15:38
  • Just to add to my last comment: if it is a binary file, cat won't show you the actual content. To view the content of a binary file you can use: bin/hadoop fs -text /path/to/file – Tariq Jul 24 '13 at 16:18
  • I tried with an XML file, http://stackoverflow.com/q/17851462/2499617, and got a DiskError on the slave. Can you please advise? – Surya Jul 25 '13 at 07:26
  • It seems to be a bug (fixed). See the answer. – Tariq Jul 25 '13 at 08:13
  • Is there a possibility to specify the modification/creation date of the files you copy? – marlieg Apr 20 '15 at 16:30
40

In Hadoop 2.0,

hdfs dfs -copyToLocal <hdfs_input_file_path> <output_path>

where,

  • hdfs_input_file_path may be obtained from http://<<name_node_ip>>:50070/explorer.html

  • output_path is the local path to which the file is to be copied.

  • You may also use get in place of copyToLocal.
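
A minimal example, assuming a hypothetical HDFS file /user/hadoop/sample.txt (its path looked up in the explorer UI) and /home/user/Downloads as the local destination:

hdfs dfs -copyToLocal /user/hadoop/sample.txt /home/user/Downloads/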

Ani Menon
25

In order to copy files from HDFS to the local file system, the following command could be run:

hadoop dfs -copyToLocal <input> <output>

  • <input>: the HDFS directory path (e.g. /mydata) that you want to copy
  • <output>: the destination directory path (e.g. ~/Documents)

Update: The hadoop dfs command is deprecated in Hadoop 3.

Use hdfs dfs -copyToLocal <input> <output> instead.
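
For instance, with the hypothetical HDFS directory /mydata and ~/Documents as the destination, the Hadoop 3 form would be:

hdfs dfs -copyToLocal /mydata ~/Documents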

Hafiz Muhammad Shafiq
7

You can accomplish this in either of the following two ways:

1. hadoop fs -get <HDFS file path> <local system directory path>
2. hadoop fs -copyToLocal <HDFS file path> <local system directory path>

Ex:

My file is located at /sourcedata/mydata.txt and I want to copy it to the local file system path /user/ravi/mydata:

hadoop fs -get /sourcedata/mydata.txt /user/ravi/mydata/
Ramineni Ravi Teja
6

If your source "file" is split up among multiple files (maybe as the result of map-reduce) that live in the same directory tree, you can copy that to a local file with:

hadoop fs -getmerge /hdfs/source/dir_root/ local/destination
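
For example, a sketch assuming a hypothetical map-reduce output directory /user/hadoop/output containing part-r-00000, part-r-00001, and so on; getmerge concatenates all of them into a single local file:

hadoop fs -getmerge /user/hadoop/output/ /tmp/merged_output.txt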
Eponymous
  • This should be accepted. This is what most people are looking for, not a split-up file. – James O'Brien Aug 17 '19 at 19:18
  • This would be the best answer, to be honest. Usually HDFS files/tables are stored as parts like 0000_0, 0001_0 in a directory. `-getmerge` will merge all of those and put them into one file in the local directory. Kudos to @Eponymous – didi Apr 19 '21 at 02:48
3

This worked for me on my VM instance of Ubuntu.

hdfs dfs -copyToLocal [hadoop directory] [local directory]

Zach
1

Remember the name you gave to the file when you uploaded it with hdfs dfs -put, and use get to retrieve it. See below.

$ hdfs dfs -get /output-fileFolderName-In-hdfs

0

If you are using Docker, you have to do the following steps (a consolidated sketch follows the list):

  1. Copy the file from HDFS to the namenode container (hadoop fs -get output/part-r-00000 /out_text). "/out_text" will be stored inside the namenode container.

  2. Copy the file from the namenode container to the local disk with docker cp namenode:/out_text output.txt.

  3. output.txt will then be in your current working directory.
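
A consolidated sketch of those steps, assuming the Hadoop client runs inside a container named namenode and the HDFS file is output/part-r-00000 as above:

# run the HDFS copy inside the namenode container
docker exec namenode hadoop fs -get output/part-r-00000 /out_text
# then pull the file from the container to the host's current directory
docker cp namenode:/out_text output.txt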

Arslan
-3
bin/hadoop fs -put /localfs/destination/path /hdfs/source/path 
Brian Tompsett - 汤莱恩