Questions tagged [httpfs]

HttpFS is a server that provides a REST HTTP gateway supporting all HDFS file system operations (read and write). It is interoperable with the webhdfs REST HTTP API.

HttpFS can be used to transfer data between clusters running different versions of Hadoop (overcoming RPC versioning issues), for example using Hadoop DistCp.

HttpFS can be used to access data in HDFS on a cluster behind a firewall (the HttpFS server acts as a gateway and is the only system allowed to cross the firewall into the cluster).

HttpFS can be used to access data in HDFS using HTTP utilities (such as curl and wget) and HTTP libraries from languages other than Java (such as Perl).

The webhdfs client FileSystem implementation can be used to access HttpFS using the Hadoop filesystem command-line tool (hadoop fs) as well as from Java applications using the Hadoop FileSystem Java API.

HttpFS has built-in security supporting Hadoop pseudo-authentication, HTTP SPNEGO Kerberos, and other pluggable authentication mechanisms. It also provides Hadoop proxy-user support.

Official website: https://hadoop.apache.org/docs/r2.4.1/hadoop-hdfs-httpfs/
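Because HttpFS speaks the same REST dialect as WebHDFS, a request is just an HTTP call against the `/webhdfs/v1` prefix plus the file path and an `op` query parameter. A minimal sketch of building such request URLs (the hostname, user, and paths below are made-up examples; 14000 is the HttpFS default port):

```python
from urllib.parse import urlencode

def httpfs_url(host, port, path, op, user, **params):
    """Build a WebHDFS-compatible request URL for an HttpFS gateway."""
    query = urlencode({"op": op, "user.name": user, **params})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"

# Read a file (OPEN) and list a directory (LISTSTATUS); the resulting URL
# can be fetched with any HTTP client, e.g. curl, wget, or urllib.request.
print(httpfs_url("httpfs.example.com", 14000, "/user/alice/data.csv", "OPEN", "alice"))
print(httpfs_url("httpfs.example.com", 14000, "/user/alice", "LISTSTATUS", "alice"))
```

With pseudo-authentication the `user.name` parameter identifies the caller; against a Kerberos-secured gateway you would instead negotiate SPNEGO (e.g. `curl --negotiate -u :`).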

15 questions
12 votes · 2 answers

httpfs error Operation category READ is not supported in state standby

I am working on Apache Hadoop 2.7.1 and I have a cluster that consists of 3 nodes (nn1, nn2, dn1); nn1 is the dfs.default.name, so it is the master name node. I have installed HttpFS and started it, of course after restarting all the services. When nn1 is…
oula alshiekh
4 votes · 0 answers

How to stream downloads using Scalaj-Http and Hadoop HttpFs

My question is how to use a Buffered stream when using Scalaj-Http. I have written the following code which is a complete working example that will download a file from Hadoop HDFS using HttpFS. My goal is to handle very large files and this will…
John Hanley
4 votes · 2 answers

Failover is not fired when the active name node crashes

I am using Apache Hadoop 2.7.1 on a cluster that consists of three nodes: nn1 (master name node), nn2 (second name node), dn1 (data node). I have configured high availability and a nameservice, and ZooKeeper is working on all three nodes and is started…
oula alshiekh
2 votes · 1 answer

F# - Using HttpFs.Client and Hopac: How do I get a response code, response headers and response cookies?

I am using F# with HttpFs.Client and Hopac. I am able to get the response body and the value of each node of a JSON/XML response by using code like: [] let ``Test a Create user API``() = let response = Request.createUrl Post…
1 vote · 1 answer

WebHDFS FileNotFoundException REST API

I am posting this question as a continuation of the post "webhdfs rest api throwing file not found exception". I have an image file I would like to OPEN through the WebHDFS REST API. The file exists in HDFS and has appropriate permissions; I can LISTSTATUS…
1 vote · 1 answer

Authentication in cURL from Windows to Hadoop HttpFS secured with Kerberos

I want to load data from my local Windows machine to HDFS using HTTPFS with curl. The Hadoop cluster is secured with Kerberos. How do I manage to get the authentication done? When trying the following statement... curl -k --negotiate -u : -i -X PUT…
Rob
1 vote · 1 answer

HttpFs benefit over high availability and nameservice

I am using Apache Hadoop 2.7.1 on a cluster that consists of three nodes: nn1 (master name node), nn2 (second name node), dn1 (data node). We know that if we configure high availability in this cluster we will have two main nodes, one is active and…
oula alshiekh
0 votes · 1 answer

F# with Http.fs - not able to execute GraphQL APIs

I don't see any good documentation about how to execute GraphQL APIs using F# with Http.fs. Kindly share if you have the correct syntax available, or point me to the correct documentation. I was trying with the Star Wars API given here:…
0 votes · 1 answer

F# - How can we validate the whole schema of an API response using HttpFs.Client or Hopac?

I have a test where, after getting a response, I would like to validate the entire schema of the response (not individual response node/value comparison). Sample test: [] let howtoValidateSchema () = let request =…
0 votes · 1 answer

What is the most efficient solution for hundreds of download requests per minute to an HDFS folder

In my company, we have a continuous learning process. Every 5-10 minutes we create a new model in HDFS. A model is a folder of several files: model ~ 1G (binary file), model metadata 1K (text file), model features 1K (csv file) ... On the…
Julias
0 votes · 1 answer

WebHDFS/HttpFS in CDH via Docker

I'm using the Cloudera quickstart image via Docker Toolbox (Docker for Win10 Home). The CDH version is 5.7. I'm trying to connect to HDFS with WebHDFS/HttpFS; I'm not sure if the port is 50070 or 14000. Here is the list of ports in CDH 5.7. 1) I actually don't…
Moti Shaul
0 votes · 0 answers

NullPointerException when trying to access a directory using HttpFS

I have a cluster running Hadoop 2.6.0-cdh5.4.1. I want to create a file inside a directory using the webhdfs REST API. I have 2 directories called directory1 and directory2, both in /. They both have the same permissions (711), owner and group. The…
anegru
0 votes · 1 answer

HttpFS: create or append a file with HttpClient

I use HttpClient to create or append a file with the HttpFS REST component. Example curl command (working): curl -i -X PUT -s --negotiate -u : "http://httpfsServer:14000/webhdfs/v1/user/a_app_kpi/tmp/testAppend.txt?op=CREATE&data=true" --header…
V.HL
0 votes · 1 answer

How to implement a rename function for an HTTP-based file server?

I have to implement an HTTP server with some file-server capabilities. I'd already coded HTTP HEAD, GET, PUT, and DELETE requests. Next, I need to implement something like RENAME or MOVE to change the name of a file which is already stored on…
Joe
0 votes · 1 answer

HttpFS for Apache Hadoop: download

I am using Apache Hadoop 2.7.1 on the CentOS 7 operating system. To set up HttpFS, this link suggests installing HttpFS. I do not find any binary available for it. Is there an alternative method to configure HttpFS for Hadoop?
oula alshiekh