
I have a six-node cluster and I want to install the HAWQ database and PXF on it. My cluster looks like this:

Node1 - NameNode, ResourceManager, HiveMetastore, HiveClient
Node2 - SNameNode, NodeManager
Node3 - DataNode, NodeManager
Node4 - DataNode, NodeManager
Node5 - DataNode, NodeManager
Node6 - HiveClient

On which nodes do I have to install the HAWQ Master, HAWQ Segments, and PXF? Is it possible to do this on only the first three nodes, or do I have to install HAWQ Segments and PXF on every node?

Mrgr8m4

1 Answer


I would install the Master on Node6, the Standby on Node2, and the Segments with PXF on Node3, Node4, and Node5. You want a Segment on every DataNode.
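For a manual (non-Ambari) install, that layout can be captured in plain hostfiles, one hostname per line, which HAWQ's cluster utilities (e.g. `hawq ssh-exkeys -f <hostfile>`) accept. A minimal sketch, using the placeholder node names from the question rather than real FQDNs:

```shell
# Hosts that will run HAWQ Segments (one per DataNode)
cat > hostfile_segments <<EOF
Node3
Node4
Node5
EOF

# All hosts in the HAWQ cluster: Master (Node6), Standby (Node2), Segments
cat > hostfile_all <<EOF
Node6
Node2
Node3
Node4
Node5
EOF
```

You would then point utilities like `hawq ssh-exkeys` at these files when preparing the hosts.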

I would also avoid putting the Master or Standby on the same node as Ambari. Ambari uses a PostgreSQL database that listens on port 5432, which is the same default port used by HAWQ. You can reconfigure either Ambari or HAWQ to make them work on the same node, but it is just easier to keep them separate.
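A quick way to spot that conflict before picking a Master host is to check whether anything is already listening on 5432. A minimal sketch (assumes `ss` from iproute2 is available):

```shell
# Print whether port 5432 (the PostgreSQL/HAWQ default) is already taken
# on this host, e.g. by Ambari's embedded PostgreSQL instance.
if ss -lnt 2>/dev/null | grep -q ':5432 '; then
  echo "port 5432 in use"
else
  echo "port 5432 free"
fi
```

If it is in use, that host is a poor choice for the HAWQ Master unless you change one side's port.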

Jon Roberts
  • In the documentation I read something like this: "Note: PXF must be installed on the HDFS NameNode, the Standby NameNode (if configured), and on each HDFS DataNode. A HAWQ segment must be installed on each HDFS DataNode". Does that mean I have to install PXF on Node1 and Node2 too? Otherwise it won't work? – Mrgr8m4 Jan 03 '18 at 10:29