
I’m trying to run a Virtual Apache Hadoop cluster on my laptop using Vagrant and Cloudera Manager following these instructions:

http://blog.cloudera.com/blog/2014/06/how-to-install-a-virtual-apache-hadoop-cluster-with-vagrant-and-cloudera-manager/

I’m using a Dell Precision M4800 Workstation Laptop with 16GB of RAM which runs an Ubuntu 16.04 LTS (Xenial Xerus) OS.

I successfully installed VirtualBox and Vagrant, but I can’t connect to the nodes of my cluster. Here is what I did:

  1. Configure the proxy settings for CLI tools:

    $ export http_proxy="http://user:password@proxy_server:port"
    $ export https_proxy="https://user:password@proxy_server:port"
    
  2. Go into the project directory.

  3. Update the hosts file on each active machine:

    $ vagrant hostmanager
    
  4. Create and configure the guest machines according to the Vagrantfile:

    $ vagrant up
    
  5. Browse to http://vm-cluster-node1:7180, which fails with a “server not found” error.
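A "server not found" error in the browser usually means the name did not resolve, so a first check is whether the hosts file really contains an entry for the node. The sketch below is only an illustration: `host_in_file` is a hypothetical helper name, and it assumes the hostmanager plugin is set up to write the entries into the laptop's own `/etc/hosts`.

```shell
#!/bin/sh
# host_in_file NAME FILE
# Succeeds (exit 0) if NAME appears as a hostname or alias on a
# non-comment line of the hosts-style FILE, fails (exit 1) otherwise.
host_in_file() {
    awk -v h="$1" '
        /^[[:space:]]*#/ { next }          # skip full-line comments
        {
            for (i = 2; i <= NF; i++) {    # field 1 is the IP address
                if ($i ~ /^#/) break       # ignore trailing comments
                if ($i == h) found = 1
            }
        }
        END { exit !found }
    ' "$2"
}

# Example call on the laptop (not run automatically):
# host_in_file vm-cluster-node1 /etc/hosts && echo "entry present" || echo "entry missing"
```

If the entry is present and the page still does not load, the name resolves and the problem is more likely that nothing is listening on port 7180 inside the VM.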

Since I am behind a corporate proxy I installed the vagrant proxyconf plugin, as suggested here: How to use vagrant in a proxy environment?

and then I changed my Vagrantfile, adding the following lines:

if Vagrant.has_plugin?("vagrant-proxyconf")
  config.proxy.http     = "http://user:password@proxy_server:port" 
  config.proxy.https    = "https://user:password@proxy_server:port"
  config.proxy.no_proxy = "localhost,127.0.0.1"
end

The problem now is that after running vagrant up I get the following error:

==> master: Failed to fetch http://archive.cloudera.com/cm5/ubuntu/precise/amd64/cm/pool/contrib/e/enterprise/cloudera-manager-daemons_5.8.2-1.cm582.p0.17~precise-cm5_all.deb  Connection failed
==> master: Failed to fetch http://archive.cloudera.com/cm5/ubuntu/precise/amd64/cm/pool/contrib/o/oracle-j2sdk1.7/oracle-j2sdk1.7_1.7.0+update67-1_amd64.deb  Connection failed
==> master: E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
==> master: cloudera-scm-server-db: unrecognized service
==> master: cloudera-scm-server-db: unrecognized service
==> master: cloudera-scm-server: unrecognized service
The SSH command responded with a non-zero exit status. Vagrant assumes 
that this means the command failed. The output for this command should be 
in the log above. Please read the output to determine what went wrong.

What am I doing wrong?

Cecilia
  • I have slightly changed the proxy configuration in *Vagrantfile* and now after *vagrant up* I don't get any error message, but I still can't connect to http://vm-cluster-node1:7180 – Cecilia Oct 20 '16 at 15:25
  • A couple of things you can check: is it up and running on the VM? After you run vagrant hostmanager, is the /etc/hosts file updated correctly? – Frederic Henri Oct 20 '16 at 16:13
  • @FrédéricHenri yes, after I run vagrant hostmanager the /etc/hosts file is updated correctly, and I checked that the VM is up and running using the VirtualBox GUI – Cecilia Oct 21 '16 at 08:43
  • I think the problem is with this line in *Vagrantfile*: apt-get -q -y --force-yes install oracle-j2sdk1.7 cloudera-manager-server-db cloudera-manager-server cloudera-manager-daemons – Cecilia Oct 21 '16 at 08:45
  • If I configure my phone as a portable hotspot and I use its network to do the process it works, so it's a problem related to the network proxy for sure. Can anyone help me? – Cecilia Oct 21 '16 at 15:25
  • You can stage these rpms locally and install them as well. Seems like you aren't able to reach archive.cloudera.com to download the binaries with your current setup. – wazy Oct 27 '16 at 20:28
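The staging idea from the last comment could look roughly like the sketch below. It is not a tested procedure: the two URLs are copied from the "Failed to fetch" lines in the question, `deb_filename` and `stage_debs` are hypothetical helper names, and the `./debs` directory is an assumed location inside the project directory. The download has to run on a network that can reach archive.cloudera.com.

```shell
#!/bin/sh
# Package URLs copied from the "Failed to fetch" lines in the vagrant up log.
DEB_URLS='http://archive.cloudera.com/cm5/ubuntu/precise/amd64/cm/pool/contrib/e/enterprise/cloudera-manager-daemons_5.8.2-1.cm582.p0.17~precise-cm5_all.deb
http://archive.cloudera.com/cm5/ubuntu/precise/amd64/cm/pool/contrib/o/oracle-j2sdk1.7/oracle-j2sdk1.7_1.7.0+update67-1_amd64.deb'

# Print the file name part of a URL (everything after the last slash).
deb_filename() {
    printf '%s\n' "${1##*/}"
}

# Download every package into DEST (default ./debs), skipping files
# that are already there. Stops at the first failed download.
stage_debs() {
    dest=${1:-./debs}
    mkdir -p "$dest" || return 1
    for url in $DEB_URLS; do
        file="$dest/$(deb_filename "$url")"
        [ -f "$file" ] && continue
        curl -fL -o "$file" "$url" || return 1
    done
}

# Example call from the project directory (not run automatically):
# stage_debs ./debs
```

Assuming the Vagrantfile keeps Vagrant's default synced folder, the files would then be visible inside the guest under `/vagrant/debs`, where something like `sudo dpkg -i /vagrant/debs/*.deb` followed by `vagrant provision` on the host might complete the install. Other packages may turn out to be blocked as well, in which case their URLs would need to be added to the list.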

1 Answer


It turned out that it wasn't a proxy configuration problem (that configuration was correct) but a corporate firewall problem: the firewall allows only certain packages to be downloaded.
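One way to tell a proxy misconfiguration from a selective firewall block is to request several URLs through the same proxy and compare the results: if some come back fine and specific package URLs do not, the proxy settings themselves are probably working. The sketch below is only a rough aid; `http_status` and `classify_status` are hypothetical helper names, and the mapping of status codes to labels is an assumption, since a 403 can also come from the origin server.

```shell
#!/bin/sh
# Print the HTTP status code curl gets for a HEAD request to URL.
# curl picks up the http_proxy / https_proxy environment variables;
# it prints 000 when no HTTP response was received at all.
http_status() {
    curl -s -o /dev/null -I -m 20 -w '%{http_code}' "$1"
}

# Map a status code to a rough label (assumed interpretation).
classify_status() {
    case "$1" in
        000)     echo "no-response" ;;  # connection failed or timed out
        2??|3??) echo "reachable" ;;
        403|407) echo "refused" ;;      # 407 = proxy authentication required
        *)       echo "other" ;;
    esac
}

# Example call (not run automatically):
# classify_status "$(http_status http://archive.cloudera.com/cm5/)"
```

Running the example against a repository index and against one of the failing `.deb` URLs, with the proxy variables exported, should show whether only particular downloads are affected.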

I have "solved" the problem by installing Cloudera Manager using my cellphone as a hotspot.

Once Cloudera Manager and the Hadoop stack are installed on your cluster, you can use the Cloudera Manager web GUI to manage your cluster in the corporate environment.

The only problem is that some important cluster features, such as clock synchronization, don't work properly in the corporate environment. In particular, I found that my company firewall blocks NTP (the problem is described in more detail here: https://askubuntu.com/questions/429306/ntpdate-no-server-suitable-for-synchronization-found).

Cecilia