Questions tagged [cloudera]

Cloudera Inc. is a Palo Alto-based enterprise software company which provides Apache Hadoop-based software and services.

Cloudera, the commercial Hadoop company, develops and distributes Hadoop, the open source software that powers the data processing engines of the world’s largest and most popular websites.

Cloudera's Distribution including Apache Hadoop (CDH) is a free package built from the powerful, flexible, scalable Apache Hadoop software. To help you learn about Hadoop and how to use it, Cloudera offers public and private training, certification and online courseware.

2533 questions
75
votes
12 answers

Building Hadoop with Eclipse / Maven - Missing artifact jdk.tools:jdk.tools:jar:1.6

I am trying to import Cloudera's org.apache.hadoop:hadoop-client:2.0.0-cdh4.0.0 from the cdh4 Maven repo into a Maven project in Eclipse 3.81 (m2e plugin), with Oracle's JDK 1.7.0_05 on Win7, using org.apache.hadoop…
jvataman
  • 1,357
  • 1
  • 12
  • 13
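A widely cited workaround for this error is that the Hadoop POMs declare a system-scoped dependency on jdk.tools, which Maven resolves from the JDK's tools.jar; when m2e cannot find it, the jar can be installed into the local repository by hand. A minimal sketch, assuming JAVA_HOME points at a full JDK (paths are illustrative):

    # Verify that JAVA_HOME points at a JDK (tools.jar is not shipped with a JRE)
    ls "$JAVA_HOME/lib/tools.jar"

    # Install tools.jar into the local Maven repository under the missing coordinates
    mvn install:install-file \
        -DgroupId=jdk.tools -DartifactId=jdk.tools \
        -Dversion=1.6 -Dpackaging=jar \
        -Dfile="$JAVA_HOME/lib/tools.jar"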
64
votes
3 answers

How to check Spark Version

I want to check the Spark version in CDH 5.7.0. I have searched on the internet but was not able to find a clear answer. Please help.
Ironman
  • 1,330
  • 2
  • 19
  • 40
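On a CDH node the Spark version is usually visible straight from the command line; a minimal sketch, assuming the Spark client tools are on the PATH:

    # Print the Spark build banner, which includes the version string
    spark-submit --version

    # Alternatively, start spark-shell and evaluate sc.version at the Scala prompt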
55
votes
4 answers

Where are logs in Spark on YARN?

I'm new to Spark. I can now run Spark 0.9.1 on YARN (2.0.0-cdh4.2.1), but there are no logs after execution. The following command is used to run a Spark example, but the logs cannot be found in the history server as they can for a normal MapReduce…
DeepNightTwo
  • 4,809
  • 8
  • 46
  • 60
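When Spark runs on YARN, container logs live with the NodeManagers and can usually be pulled back with the yarn CLI once log aggregation is enabled (yarn.log-aggregation-enable). A minimal sketch with a placeholder application id:

    # List recent applications to find the id of the Spark run
    yarn application -list -appStates FINISHED

    # Fetch the aggregated container logs for that application
    yarn logs -applicationId application_1400000000000_0001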
32
votes
5 answers

Find port number where HDFS is listening

I want to access HDFS with fully qualified names such as: hadoop fs -ls hdfs://machine-name:8020/user I could also simply access HDFS with hadoop fs -ls /user However, I am writing test cases that should work on different distributions (HDP,…
ernesto
  • 1,899
  • 4
  • 26
  • 39
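The NameNode address (and therefore the port) is whatever fs.defaultFS resolves to, so reading it from the client configuration rather than hard-coding it is what makes tests portable across distributions. A minimal sketch:

    # Ask the HDFS client for the configured default filesystem URI
    hdfs getconf -confKey fs.defaultFS
    # e.g. hdfs://machine-name:8020

    # Older configurations may still use the deprecated key fs.default.name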
25
votes
4 answers

How to get hadoop put to create directories if they don't exist

I have been using Cloudera's Hadoop (0.20.2). With this version, if I put a file into the file system but the directory structure did not exist, it automatically created the parent directories. So for example, if I had no directories in HDFS and…
owly
  • 251
  • 1
  • 3
  • 4
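On more recent Hadoop releases, put no longer creates missing parents, so the usual pattern is to create them explicitly first; -mkdir -p is harmless if the directory already exists. A minimal sketch with illustrative paths:

    # Create the target directory tree (parents included), then upload
    hadoop fs -mkdir -p /user/example/incoming
    hadoop fs -put localfile.txt /user/example/incoming/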
21
votes
7 answers

JsonParseException: Unrecognized token 'http': was expecting ('true', 'false' or 'null')

We have the following string which is a valid JSON written to a file on HDFS. { "id":"tag:search.twitter.com,2005:564407444843950080", "objectType":"activity", "actor":{ "objectType":"person", "id":"id:twitter.com:2302910022", …
Fanooos
  • 2,718
  • 5
  • 31
  • 55
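One common cause of "Unrecognized token" errors like this is that the parser is handed a plain string (often a URL or path) rather than the JSON document itself, so a quick sanity check is to pull the file off HDFS and validate its contents independently. A minimal sketch with a placeholder path:

    # Confirm the file really contains valid JSON (not the path or URL that points to it)
    hdfs dfs -cat /user/example/activity.json | python -m json.tool > /dev/null \
        && echo "valid JSON" || echo "invalid JSON"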
21
votes
3 answers

Impala can't access all Hive tables

I am trying to query HBase data through Hive (I'm using Cloudera). I created a few Hive external tables pointing to HBase, but Cloudera's Impala doesn't have access to all of those tables. All the Hive external tables appear in the metastore manager…
Nosk
  • 753
  • 2
  • 6
  • 24
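Impala caches the Hive metastore, so tables created or altered from Hive (including HBase-backed external tables) are typically not visible until the catalog is refreshed. A minimal sketch using impala-shell; the host name and table name are illustrative:

    # Refresh Impala's view of the Hive metastore after creating external tables
    impala-shell -i impala-host -q "INVALIDATE METADATA"

    # Or refresh a single table
    impala-shell -i impala-host -q "REFRESH my_hbase_table"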
19
votes
5 answers

Issue running Spark job on YARN cluster

I want to run my spark Job in Hadoop YARN cluster mode, and I am using the following command: spark-submit --master yarn-cluster --driver-memory 1g --executor-memory 1g --executor-cores 1 …
Sachin Singh
  • 739
  • 4
  • 12
  • 29
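A spark-submit line needs a main class and an application jar in addition to the resource flags, and the YARN logs are usually the first place to look when the job fails. A minimal sketch with placeholder class, jar and application id:

    # Submit the application in yarn-cluster mode (class and jar are placeholders)
    spark-submit --master yarn-cluster \
        --driver-memory 1g --executor-memory 1g --executor-cores 1 \
        --class com.example.MyJob myjob.jar arg1

    # Inspect the driver/executor logs if the application fails
    yarn logs -applicationId application_1400000000000_0002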
19
votes
4 answers

How to find the CDH version of Hadoop

When connecting to a Hadoop cluster, how can I know which version of Hadoop the cluster is running? In particular, this is important for proper configuration of libraries when compiling and packaging Hadoop Java jobs with Maven.
Vladimir Kroz
  • 5,237
  • 6
  • 39
  • 50
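The Hadoop client prints its build string, which on CDH includes the cdh suffix needed to pick matching Maven artifacts. A minimal sketch:

    # Prints something like "Hadoop 2.6.0-cdh5.7.0" plus build details
    hadoop version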
19
votes
3 answers

RStudio Server environment variables not loading?

I'm trying to run rhadoop on Cloudera's Hadoop distro (I can't remember if it's CDH3 or 4), and am running into an issue: RStudio Server doesn't seem to recognize my global variables. In my /etc/profile.d/r.sh file, I have: export…
AI52487963
  • 1,253
  • 2
  • 17
  • 36
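RStudio Server does not source /etc/profile.d, so environment variables that rhadoop needs are often placed instead in an Renviron file that R itself reads. A minimal sketch; the Renviron.site path varies by platform (e.g. /etc/R/Renviron.site on some installs) and the variable values are illustrative:

    # Make the Hadoop variables visible to R sessions started by RStudio Server
    echo 'HADOOP_CMD=/usr/bin/hadoop' | sudo tee -a /usr/lib64/R/etc/Renviron.site
    echo 'HADOOP_STREAMING=/usr/lib/hadoop-mapreduce/hadoop-streaming.jar' | sudo tee -a /usr/lib64/R/etc/Renviron.site

    # Restart RStudio Server so new sessions pick up the change
    sudo rstudio-server restart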
18
votes
2 answers

Got InterruptedException while executing word count MapReduce job

I have installed Cloudera VM version 5.8 on my machine. When I execute the word count MapReduce job, it throws the exception below. 16/09/06 06:55:49 WARN hdfs.DFSClient: Caught exception java.lang.InterruptedException at java.lang.Object.wait(Native…
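This DFSClient warning is frequently reported as harmless on CDH 5.x quickstart VMs: it is logged while the client shuts down and does not by itself mean the job failed, so the first check is whether the job completed and wrote output. A minimal sketch with illustrative paths (the examples jar location depends on how CDH was installed):

    # Run the stock word count example and then confirm output was written
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
        wordcount /user/cloudera/input /user/cloudera/output
    hadoop fs -ls /user/cloudera/output
    hadoop fs -cat /user/cloudera/output/part-r-00000 | head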
18
votes
6 answers

Accessing Hue on Cloudera Docker QuickStart

I have installed the Cloudera QuickStart image using Docker, based on the instructions given here: https://blog.cloudera.com/blog/2015/12/docker-is-the-new-quickstart-option-for-apache-hadoop-and-cloudera/ docker run --privileged=true…
Knows Not Much
  • 30,395
  • 60
  • 197
  • 373
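Hue listens on port 8888 inside the container, so it is only reachable from the host if that port is published when the container is started. A minimal sketch based on the quickstart image described in the linked post; the image name, startup script and port list are taken from that guide and may differ on other setups:

    # Publish Hue (8888) and Cloudera Manager (7180) to the host
    docker run --hostname=quickstart.cloudera --privileged=true -t -i \
        -p 8888:8888 -p 7180:7180 \
        cloudera/quickstart /usr/bin/docker-quickstart

    # Then browse to http://localhost:8888 (or the docker-machine IP on older setups)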
18
votes
1 answer

YARN is not honouring yarn.nodemanager.resource.cpu-vcores

I am using Hadoop 2.4.0 and my system configuration is 24 cores and 96 GB RAM. I am using the following…
banjara
  • 3,800
  • 3
  • 38
  • 61
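A frequently cited explanation is that the CapacityScheduler's DefaultResourceCalculator schedules on memory only, so vcores are ignored unless the DominantResourceCalculator is configured. A minimal sketch for checking the relevant property; the config file location varies by install:

    # See which resource calculator the CapacityScheduler is using
    grep -A1 'resource-calculator' /etc/hadoop/conf/capacity-scheduler.xml

    # The property that makes vcores count, to be set through the cluster's config tooling:
    #   yarn.scheduler.capacity.resource-calculator =
    #     org.apache.hadoop.yarn.util.resource.DominantResourceCalculator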
16
votes
5 answers

Spark: check your cluster UI to ensure that workers are registered

I have a simple program in Spark: /* SimpleApp.scala */ import org.apache.spark.SparkContext import org.apache.spark.SparkContext._ import org.apache.spark.SparkConf object SimpleApp { def main(args: Array[String]) { val conf = new…
vineet sinha
  • 317
  • 1
  • 4
  • 12
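This message typically appears when no worker can satisfy the requested executor resources or when the application points at the wrong master URL, so a common first step is to submit with an explicit master and deliberately small resource requests. A minimal sketch with placeholder host, sizes and jar name:

    # Ask for modest resources so at least one registered worker can host an executor
    spark-submit --master spark://master-host:7077 \
        --executor-memory 512m --total-executor-cores 2 \
        --class SimpleApp simple-app.jar

    # The standalone master web UI (http://master-host:8080) lists registered workers and their free resources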
16
votes
4 answers

What is the correct way to start/stop spark streaming jobs in yarn?

I have been experimenting and googling for many hours, with no luck. I have a Spark Streaming app that runs fine in a local Spark cluster. Now I need to deploy it on Cloudera 5.4.4. I need to be able to start it, have it run in the background…
Kevin Pauli
  • 8,577
  • 15
  • 49
  • 70
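In cluster deploy mode the driver runs inside YARN, so the job keeps running after the submitting shell exits and is stopped through YARN rather than by killing a local process. A minimal sketch with a placeholder class, jar and application id:

    # Start: the driver lives in the ApplicationMaster, so the terminal can be closed
    spark-submit --master yarn-cluster \
        --class com.example.StreamingJob streaming-job.jar

    # Find the running application and stop it
    yarn application -list -appStates RUNNING
    yarn application -kill application_1400000000000_0003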