Highest Voted 'biginsights' Questions

25

votes

4 answers

How to get hadoop put to create directories if they don't exist

I have been using Cloudera's Hadoop (0.20.2). With this version, if I put a file into the file system and the directory structure did not exist, it automatically created the parent directories. So, for example, if I had no directories in HDFS and…

asked May 07 '14 at 16:41

owly

251
1
3
4

6

votes

2 answers

How to write the data in a DataFrame into a single .parquet file (both data & metadata in a single file) in HDFS?

How to write the data in a DataFrame into a single .parquet file (both data & metadata in a single file) in HDFS? df.show() --> 2 rows +------+--------------+----------------+ |…

apache-spark pyspark apache-spark-sql biginsights

asked Mar 15 '17 at 07:36

Shiva Ram

61
1
4

4

votes

2 answers

What is the meaning of "Hadoop distribution"

I am new to Hadoop. I recently read about the basics of Apache Hadoop, Pig, Hive, and HBase. Then I came across the term "Hadoop distribution", and the examples were Cloudera, MapR, and Hortonworks. So what is the relation of Apache Hadoop (& its ecosystem) with "Hadoop…

hadoop cloudera software-distribution mapr biginsights

asked Feb 20 '16 at 10:06

Kaushik Lele

6,439
13
50
76

4

votes

1 answer

IBM BigInsights (IBM Hadoop) vs IBM Watson

What is the difference between IBM Watson and IBM InfoSphere BigInsights (IBM Hadoop)/Streams? What does Watson bring to the table that BigInsights doesn't?

stream ibm-watson biginsights

asked Jun 16 '15 at 22:21

Amir HZ

43
4

3

votes

2 answers

PYSPARK_PYTHON works with --deploy-mode client but not --deploy-mode cluster

I'm trying to run a python script using a custom python and deploy --deploy-mode cluster on an Enterprise 4.2 cluster. [biadmin@bi4c-xxxxx-mastermanager ~]$ hive hive> CREATE TABLE pokes (foo INT, bar STRING); OK Time taken: 2.147 seconds hive>…

apache-spark pyspark ibm-cloud hadoop-yarn biginsights

asked Dec 22 '16 at 16:48

Chris Snow

23,813
35
144
309

3

votes

1 answer

Installation BigInsights 4.2

I would like to ask about installing BigInsights 4.2 on CentOS 7. As far as I know, installation is now only available via Kitematic or Docker Hub, but Kitematic is only available for Windows or Mac. If I want to install via Docker Hub I have to…

ambari biginsights

asked Dec 22 '16 at 14:00

whizzkid

33
3

3

votes

0 answers

where does ${spark.yarn.app.container.log.dir} resolve to on BigInsights on cloud?

I'm trying to configure spark streaming logging. The spark docs state to set the following property: log4j.appender.file_appender=${spark.yarn.app.container.log.dir}/spark.log Where does spark.yarn.app.container.log.dir point to on a BigInsights…

log4j ibm-cloud spark-streaming biginsights

asked Dec 12 '16 at 14:00

Chris Snow

23,813
35
144
309

3

votes

1 answer

Error in installing H2O ai R package in BigInsights cluster in Bluemix

I have a 5-node BigInsights Hadoop cluster in Bluemix. I am getting an error when I try to install the H2O ai R package in the BigInsights cluster. install.packages("h2o", type="source",…

r linux ibm-cloud h2o biginsights

asked Aug 16 '16 at 10:10

Pari Margu

209
3
10

3

votes

2 answers

Hadoop Cannot set Reducers > 1

I am using Hadoop for a university assignment and I have the code working; however, I'm running into a small issue. I am trying to set the number of reducers to 19 (which is 0.95 * capacity, as the docs suggest). However when I view my job in the task…

java hadoop mapreduce reduce biginsights

asked May 16 '12 at 11:26

Nick

900
1
10
19

2

votes

0 answers

Spark Streaming not working on IBM BigInsights

I was testing a script that extracted tweets in real time using Spark Streaming. These tweets are supposed to be loaded into the IBM BigInsights hdfs environment. The script is written in python and I used yarn for cluster management. It runs fine…

python apache-spark hadoop-yarn biginsights

asked Jun 12 '17 at 04:49

pratikbhd

21
3

2

votes

1 answer

java.lang.ClassNotFoundException: Failed to find data source: com.cloudant.spark. in IBM BigInsights cluster

I have created an IBM BigInsights service instance with a Hadoop cluster of 5 nodes (including Apache Spark). I am trying to use SparkR to connect to a Cloudant database, get some data, and do some processing. I have launched a SparkR shell (terminal) and…

ibm-cloud apache-spark-sql cloudant biginsights

asked Aug 05 '16 at 19:16

Pari Margu

209
3
10

2

votes

1 answer

spark script fails : java.net.ConnectException: Connection refused org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens

I am trying to run a simple spark script on BigInsights on Cloud: lines = sc.textFile(license_filename, 1) counts = lines.flatMap(lambda x: x.split(' ')) \ .map(lambda x: (x, 1)) \ .reduceByKey(add) \ …

ibm-cloud biginsights

asked Jun 16 '16 at 11:08

Chris Snow

23,813
35
144
309

2

votes

0 answers

Error from python worker: /usr/bin/python No module named pyspark

I am trying to run PySpark on YARN, but I receive the following error when I type any command on the console. I am able to run the Scala shell in Spark in both local and YARN mode. PySpark runs fine in local mode, but does not work in YARN mode. OS : RHEL…

python hadoop apache-spark pyspark biginsights

asked Sep 16 '15 at 12:28

akp

53
9

2

votes

0 answers

Oozie Workflow using Maven

I am trying to create an oozie application using IBM BigInsights. I believe to run the application on IBM BigInsights, the minimum folder structure should be: BiApp —> application —> application.xml —> workflow —> lib —> jar…

maven oozie maven-assembly-plugin biginsights

asked Jun 30 '15 at 20:43

KKa

408
4
19

2

votes

1 answer

How to programmatically read schema from header file in jaql?

I am trying to achieve the following in JAQL and am stuck. I have two files: a file data.tsv, which contains tab-separated data, and a file header.tsv, which contains exactly one line of tab-separated values corresponding to the "header" of file…

biginsights jaql

asked Jun 29 '15 at 12:26

Blaubaer

654
1
5
15

Questions tagged [biginsights]