Highest Voted 'partitioner' Questions

13

votes

2 answers

How to sort word count by value in Hadoop?

Hi, I want to learn how to sort the word count by value in Hadoop. I know Hadoop takes care of sorting keys, but not values. I know that to sort the values we must have a partitioner, a grouping comparator and a sort comparator, but I am a bit confused in applying…

asked Aug 23 '13 at 13:16

user1585111

1,019
6
19
35
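
For the word-count question above, the usual idea is to run a second pass in which the count becomes part of the sort key, because the framework orders keys rather than values. The following is a minimal plain-Python sketch of that idea, not Hadoop code; the function names and the sample text are illustrative assumptions.

```python
from collections import Counter


def word_counts(text):
    """First pass: the ordinary word count, keyed by word."""
    return Counter(text.lower().split())


def sort_by_count(counts):
    """Second pass: order by count (descending), ties broken alphabetically.

    The count is moved into the sort key, which is what a second job
    with a (count, word) key would achieve in a key-sorting framework.
    """
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


sample = "b a b c b a"  # hypothetical input
result = sort_by_count(word_counts(sample))
print(result)
```

In a real job the second pass would typically be a separate MapReduce step that emits the count as the key; the sketch only shows the ordering logic.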

11

votes

2 answers

Why does sortBy transformation trigger a Spark job?

As per the Spark documentation, only RDD actions can trigger a Spark job, and transformations are lazily evaluated when an action is called on them. I see the sortBy transformation function is applied immediately and it is shown as a job trigger in the…

apache-spark rdd partitioning partitioner

asked Dec 30 '16 at 22:49

Prabu Soundar Rajan

799
1
8
14

9

votes

2 answers

Difference between combiner and partitioner

I am a newbie to MapReduce and I just can't figure out the difference between the partitioner and the combiner. I know both run in the intermediate step between the map and reduce tasks and both reduce the amount of data to be processed by the reduce task.…

hadoop mapreduce partitioner

asked Jul 25 '16 at 08:26

harshit

333
1
2
13
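
To make the distinction in the question above concrete: a combiner pre-aggregates one mapper's output locally, while a partitioner only decides which reducer receives each key and does not shrink the data. The sketch below simulates both in plain Python under stated assumptions: `combine` and `partition` are hypothetical names, and `zlib.crc32` stands in for whatever hash a real framework uses.

```python
import zlib
from collections import defaultdict


def combine(map_output):
    """Combiner: locally sum the (key, count) pairs of a single mapper."""
    totals = defaultdict(int)
    for key, value in map_output:
        totals[key] += value
    return sorted(totals.items())


def partition(key, num_reducers):
    """Partitioner: map a key to a reducer index with a stable hash.

    crc32 is only a stand-in for the framework's hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


mapper_output = [("a", 1), ("b", 1), ("a", 1), ("a", 1)]  # hypothetical data
combined = combine(mapper_output)  # 4 records shrink to 2

# Routing changes where records go, not how many there are.
routed = defaultdict(list)
for key, value in combined:
    routed[partition(key, 2)].append((key, value))
```

All records with the same key get the same reducer index, which is the property a reducer relies on.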

7

votes

2 answers

In Hadoop Map-Reduce, does any class see the whole list of keys after sorting and before partitioning?

I am using Hadoop to analyze a very uneven distribution of data. Some keys have thousands of values, but most have only one. For example, network traffic associated with IP addresses would have many packets associated with a few talkative IPs and…

java hadoop mapreduce partitioning partitioner

asked Aug 24 '12 at 21:47

Jim Pivarski

5,568
2
35
47

6

votes

0 answers

Using KeyFieldBasedPartitioner and Secondary Sorting in Java Hadoop similar to Hadoop Streaming

When using Hadoop streaming, the partitioner and sorter can be set and configured like this: hadoop jar /opt/hadoop/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-streaming-2.7.1.jar \ -D mapreduce.map.output.key.field.separator=. \ -D…

java hadoop partitioner

asked Oct 24 '15 at 15:49

irondwarf

195
1
8

6

votes

2 answers

Hadoop partitioner

I want to ask about the Hadoop partitioner: is it implemented within mappers? How can I measure the performance of the default hash partitioner? Is there a better partitioner for reducing data skew? Thanks

hadoop mapreduce partitioner

asked Dec 22 '14 at 00:14

Nada Ghanem

451
6
16

6

votes

1 answer

Hadoop send record to all reducers

How can I send a specific record to all my reducers? I know the Partitioner class and what it does, but I don't see any easy way of making sure a record goes to all the reducers. Basically, the Partitioner has this method: int getPartition(K2…

hadoop mapreduce partitioning reduce partitioner

asked Aug 22 '12 at 23:36

Razvan

9,925
6
38
51

4

votes

3 answers

Using a partitioner in C# to parallel query a REST-API with pagination

I was wondering if my approach is good to query a REST-API in parallel because there is a limit on how many results can be obtained with one request (1000). To speed up things I want to do this in parallel. The idea is to use a partitioner to create…

c# parallel-processing .net-7.0 plinq partitioner

asked Mar 23 '23 at 17:42

Alexander Schmidt

41
2

4

votes

1 answer

Why is parallel processing much slower for the first call in C#?

I am trying to process numbers as fast as possible with a C# app. I use Thread.Sleep() to simulate processing, and random numbers. I use 3 different techniques. This is the test code that I used: using System; using System.Collections.Concurrent; using…

c# parallel-processing task parallel.foreach partitioner

asked Oct 31 '17 at 12:33

Pavol

552
8
19

4

votes

1 answer

How to allow different keyspaces to use different partitioners in Cassandra?

I am new to Cassandra and have a basic question regarding its partitioners. According to the Cassandra documentation, the partitioner of a cluster should be set in the cassandra.yaml file. My question is: does this mean all keyspaces in a Cassandra…

cassandra partitioner

asked Aug 08 '13 at 06:50

keelar

5,814
7
40
79

2

votes

2 answers

The default Kafka partitioner creates hash key collisions

I have a topic with 10 partitions, and I generate events with 9 different keys: A, B, C, D, E, F, G, H, I. I've observed messages doing this: Partition 0 - (Message1, Key E), (Message2, Key I) Partition 1 - (Message3, Key F) . . Partition 7 - (Message4,…

hash apache-kafka key partitioner

asked May 31 '19 at 19:26

Dipperman

119
1
12
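
The collisions described in the question above are what one would expect from hashing a few keys into a few partitions. As a rough sketch, assuming the hash behaves like a uniform, independent random assignment (an idealization, not a statement about any specific hash function), the chance that 9 keys land in 9 different partitions out of 10 can be computed directly. The function name is hypothetical.

```python
from math import perm


def prob_all_distinct(num_keys, num_partitions):
    """Probability that every key gets its own partition.

    Assumes each key is assigned uniformly and independently at random,
    which only approximates a real hash-based partitioner.
    """
    if num_keys > num_partitions:
        return 0.0  # pigeonhole: some partition must hold two keys
    return perm(num_partitions, num_keys) / num_partitions ** num_keys


p = prob_all_distinct(9, 10)
print(f"{p:.7f}")  # roughly 0.36 %, so shared partitions are the normal case
```

Under that assumption, getting a collision-free layout would be the surprising outcome; guaranteeing one key per partition needs an explicit key-to-partition mapping rather than a hash.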

2

votes

2 answers

How to write a Kafka consumer client in Java to consume messages from multiple brokers?

I am looking for a Java client (Kafka consumer) to consume messages from multiple brokers. Please advise. Below is the code written to publish messages to multiple brokers using a simple partitioner. The topic is created with replication factor "2"…

java apache-kafka kafka-consumer-api partitioner

asked Mar 28 '17 at 12:31

Gopi

619
2
9
27

2

votes

3 answers

What's the difference between shuffle phase and combiner phase?

I'm pretty confused about the MapReduce framework, and reading about it in different sources confuses me further. This is my idea of a MapReduce job: 1. Map()-->emit 2. Partitioner (OPTIONAL) --> divide intermediate…

hadoop mapreduce combiners partitioner

asked Oct 06 '16 at 10:09

rollotommasi

461
1
6
11

2

votes

1 answer

Repartition and sort within partitions with a custom partitioner in Spark gives an array out of bounds exception

I tried to implement what is explained here. It works when I keep the number of partitions in the custom partitioner equal to one, but when I change this to any other value it gives an array out of bounds exception: Exception in thread "main"…

apache-spark partitioner

asked Jun 10 '16 at 08:50

deenbandhu

599
5
18

2

votes

0 answers

How to avoid input traffic increase for Kafka brokers when using a custom partitioner?

In order to smooth traffic between all Kafka partitions, I tried to make a custom partitioner (extending kafka.producer.Partitioner) on my producers to replace the default partitioner, which only changes partitions every 10 minutes. My partitioner uses a…

apache-kafka partitioner

asked Jun 01 '15 at 12:30

Sebastien Falquier

21
2
