
I am new to the Scalding world. My Scalding job has multiple stages, and I need to tune each stage individually.

I have found that the number of reducers can be changed with withReducers, and I can set the split size for the input data through the job config. However, I haven't found any way to change the number of mappers for my sub-tasks on the fly.
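For context, here is a minimal sketch of what withReducers looks like in a Scalding job, assuming the TypedPipe API; the job name, paths, and reducer count are illustrative. withReducers applies only to the grouped step it is called on, which is why it can tune one stage at a time:

```scala
import com.twitter.scalding._

// Hypothetical word-count job; withReducers(16) asks for 16 reduce
// tasks for the sumByKey stage only, not for the whole job.
class WordCountJob(args: Args) extends Job(args) {
  TypedPipe.from(TextLine(args("input")))
    .flatMap(_.split("\\s+"))
    .map(word => (word, 1L))
    .sumByKey
    .withReducers(16)
    .write(TypedTsv[(String, Long)](args("output")))
}
```

Note that there is no corresponding withMappers call, which is what prompted the question.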

Did I miss something? Does anyone know how to specify the number of mappers for my sub-tasks? Thanks.

WarfDog

1 Answer


I got some answers/ideas that might be helpful for others with the same question.

Reducers are much easier to control than mappers.

The number of map tasks is decided by Hadoop, and there is no similarly simple knob. You can only set config parameters that give Hadoop a hint of how many map tasks to launch.
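To see why the config parameters are only a hint: with FileInputFormat, Hadoop clamps the block size between the configured minimum and maximum split sizes, and the number of map tasks is roughly the total input size divided by the resulting split size. A plain-Scala sketch of that arithmetic (the function names are mine, not Hadoop API):

```scala
// Split size = max(minSize, min(maxSize, blockSize)), as in
// Hadoop's FileInputFormat.computeSplitSize.
def computeSplitSize(blockSize: Long, minSize: Long, maxSize: Long): Long =
  math.max(minSize, math.min(maxSize, blockSize))

// Roughly one map task per split (ceiling division).
def estimatedMappers(totalBytes: Long, splitSize: Long): Long =
  (totalBytes + splitSize - 1) / splitSize

val block = 128L * 1024 * 1024  // a typical 128 MB HDFS block

// Raising the minimum split size above the block size gives fewer, larger splits:
val fewer = computeSplitSize(block, 256L * 1024 * 1024, Long.MaxValue)

// Lowering the maximum split size below the block size gives more, smaller splits:
val more = computeSplitSize(block, 1L, 64L * 1024 * 1024)
```

So you steer the mapper count indirectly, by moving the min/max split-size bounds rather than by naming a number of mappers outright.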

This Stack Overflow question may be helpful: Setting the number of map tasks and reduce tasks
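As one possible way to pass those hints per job, the split-size parameters can be supplied on the command line when launching a Scalding job through Hadoop's generic options. This is a sketch; the jar name and job class are placeholders, and the old mapred.* property names are shown (newer Hadoop versions use mapreduce.input.fileinputformat.split.minsize/maxsize):

```shell
# Pin both min and max split size to 256 MB for this run,
# which roughly halves the mapper count for 128 MB blocks.
hadoop jar my-scalding-assembly.jar com.twitter.scalding.Tool \
  -Dmapred.min.split.size=268435456 \
  -Dmapred.max.split.size=268435456 \
  MyJob --hdfs --input in/ --output out/
```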

One workaround I can think of is splitting your big job into smaller ones, so that you can individually tweak the input size (and hence the number of mappers) of each one.
