Erlang - too many processes

Question

These are my first steps in Erlang, so sorry for the newbie question :) I'm spawning a new Erlang process for every Redis request, which is not what I want ("Too many processes" at 32k Erlang processes). How can I throttle the number of processes to, e.g., a maximum of 16?

-module(queue_manager).
-export([add_ids/0, add_id/2]).

add_ids() ->
    {ok, Client} = eredis:start_link(),
    do_spawn(Client, lists:seq(1,100000)).

do_spawn(Client, [H|T]) ->
    Pid = spawn(?MODULE, add_id, [Client, H]),
    do_spawn(Client, T);

do_spawn(_, []) -> none.

add_id(C, Id) ->
    {ok, _} = eredis:q(C, ["SADD", "todo_queue", Id]).

Optionally you can increase the number of processes using the +P flag: http://erlang.org/pipermail/erlang-questions/2002-December/006329.html — hexist, Oct 23 '12 at 15:23

lastcanal · Accepted Answer · 2012-10-24T20:29:43.817

Try using the Erlang pg2 module. It allows you to easily create process groups and provides an API to get the 'closest' (or a random) PID in the group.

Here is an example of a process group for the eredis client:

-module(redis_pg).

-export([create/1,
         add_connections/1, 
         connection/0,
         connections/0,
         q/1]).

create(Count) ->
    % create process group using the module name as the reference
    pg2:create(?MODULE),
    add_connections(Count).

% recursive helper for adding +Count+ connections
add_connections(Count) when Count > 0 ->
    ok = add_connection(),
    add_connections(Count - 1);
add_connections(_Count) -> 
    ok.

add_connection() ->
    % start redis client connection
    {ok, RedisPid} = eredis:start_link(),
    % join the redis connection PID to the process group
    pg2:join(?MODULE, RedisPid).

connection() ->
    % get a random redis connection PID
    pg2:get_closest_pid(?MODULE).

connections() ->
    % get all redis connection PIDs in the group
    pg2:get_members(?MODULE).

q(Argv) ->
    % execute redis command +Argv+ using random connection
    eredis:q(connection(), Argv).

Here is an example of the above module in action:

1> redis_pg:create(16).
ok
2> redis_pg:connection().
<0.68.0>
3> redis_pg:connection().
<0.69.0>
4> redis_pg:connections().
[<0.53.0>,<0.56.0>,<0.57.0>,<0.58.0>,<0.59.0>,<0.60.0>,
 <0.61.0>,<0.62.0>,<0.63.0>,<0.64.0>,<0.65.0>,<0.66.0>,
 <0.67.0>,<0.68.0>,<0.69.0>,<0.70.0>]
5> redis_pg:q(["PING"]).  
{ok,<<"PONG">>}

You might want to add supervision to recover crashed connections. How do you make sure the group member you get is not busy? You are getting closer to a connection pool which manages both. — Tilman, Oct 25 '12 at 15:30
It is a good idea to monitor the eredis connections for 'DOWN' messages so you can re-add a connection to the group. If you want the least busy connection, you can modify the `connection` function to iterate over all the pg members and find the process with the smallest queue using `erlang:process_info(Pid, message_queue_len)`. — lastcanal, Oct 25 '12 at 16:35
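
The least-busy selection described in the comment above could be sketched roughly as follows. The module name `least_busy` and the function `pick/1` are hypothetical and not part of eredis or pg2. The sketch assumes all PIDs are local, since `erlang:process_info/2` fails with `badarg` for PIDs on other nodes.

```erlang
-module(least_busy).
-export([pick/1]).

%% Return the pid in Pids with the shortest message queue,
%% or {error, no_process} if none of them is alive.
%% Assumption: every pid is local to this node.
pick(Pids) ->
    %% process_info/2 returns undefined for dead processes;
    %% the pattern in the generator silently skips those.
    Lens = [{Len, Pid}
            || Pid <- Pids,
               {message_queue_len, Len} <-
                   [erlang:process_info(Pid, message_queue_len)]],
    case Lens of
        [] ->
            {error, no_process};
        _ ->
            %% Tuples compare element by element, so the
            %% smallest queue length wins.
            {_Len, Best} = lists:min(Lens),
            Best
    end.
```

In `redis_pg`, `connection/0` could then be written as `least_busy:pick(pg2:get_members(?MODULE))`, provided the group exists. Queue length is only a snapshot, so this is a heuristic rather than a guarantee that the chosen connection is idle.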

score 2 · Answer 2 · edited May 23 '17 at 12:09

You could use a connection pool, e.g., eredis_pool. This is a similar question which might be interesting for you.

answered Oct 23 '12 at 16:05 by Tilman

score 1 · Answer 3 · answered Oct 23 '12 at 16:45

You can use a supervisor to launch each new process (for your example it seems that you should use a simple_one_for_one strategy):

supervisor:start_child(SupRef, ChildSpec) -> startchild_ret().

You can then get the process count with the function

supervisor:count_children(SupRef) -> PropListOfCounts.

The result is a proplist of the form

[{specs,N1},{active,N2},{supervisors,N3},{workers,N4}] (the order is not guaranteed!)

If you want more information about active processes, you can also use

supervisor:which_children(SupRef) -> [{Id, Child, Type, Modules}]

but this is not recommended when a supervisor manages a "large" number of children.
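
A minimal sketch of how the supervisor approach could enforce a cap, assuming OTP 18 or later for the map-based child spec. The module name `capped_sup`, the functions `start_worker/1` and `run/1`, and the limit of 16 are illustrative choices, not something from the answer. Note that for a simple_one_for_one supervisor the second argument to `supervisor:start_child/2` is a list of extra arguments appended to the child's start arguments, not a full child spec.

```erlang
-module(capped_sup).
-behaviour(supervisor).
-export([start_link/0, start_worker/1, run/1, init/1]).

-define(MAX_WORKERS, 16).  % hypothetical limit

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5, period => 10},
    %% temporary: finished workers are dropped, never restarted
    Child = #{id => worker,
              start => {?MODULE, run, []},
              restart => temporary,
              type => worker},
    {ok, {SupFlags, [Child]}}.

%% Start a worker running Fun unless the cap is reached.
start_worker(Fun) ->
    Counts = supervisor:count_children(?MODULE),
    case proplists:get_value(active, Counts) of
        N when N < ?MAX_WORKERS ->
            %% [Fun] is appended to the start args: run(Fun)
            supervisor:start_child(?MODULE, [Fun]);
        _ ->
            {error, limit_reached}
    end.

%% Child start function; must return {ok, Pid} of a linked process.
run(Fun) ->
    {ok, proc_lib:spawn_link(Fun)}.
```

The count check and the start are two separate calls, so concurrent callers could briefly exceed the limit. If that matters, the check would need to be serialized through a single process.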

score 1 · Answer 4 · answered Oct 24 '12 at 14:05

You are basically "on your own" when you implement limits. There are certain tools which will help you, but I think the general question "how do I avoid spawning too many processes?" still holds. The trick is to keep track of the process count somewhere.

An Erlang-idiomatic way would be to have a process which holds a counter. Whenever you want to spawn a new process, you ask it for permission by requesting a token. You then wait for the counting process to respond.

The counting process is then a nice modular guy maintaining a limit for you.
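
One way such a counting process could look, as a rough sketch rather than a definitive implementation. The module name `limiter` and its functions `start/1`, `acquire/1`, `release/1` and `run/2` are made up for illustration. Callers that arrive when no token is free are queued and granted a token in order as others are released.

```erlang
-module(limiter).
-export([start/1, acquire/1, release/1, run/2]).

%% Start a counting process that hands out at most Max tokens.
start(Max) when is_integer(Max), Max > 0 ->
    spawn(fun() -> loop(Max, queue:new()) end).

%% Block the caller until a token is available.
acquire(Limiter) ->
    Ref = make_ref(),
    Limiter ! {acquire, self(), Ref},
    receive {granted, Ref} -> ok end.

%% Give a token back.
release(Limiter) ->
    Limiter ! release,
    ok.

%% Wait for a token, then spawn Fun; the token is returned
%% when Fun returns or raises an exception.
run(Limiter, Fun) ->
    ok = acquire(Limiter),
    spawn(fun() ->
              try Fun() after release(Limiter) end
          end).

loop(Free, Waiting) ->
    receive
        {acquire, From, Ref} when Free > 0 ->
            From ! {granted, Ref},
            loop(Free - 1, Waiting);
        {acquire, From, Ref} ->
            %% no token free: remember the caller
            loop(Free, queue:in({From, Ref}, Waiting));
        release ->
            case queue:out(Waiting) of
                {{value, {From, Ref}}, Rest} ->
                    %% pass the token straight to the next waiter
                    From ! {granted, Ref},
                    loop(Free, Rest);
                {empty, _} ->
                    loop(Free + 1, Waiting)
            end
    end.
```

Applied to the question's code, `do_spawn/2` could call `limiter:run(L, fun() -> add_id(Client, H) end)` with `L = limiter:start(16)` instead of calling `spawn` directly. A worker killed by an exit signal would not return its token in this sketch; handling that would require the counting process to monitor its token holders.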
