These are my first steps in Erlang, so sorry for this newbie question :) I'm spawning a new Erlang process for every Redis request, which is not what I want (I get "Too many processes" at 32k Erlang processes). How can I throttle the number of processes to, e.g., a maximum of 16?

-module(queue_manager).
-export([add_ids/0, add_id/2]).

add_ids() ->
    {ok, Client} = eredis:start_link(),
    do_spawn(Client, lists:seq(1,100000)).

do_spawn(Client, [H|T]) ->
    spawn(?MODULE, add_id, [Client, H]),
    do_spawn(Client, T);

do_spawn(_, []) -> none.

add_id(C, Id) ->
    {ok, _} = eredis:q(C, ["SADD", "todo_queue", Id]).
ctp
    Optionally you can increase the number of processes using the +P flag: http://erlang.org/pipermail/erlang-questions/2002-December/006329.html – hexist Oct 23 '12 at 15:23

4 Answers

Try using the Erlang pg2 module. It allows you to easily create process groups and provides an API to get the 'closest' (or a random) PID in the group.

Here is an example of a process group for the eredis client:

-module(redis_pg).

-export([create/1,
         add_connections/1, 
         connection/0,
         connections/0,
         q/1]).

create(Count) ->
    % create process group using the module name as the reference
    pg2:create(?MODULE),
    add_connections(Count).

% recursive helper for adding +Count+ connections
add_connections(Count) when Count > 0 ->
    ok = add_connection(),
    add_connections(Count - 1);
add_connections(_Count) -> 
    ok.

add_connection() ->
    % start redis client connection
    {ok, RedisPid} = eredis:start_link(),
    % join the redis connection PID to the process group
    pg2:join(?MODULE, RedisPid).

connection() ->
    % get a random redis connection PID
    pg2:get_closest_pid(?MODULE).

connections() ->
    % get all redis connection PIDs in the group
    pg2:get_members(?MODULE).

q(Argv) ->
    % execute redis command +Argv+ using random connection
    eredis:q(connection(), Argv).

Here is an example of the above module in action:

1> redis_pg:create(16).
ok
2> redis_pg:connection().
<0.68.0>
3> redis_pg:connection().
<0.69.0>
4> redis_pg:connections().
[<0.53.0>,<0.56.0>,<0.57.0>,<0.58.0>,<0.59.0>,<0.60.0>,
 <0.61.0>,<0.62.0>,<0.63.0>,<0.64.0>,<0.65.0>,<0.66.0>,
 <0.67.0>,<0.68.0>,<0.69.0>,<0.70.0>]
5> redis_pg:q(["PING"]).
{ok,<<"PONG">>}
lastcanal
  • You might want to add supervision to recover crashed connections. How do you make sure the group member you get is not busy? You are getting closer to a connection pool which manages both. – Tilman Oct 25 '12 at 15:30
  • It is a good idea to add process monitoring to the eredis connections for 'DOWN' messages so you can re-add a connection to the group. If you want the least busy connection, you can modify the `connection` function to iterate over all the pg members and find the process with the smallest queue using `erlang:process_info(Pid, message_queue_len)`. – lastcanal Oct 25 '12 at 16:35
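
The "least busy" idea from the comment above could look roughly like the sketch below. The module name `least_busy` and the function `pick/1` are made up for illustration and are not part of eredis or pg2. It assumes all pids are local to the node (`erlang:process_info/2` raises `badarg` for remote pids) and silently skips processes that have already died.

```erlang
-module(least_busy).
-export([pick/1]).

%% Hypothetical helper: return {ok, Pid} for the live process with the
%% shortest message queue, or {error, no_process} if none is alive.
pick(Pids) ->
    %% process_info/2 returns undefined for dead processes; the pattern
    %% in the generator filters those out.
    Lens = [{Len, Pid}
            || Pid <- Pids,
               {message_queue_len, Len}
                   <- [erlang:process_info(Pid, message_queue_len)]],
    case Lens of
        [] ->
            {error, no_process};
        _ ->
            %% Tuples compare element by element, so the smallest Len wins.
            {_Len, Pid} = lists:min(Lens),
            {ok, Pid}
    end.
```

With the `redis_pg` module from the answer, `connection/0` could then call `least_busy:pick(pg2:get_members(?MODULE))` instead of `pg2:get_closest_pid/1`. Queue length is only a rough proxy for load, and the value can be stale by the time the request is sent.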

You could use a connection pool, e.g., eredis_pool. This is a similar question which might be interesting for you.

Tilman

You can use a supervisor to launch each new process (for your example it seems that you should use a simple_one_for_one strategy):

supervisor:start_child(SupRef, ChildSpec) -> startchild_ret().

You can then get the process count using the function

supervisor:count_children(SupRef) -> PropListOfCounts.

The result is a proplist of the form

[{specs,N1},{active,N2},{supervisors,N3},{workers,N4}] (the order is not guaranteed!)

If you want more information about active processes, you can also use

supervisor:which_children(SupRef) -> [{Id, Child, Type, Modules}]

but this is not recommended when a supervisor manages a "large" number of children.
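
As a rough sketch of how these calls could be combined into a limit: the module below is a simple_one_for_one supervisor that refuses to start a worker once `Max` children are active. The names `limited_sup`, `start_limited/2` and `run/1` are invented for this example, and map-based child specs assume OTP 18 or later. Note that the check and the start are two separate calls, so several concurrent callers could briefly exceed the limit.

```erlang
-module(limited_sup).
-behaviour(supervisor).
-export([start_link/0, start_limited/2, init/1, run/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 10,
                 period => 10},
    %% temporary: finished or crashed workers are removed, not restarted.
    Child = #{id => worker,
              start => {?MODULE, run, []},
              restart => temporary,
              type => worker},
    {ok, {SupFlags, [Child]}}.

%% Hypothetical helper: start a worker running Fun unless Max workers
%% are already active. The check-then-start is not atomic.
start_limited(Fun, Max) ->
    Counts = supervisor:count_children(?MODULE),
    case proplists:get_value(active, Counts) of
        Active when Active < Max ->
            %% For simple_one_for_one, the list is appended to the
            %% start arguments, so this calls run(Fun).
            supervisor:start_child(?MODULE, [Fun]);
        _ ->
            {error, limit_reached}
    end.

%% Child start function; it runs inside the supervisor process, so
%% spawn_link/1 links the worker to the supervisor as required.
run(Fun) ->
    {ok, spawn_link(Fun)}.
```

A caller that gets `{error, limit_reached}` still has to decide what to do (retry later, queue the job, or drop it); the supervisor only does the counting.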

Pascal

You are basically "on your own" when you implement limits. There are certain tools which will help you, but I think the general question "how do I avoid spawning too many processes?" still holds. The trick is to keep track of the process count somewhere.

An Erlang-idiomatic way would be to have a process which contains a counter. Whenever you want to spawn a new process, you ask it if you are allowed to do so by registering a need for tokens against it. You then wait for the counting process to respond back to you.

The counting process is then a nice modular guy maintaining a limit for you.
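
A minimal sketch of such a counting process, under my own assumptions: the module name `spawn_limiter` and its functions are made up for illustration. The limiter hands out at most `Max` tokens; `run/2` blocks the caller until a token is free, then spawns the job, which returns its token when it finishes. One known gap: a worker that is killed with an untrappable exit never sends `release`, so a production version would have the limiter monitor its workers.

```erlang
-module(spawn_limiter).
-export([start/1, run/2]).

%% Start a limiter process that allows at most Max concurrent jobs.
start(Max) when is_integer(Max), Max > 0 ->
    spawn(fun() -> loop(Max, queue:new()) end).

%% Block until a token is granted, then run Fun in a new process.
%% The token is returned when Fun returns or raises an exception.
run(Limiter, Fun) ->
    Ref = make_ref(),
    Limiter ! {acquire, self(), Ref},
    receive {granted, Ref} -> ok end,
    spawn(fun() ->
                  try Fun() after Limiter ! release end
          end).

%% Free is the number of unused tokens; Waiting holds blocked callers.
loop(Free, Waiting) ->
    receive
        {acquire, From, Ref} when Free > 0 ->
            From ! {granted, Ref},
            loop(Free - 1, Waiting);
        {acquire, From, Ref} ->
            loop(Free, queue:in({From, Ref}, Waiting));
        release ->
            case queue:out(Waiting) of
                {{value, {From, Ref}}, Rest} ->
                    %% Hand the released token straight to a waiter.
                    From ! {granted, Ref},
                    loop(Free, Rest);
                {empty, _} ->
                    loop(Free + 1, Waiting)
            end
    end.
```

Applied to the question, `do_spawn/2` would call `spawn_limiter:run(Limiter, fun() -> add_id(Client, H) end)` instead of `spawn/3`, with `Limiter = spawn_limiter:start(16)` created once up front.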

I GIVE CRAP ANSWERS