How to retrieve a list of ets keys without scanning entire table?

Question

I'm using ets via elixir as a simple in-memory persistence layer to store and retrieve keys and also for the occasional foldl which involves reducing many duplicate keys that have different values. I am using the bag option.

Is there a simple, perhaps O(1) way to retrieve a list of just the current keys without having to do a more involved table traversal or match or foldl?

Erlang or Elixir syntax responses welcome.

:ets.new(:cache, [:bag, :named_table, :protected])

I have a static map of atom keys indexed by integer I am using to assist with insertions. But not all the keys are used..

chunk_key_map = %{2 => :chunk_key_2, ..... 28 => :chunk_key_28}

If there's no quick way, I am aware I can do an ets:lookup trying each of my static atom key values and testing for != [] and generating my own list, but wanted to see if ets supports such a feature.

Thanks

score 5 · Answer 1 · edited Dec 10 '17 at 12:22

5

Thanks, that put me on the right track :)

Same thing but passing the previous key as accumulator:

def key_stream(table_name) do
  Stream.resource(
    fn -> :ets.first(table_name) end,
    fn :"$end_of_table" -> {:halt, nil}
       previous_key -> {[previous_key], :ets.next(table_name, previous_key)} end,
    fn _ -> :ok end)
end

edited Dec 10 '17 at 12:22

link0ff

2,744
1
22
15

answered May 08 '17 at 08:17

ybycode

171
3
17

bibekp · Accepted Answer · 2016-02-02T23:52:06.690

So I didn't find the ets technique, but rather implemented the key list retrieve code in constant time in elixir as my key map is static.

    list = Enum.reduce(2..28, [], fn head, acc -> 
            case :ets.lookup(:cache, Map.get(chunk_key_map, head)) do
                [] -> acc
                _ -> [acc, head]
            end
        end)

    List.flatten(list)

UPDATE: Based on the replies, I took Hamidreza's ets traversal logic and wrapped it into an Elixir Stream using Stream.resource/3.

defp get_ets_keys_lazy(table_name) when is_atom(table_name) do
    eot = :"$end_of_table"

    Stream.resource(
        fn -> [] end,

        fn acc ->
            case acc do
                [] -> 
                    case :ets.first(table_name) do
                        ^eot -> {:halt, acc}
                        first_key -> {[first_key], first_key}                       
                    end

                acc -> 
                    case :ets.next(table_name, acc) do  
                        ^eot -> {:halt, acc}
                        next_key -> {[next_key], next_key}
                    end
            end
        end,

        fn _acc -> :ok end
    )
end

Then I ran the stream through a pipeline

get_ets_keys_lazy(table_name) 
    |> Stream.map(lambda1) 
    |> Stream.each(lambda2)
    |> Stream.run

The problem with this code is that you end up copying all the elements in the ETS table into your process. Remember that ETS tables are kept "outside" of all process so all access to ETS tables is through copying. One major reason for `match`, `match_object` and `select` is that it allows you to be more specific in what elements are copied, testing can be done while the element data is still "in the table". — rvirding, Feb 01 '16 at 13:53
Thank you Robert. I am assuming you are only referring to the ets.lookup code segment. Not the ets.first,next snippet as well above? This is very helpful. What then is the appropriate match or select clause for key retrieval? — bibekp, Feb 01 '16 at 17:46
Yes, using first/next means that you will access every element, though it does mean that you maybe able to only copy the keys. — rvirding, Feb 05 '16 at 14:12

score 2 · Answer 3 · answered Feb 01 '16 at 08:00

2

For getting a list of keys in ets without touching their values you can use the combination of ets:first/1 and ets:next/2 functions this way:

-export([keys/1]).

keys(TableName) ->
    FirstKey = ets:first(TableName),
        keys(TableName, FirstKey, [FirstKey]).

keys(_TableName, '$end_of_table', ['$end_of_table'|Acc]) ->
    Acc;
keys(TableName, CurrentKey, Acc) ->
    NextKey = ets:next(TableName, CurrentKey),
    keys(TableName, NextKey, [NextKey|Acc]).

The exported API is keys/1. It gets the table name, fetches the first key of it, initiates an accumulator as state and internally calls keys/2 which fetches other keys and accumulating them in a recursive manner.

Note that bag tables type do not have order, so if your table type is bag the return value of keys/1 wouldn't be ordered.

answered Feb 01 '16 at 08:00

Hamidreza Soleimani

2,504
16
19

Since bags are allowed to have duplicate keys but with different values, will this return all the duplicate keys? – bibekp Feb 01 '16 at 18:35
I verified that it does return only the unique keys – bibekp Feb 02 '16 at 04:19
@Heron28 Actually there is no duplicate key in `bag` and `duplicate_bag` tables. The `bag` tables can have many unique values per key and `duplicate_bag` tables can have many duplicated values per key. – Hamidreza Soleimani Feb 02 '16 at 05:05

How to retrieve a list of ets keys without scanning entire table?

3 Answers3