Working with dictionaries/lists to get list of keys

Question

I have a trivial question: I couldn't find a dictionary data structure in R, so I used a list instead (like "word" -> number). How do I get the list of keys?

score 130 · Accepted Answer · answered May 18 '10 at 14:17

Yes, the list type is a good approximation. You can use names() on your list to set and retrieve the 'keys':

> foo <- vector(mode="list", length=3)
> names(foo) <- c("tic", "tac", "toe")
> foo[[1]] <- 12; foo[[2]] <- 22; foo[[3]] <- 33
> foo
$tic
[1] 12

$tac
[1] 22

$toe
[1] 33

> names(foo)
[1] "tic" "tac" "toe"
>
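As a side note, the same structure can also be built in a single call. This is only a minimal sketch; the name `bar` is made up for illustration and is not part of the answer above.

```r
# Build the named list in one call instead of filling an empty one
bar <- list(tic = 12, tac = 22, toe = 33)

# The "keys" are simply the names of the list
keys <- names(bar)
print(keys)

# Look up a single value by its key
print(bar[["tac"]])
```

Whether you fill an empty list or create it in one go, `names()` returns the keys in insertion order.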

Dirk Eddelbuettel

+1 for answering the question without a word about the OP's ineffective approach. – Marek May 19 '10 at 12:47

Depending on the intended use of a list as a proxy for a dictionary, it might be prudent to keep in mind that "key" lookup for lists is O(n) rather than O(1), which is what you'd expect for a dictionary (which hashes keys). – egnha Aug 26 '18 at 22:04

Yes, the `environment` type is used for that in R, but it is less common / less known. – Dirk Eddelbuettel Aug 26 '18 at 22:07

score 64 · Answer 2 · answered May 19 '10 at 11:55

You do not even need lists if your "number" values are all of the same mode. If I take Dirk Eddelbuettel's example:

> foo <- c(12, 22, 33)
> names(foo) <- c("tic", "tac", "toe")
> foo
tic tac toe
 12  22  33
> names(foo)
[1] "tic" "tac" "toe"

Lists are only required if your values are either of mixed mode (for example characters and numbers) or vectors.

For both lists and vectors, an individual element can be subsetted by name:

> foo["tac"]
tac 
 22

Or for a list:

> foo[["tac"]]
[1] 22
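The difference between single and double brackets matters most for lists, so here is a small sketch of it. The name `foo_list` is only an illustrative list version of the example above.

```r
foo_list <- list(tic = 12, tac = 22, toe = 33)

# Single brackets keep the container: the result is a list of length 1
sub <- foo_list["tac"]
print(is.list(sub))

# Double brackets extract the element itself
val <- foo_list[["tac"]]
print(is.numeric(val))

# With a list, [[ ]] on a key that does not exist yields NULL
missing_val <- foo_list[["nope"]]
print(is.null(missing_val))
```

Note that a named atomic vector behaves differently for missing keys: `[[` raises a "subscript out of bounds" error there instead of returning NULL.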

Calimo

How can you get the list `c(12,22,33)` out of this dictionary-style R structure foo? `unlist(lapply(FUN=function(a){foo[[a]]},X = 1:length(foo)))` is very inconvenient. Any ready function for this? Moved the question [here](https://stackoverflow.com/questions/45460273/working-with-dictionaries-lists-in-r-how-to-get-the-values-from-the-key-value-p) – hhh Aug 02 '17 at 12:02

score 20 · Answer 3 · edited Aug 28 '19 at 16:26

To extend Calimo's answer a little, here are a few more things you may find useful when creating these quasi-dictionaries in R:

a) how to return all the VALUES of the dictionary:

>as.numeric(foo)
[1] 12 22 33

b) check whether dictionary CONTAINS KEY:

>'tic' %in% names(foo)
[1] TRUE

c) how to ADD NEW key, value pair to dictionary:

c(foo,tic2=44)

results:

tic       tac       toe     tic2
12        22        33        44

d) how to fulfill the requirement of a REAL DICTIONARY, namely that keys CANNOT repeat (UNIQUE KEYS)? Combine b) and c) in a function that checks whether the key already exists and then does what you want: e.g. refuse the insertion, update the value if the new one differs from the old one, or rebuild the key somehow (e.g. append a number to it so that it is unique).

e) how to DELETE pair BY KEY from dictionary:

foo <- foo[names(foo) != "tac"]
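Points d) and e) can be sketched with two small helpers. `dict_set` and `dict_remove` are hypothetical names invented for this example, and "update the value" is just one of the policies mentioned in d).

```r
# Hypothetical helpers for a named-vector "dictionary"

# Insert or update: assigning by name never creates a duplicate key
dict_set <- function(dict, key, value) {
  dict[[key]] <- value   # overwrites an existing key, appends a new one
  dict
}

# Delete by key: filter on the names, not on the values
dict_remove <- function(dict, key) {
  dict[names(dict) != key]
}

foo <- c(tic = 12, tac = 22, toe = 33)
foo <- dict_set(foo, "tic2", 44)   # new key is appended
foo <- dict_set(foo, "tic", 99)    # existing key is updated in place
foo <- dict_remove(foo, "tac")     # only the "tac" entry is dropped
print(foo)
```

Filtering on `names(foo)` rather than on the values means that two keys sharing the same value are not removed together.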

Can I add key that contains spaces, something like 'strange key'? — user1700890, Mar 23 '17 at 16:45
Also something like this does not work `c(foo, tic2=NULL)`. Any work around? — user1700890, Mar 23 '17 at 17:14
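Both questions can be tried out quickly. The following is only a sketch, and `foo_list` is an illustrative name: quoted keys may contain spaces, and a NULL value needs a list rather than an atomic vector.

```r
foo <- c(tic = 12, tac = 22, toe = 33)

# Keys containing spaces work as long as they are quoted
foo <- c(foo, "strange key" = 55)
print(foo[["strange key"]])

# c(foo, tic2 = NULL) silently drops the NULL, because an atomic
# vector cannot hold it. A list can: assign list(NULL) with single brackets.
foo_list <- as.list(foo)
foo_list["tic2"] <- list(NULL)

print("tic2" %in% names(foo_list))
print(is.null(foo_list[["tic2"]]))
```

Be aware that `foo_list[["tic2"]] <- NULL` does the opposite: it removes the element from the list.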

vonjd · Answer 4 · 2019-01-29T09:11:33.520

The reason for using dictionaries in the first place is performance. Although it is correct that you can use named vectors and lists for the task, the issue is that they become quite slow and memory hungry as the data grows.

Yet what many people don't know is that R does indeed have a built-in dictionary data structure: environments with the option hash = TRUE.

See the following example for how to make it work:

# vectorize assign, get and exists for convenience
assign_hash <- Vectorize(assign, vectorize.args = c("x", "value"))
get_hash <- Vectorize(get, vectorize.args = "x")
exists_hash <- Vectorize(exists, vectorize.args = "x")

# keys and values
key<- c("tic", "tac", "toe")
value <- c(1, 22, 333)

# initialize hash
hash = new.env(hash = TRUE, parent = emptyenv(), size = 100L)
# assign values to keys
assign_hash(key, value, hash)
## tic tac toe 
##   1  22 333
# get values for keys
get_hash(c("toe", "tic"), hash)
## toe tic 
## 333   1
# alternatively:
mget(c("toe", "tic"), hash)
## $toe
## [1] 333
## 
## $tic
## [1] 1
# show all keys
ls(hash)
## [1] "tac" "tic" "toe"
# show all keys with values
get_hash(ls(hash), hash)
## tac tic toe 
##  22   1 333
# remove key-value pairs
rm(list = c("toe", "tic"), envir = hash)
get_hash(ls(hash), hash)
## tac 
##  22
# check if keys are in hash
exists_hash(c("tac", "nothere"), hash)
##     tac nothere 
##    TRUE   FALSE
# for single keys this is also possible:
# show value for single key
hash[["tac"]]
## [1] 22
# create new key-value pair
hash[["test"]] <- 1234
get_hash(ls(hash), hash)
##  tac test 
##   22 1234
# update single value
hash[["test"]] <- 54321
get_hash(ls(hash), hash)
##   tac  test 
##    22 54321
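
A typical use of such a hash is counting. The following is a minimal sketch with made-up sample data (`words`, `counts`); it uses base R's `get0()` to supply a default for keys that are not present yet.

```r
# Count word occurrences using an environment as a hash map
words <- c("tic", "tac", "tic", "toe", "tic")
counts <- new.env(hash = TRUE, parent = emptyenv())

for (w in words) {
  # get0() returns `ifnotfound` when the key is absent
  current <- get0(w, envir = counts, inherits = FALSE, ifnotfound = 0)
  assign(w, current + 1, envir = counts)
}

# Collect the result as a named vector (ls() returns the keys sorted)
result <- unlist(mget(ls(counts), envir = counts))
print(result)
```

Because environments are modified in place, the loop never copies the whole structure on each update.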

Edit: On the basis of this answer I wrote a blog post with some more context: http://blog.ephorie.de/hash-me-if-you-can

Does it work for multivalued relations? For example tic=1 and tic=17 — skan, Jul 24 '17 at 18:16
Using this approach in place of using lists with names took my running time down from 6 mins to 1 sec! I understand hashes fine, but can anyone confirm what sort of search algo is used when looking up a name in a list? Is this just iterating through the list until the name matches? I'd like to understand exactly why lists are so slow, as well as why hashes are so fast for a large number of keys. — Phil, Oct 13 '19 at 12:25
@vonjd I am trying to use dictionary in R and found this implementation. However, does it also work when each value is associated with a pair of keys? Thank you in advance. — savi, Mar 17 '20 at 13:31
@vonjd Sure, (name1,name2)->value, (name2,name9)->value. I am looking for a ds that I can store such pairs ( max 200 of them ) and extract lowest or highest in logn time. — savi, Mar 18 '20 at 09:32
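
One possible way to handle pairs of keys is to combine them into a single string. This is only a sketch: `pair_key` is a hypothetical helper, the values are made up, and it assumes the separator "|" never occurs in the names themselves.

```r
# Hypothetical helper: combine two names into one key string
pair_key <- function(a, b) paste(a, b, sep = "|")

pairs <- new.env(hash = TRUE, parent = emptyenv())
pairs[[pair_key("name1", "name2")]] <- 3.5
pairs[[pair_key("name2", "name9")]] <- 1.2

# Look up the value stored for one pair
print(pairs[[pair_key("name2", "name9")]])

# Find the pair with the lowest value: a linear scan over all entries
vals <- unlist(mget(ls(pairs), envir = pairs))
lowest <- names(vals)[which.min(vals)]
print(lowest)
```

The scan for the minimum is O(n), not O(log n); an environment does not keep its entries ordered, so a sorted structure would be needed for that requirement.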

score 11 · Answer 5 · answered Mar 11 '17 at 18:17

The package hash is now available: https://cran.r-project.org/web/packages/hash/hash.pdf

Examples

h <- hash( keys=letters, values=1:26 )
h <- hash( letters, 1:26 )
h$a
# [1] 1
h$foo <- "bar"
h[ "foo" ]
# <hash> containing 1 key-value pair(s).
#   foo : bar
h[[ "foo" ]]
# [1] "bar"

Ngọc Linh Vũ

How can you add multiple values? I've tried with repeating the key but it only stores the last value. I've also tried assigning lists but it doesn't work – skan Jul 24 '17 at 18:27
Dictionaries never store multiple values per key. You can assign a list to a key if you want. – BallpointBen Jan 29 '19 at 15:22
Such a good way of doing this! Replicates a lot of the functionality and setup that dictionaries have and is incredibly easy to implement. – Peter Maguire Apr 30 '21 at 01:12
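
To illustrate the point about several values per key, here is a sketch that stores a whole vector under each key. It uses a plain base R environment rather than the hash package, and `add_value` is a hypothetical helper written for this example.

```r
multi <- new.env(hash = TRUE, parent = emptyenv())

# Hypothetical helper: append a value to the vector stored under a key
add_value <- function(env, key, value) {
  # env[[key]] is NULL for a new key, and c(NULL, value) is just value
  env[[key]] <- c(env[[key]], value)
  invisible(env)
}

add_value(multi, "tic", 1)
add_value(multi, "tic", 17)

print(multi[["tic"]])
```

The key still maps to exactly one object; that object simply happens to be a vector holding both values.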

score 8 · Answer 6 · answered Apr 17 '16 at 19:37

Shorter variation of Dirk's answer:

# Create a Color Palette Dictionary 
> color <- c('navy.blue', 'gold', 'dark.gray')
> hex <- c('#336A91', '#F3C117', '#7F7F7F')

> # Create List
> color_palette <- as.list(hex)
> # Name List Items
> names(color_palette) <- color
> 
> color_palette
$navy.blue
[1] "#336A91"

$gold
[1] "#F3C117"

$dark.gray
[1] "#7F7F7F"
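
The two steps "create list" and "name list items" can also be combined with `setNames()` from the stats package, which is attached by default. A minimal sketch using the same data:

```r
color <- c("navy.blue", "gold", "dark.gray")
hex   <- c("#336A91", "#F3C117", "#7F7F7F")

# Create the list and attach the names in one expression
color_palette <- setNames(as.list(hex), color)

print(color_palette[["gold"]])
print(names(color_palette))
```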

score 4 · Answer 7 · answered May 21 '15 at 15:45

I'll just comment that you can also get a lot of mileage out of `table` when trying to "fake" a dictionary, e.g.

> x <- c("a","a","b","b","b","c")
> (t <- table(x))
x
a b c 
2 3 1 
> names(t)
[1] "a" "b" "c"
> o <- order(as.numeric(t))
> names(t[o])
[1] "c" "a" "b"

etc.
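
A few more dictionary-style operations on a table, as a sketch. The name `counts` is used here instead of `t` only to avoid masking the base function `t()`.

```r
x <- c("a", "a", "b", "b", "b", "c")
counts <- table(x)

# Value stored under a key
b_count <- counts[["b"]]
print(b_count)

# Membership test on the keys
has_d <- "d" %in% names(counts)
print(has_d)

# Keys ordered by value, most frequent first
by_freq <- names(sort(counts, decreasing = TRUE))
print(by_freq)
```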

Gabriel Perdue

I don't think `as.numeric()` is necessary. The table is already numeric. You can get the same result with `names(t[order(t)])` – Rich Scriven May 21 '15 at 15:48
