How to find the keys of the largest values in Hash R?

Question

I have a hash that looks like the example below; I created it with the hash package.

How can I return the keys of the maximum values in R?

Input hash table:

h <- hash( keys=c(1,4,5,6), values=c(30,25,25,30) )
# <hash> containing 4 key-value pair(s).
#   1 : 30
#   4 : 25
#   5 : 25
#   6 : 30

What is your original structure? A data.frame? If so, what are the names of the columns? — Colonel Beauvel, May 11 '15 at 15:44
@ColonelBeauvel, sorry, I don't understand. What do you mean by original structure? I created this hash from two other hashes, and I could say it is a data-frame-like structure. — academic.user, May 11 '15 at 15:52
the figures you gave. They are contained in a variable/object. What is the name of this object? From what you gave this is not a basic R structure, so you must give additional detail when you ask a question ... — Colonel Beauvel, May 11 '15 at 15:53
@ColonelBeauvel, this is a `hash`, `H`. I created it with the `hash` package, documented [here](http://cran.r-project.org/web/packages/hash/hash.pdf). — academic.user, May 11 '15 at 15:53
@ColonelBeauvel is saying that you have only provided output. Typically, if you are asking for help from us volunteers, you should make it as easy for us as possible. One method ([strongly encouraged](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)) is to provide a MWE using actual input that we can see, copy, use, etc. — r2evans, May 11 '15 at 16:34
@academic.user: You are declaring your hash incorrectly. You should not need to use `.set` directly. I have edited your response to use the `hash` function directly. The use of `keys` and `values` is not needed. Generally `hash` does the right thing. — ctbrown, May 12 '15 at 06:44
@ctbrown, are there any side effects if we use `.set`? — academic.user, May 12 '15 at 09:28
No there are not any side-effects, but `.set` is not a defined part of the interface so there is no guarantee that it won't change in future revisions of the package. Other methods in the package are unlikely to change either in behavior or syntax. — ctbrown, May 12 '15 at 10:30

bergant · Answer 1 · 2015-05-11T16:26:09.577

2

For simple values (vectors of length 1) this works:

H <- hash(a = 5, b = 2, c = 3, d = 5)
H

# <hash> containing 4 key-value pair(s).
#   a : 5
#   b : 2
#   c : 3
#   d : 5

val <- unlist(as.list(H))  # convert to list and to named vector
names(val[val == max(val)])

# [1] "a" "d"

edited May 11 '15 at 16:26

answered May 11 '15 at 16:18

bergant


@bergant: Be careful here. Hashes do not require values to be scalar/atomic values. Hash values can be recursive (non-atomic) objects. In that case, the solution, though accurate for the question, won't work generally. – ctbrown May 12 '15 at 06:48
Please see my elaboration on @bergant's answer below. – ctbrown May 12 '15 at 07:15

score 1 · Accepted Answer · answered May 12 '15 at 07:14

Full disclosure: I authored and maintain the hash package.

Unless you have a hash with many key-value pairs and need the performance, standard R vectors with names will likely be a better solution. Here is one example:

v <- c(a = 5, b = 2, c = 3, d = 5)
names( v[ v==max(v) ] )

Native R vectors will outperform hashes until the structure grows beyond ~200 key-value pairs. (It has been a while since I benchmarked hash, vector and list lookup performance.)

If a hash fits the solution, the answer by @bergant solves the OP's question, though please understand that it is rather dangerous. Converting a hash to a list and then using unlist ignores the fact that hash values are not constrained to be scalar/atomic values. They can be any R object. Consider:

 > hash(a = 1:5, b = 2, c = 3, d=5)
 <hash> containing 4 key-value pair(s).
 a : 1 2 3 4 5
 b : 2
 c : 3
 d : 5

You can decide whether this is a problem for your application or not.
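To make the risk concrete, here is a minimal base-R sketch. It assumes a plain named list standing in for what `as.list(H)` would give for the hash above, so it runs without the hash package; the variable names are only illustrative.

```r
# Plain named list standing in for as.list(H); one value is a vector.
lst <- list(a = 1:5, b = 2, c = 3, d = 5)

# unlist() flattens the vector under "a" and invents names a1..a5.
val <- unlist(lst)
names(val)
# [1] "a1" "a2" "a3" "a4" "a5" "b"  "c"  "d"

# The "keys" of the maximum now include a name that is not a key at all.
keys_max <- names(val[val == max(val)])
keys_max
# [1] "a5" "d"
```

Note that `"a5"` is not a key of the original structure, so any later lookup with it would fail.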

A simpler, higher-performing and more general approach is to use the `values` function. In the simple case where all values are scalar/atomic values, this closely mirrors @bergant's solution.

H <- hash(a = 5, b = 2, c = 3, d = 5)
val <- values(H)     # Compare to `unlist(as.list(H))`
names( val[ val == max(val) ] )

Because values returns a named list when the values are not all scalars, instead of flattening them the way unlist does, it sets up the more general solution: we can select one value to compare from each key-value pair:

H <- hash(a = 1:5, b = 2, c = 3, d=5)
val <- values(H)

# Alternate 1: Compare max from each value
val <- sapply(val, max )

# Alternate 2: Compare first element from each value 
# val <- sapply(val, function(x) x[[1]])

names( val[ val == max(val) ] )
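The two alternates can be folded into one small helper. This is only a sketch: `keys_of_max` and its `pick` argument are hypothetical names, not part of the hash package, and the input is assumed to be a named list (or named vector) of numeric values such as the one `values(H)` gives above.

```r
# Hypothetical helper: reduce each value to one number with `pick`,
# then return the names whose reduced value equals the overall maximum.
keys_of_max <- function(vals, pick = max) {
  # vapply() insists on exactly one numeric per key, so a bad `pick` fails loudly.
  scores <- vapply(vals, function(x) as.numeric(pick(x)), numeric(1))
  names(scores)[scores == max(scores)]
}

# Named list standing in for values(H) with a non-scalar value under "a".
vals <- list(a = 1:5, b = 2, c = 3, d = 5)

by_max   <- keys_of_max(vals)                                # Alternate 1
by_first <- keys_of_max(vals, pick = function(x) x[[1]])     # Alternate 2
by_max
# [1] "a" "d"
by_first
# [1] "d"
```

Using `vapply` rather than `sapply` is a design choice here: it guarantees a numeric vector even when the input is empty or `pick` returns something unexpected.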

I hope that helps.

Would you please take a look at [this](http://stackoverflow.com/questions/30165162/writing-a-r-hashtable-in-to-a-csv-file?noredirect=1#comment48441273_30165162) question as well? — academic.user, May 12 '15 at 09:50
@ctbrown Hi, is it possible to use vectors and names to match a pair of keys with a single value? Or would the hash package be a better option for such a scenario? Thanks in advance. — savi, Mar 17 '20 at 13:38
@shana, so you are trying to use multiple keys to identify a given hash value? — ctbrown, Mar 17 '20 at 23:52
@ctbrown I am not sure the value can be called a hash value. Let me provide some context. My requirement is this: scores are assigned to pairs of identities; for example, a pair of names is assigned one score. I want to store this in the form (name1, name2) -> 4, (name2, name5) -> 5, etc., in a data structure from which I can later extract (keys, value) by lowest or highest value in O(log n) time. The size of this data structure will not exceed 100 or 200. I was thinking of a max-heap or a min-heap, but I am not sure whether R has corresponding implementations. — savi, Mar 18 '20 at 09:30
