
I recently found a way to create a simple "seeker" for a data frame in R. For example:

data(mtcars)
mtcars <- split(mtcars, list(mtcars$cyl, mtcars$hp))

This lets us find the cars with 8 cylinders and 245 hp, and also count them:

mtcars$"8.245"
nrow(mtcars$"8.245")

This works only for exact matches. For example:

>mtcars$"4.100"
NULL
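
One way to see where the NULL comes from is to look at the names of the split list. This is a minimal sketch, assuming a fresh copy of the built-in mtcars; the name `groups` is only for illustration. No car in mtcars has exactly 100 hp, so no element is named "4.100". A combination that exists as a pair of factor levels but contains no cars comes back as a zero-row data frame rather than NULL.

```r
data(mtcars)
groups <- split(mtcars, list(mtcars$cyl, mtcars$hp))

# No element is named "4.100" because 100 is not an hp value in mtcars
"4.100" %in% names(groups)   # FALSE
any(mtcars$hp == 100)        # FALSE

# "4.110" does exist as a name (110 hp occurs, but only on 6-cylinder cars),
# so it is an empty data frame rather than NULL
nrow(groups[["4.110"]])      # 0
```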

So the question is: how can this "seeker" be improved to support operators such as >= and <=?

For example, I would like to find all observations with 4 cylinders and 100 hp or less:

mtcars$"4.<=100" # This obviously doesn't work; it only shows what I want to achieve

Any advice will be much appreciated.

Alejandro Carrera
  • If you check the `4.100` ie. list element 31, it has 0 rows i.e. the reason you get `NULL` You may need `mtcars <- split(mtcars, list(mtcars$cyl, mtcars$hp), drop = TRUE)` to drop all unused cases – akrun Feb 24 '18 at 06:16
  • Did you mean `sum(do.call(rbind, mtcars[grep("^4", names(mtcars))])$hp <= 100)`? For this you don't have to split. It can be done directly on the original dataset, i.e. `data(mtcars);sum(mtcars$hp[mtcars$cyl==4] <=100)` – akrun Feb 24 '18 at 06:26
  • That's not exactly what I want (my goal is to use the "seeker"), but I assume there's no way to include <= or >= operators in any manner within the code I'm using. Thank you :) – Alejandro Carrera Feb 24 '18 at 06:35
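
The direct approach from the comments can be sketched as follows. This assumes a fresh copy of the built-in mtcars; the names `small4` and `by_hp` are only for illustration. It filters first and splits afterwards, with `drop = TRUE` so that empty groups are not created.

```r
data(mtcars)

# Filter with ordinary logical indexing: 4 cylinders and at most 100 hp
small4 <- mtcars[mtcars$cyl == 4 & mtcars$hp <= 100, ]
nrow(small4)   # 9

# Optionally split the result into "cyl.hp" groups, dropping empty combinations
by_hp <- split(small4, list(small4$cyl, small4$hp), drop = TRUE)
names(by_hp)
```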

1 Answer


I have absolutely no idea why you would want to do this, but you can use `eval` and `parse` to write your own operator that does it.

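`%$%` <- function(df, seeker, cols=c("cyl","hp")) {
    # Split the pattern on "." into one part per column, e.g. "4.<=100" -> "4", "<=100"
    strparts <- strsplit(seeker, "\\.")[[1]]
    # A part that starts with a digit has no operator, so treat it as an equality test
    sk <- ifelse(grepl("^[0-9]", strparts), paste0("==", strparts), strparts)
    # Build one condition string, e.g. "df$cyl==4&df$hp<=100"
    s <- paste(paste0("df$", cols, sk), collapse="&")
    # Evaluate the condition string to get a logical row filter
    ans <- df[eval(parse(text=s)),]
    # Group the matching rows by the key columns
    split(ans, ans[,cols])
}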

sample usage 1:

mtcars %$% "8.245"
# $`8.245`
#        mpg cyl disp  hp drat   wt  qsec vs am gear carb
# Duster 360 14.3   8  360 245 3.21 3.57 15.84  0  0    3    4
# Camaro Z28 13.3   8  350 245 3.73 3.84 15.41  0  0    3    4

sample usage 2:

mtcars %$% "4.<=100"
# $`4.52`
#              mpg cyl disp hp drat    wt  qsec vs am gear carb
# Honda Civic 30.4   4 75.7 52 4.93 1.615 18.52  1  1    4    2
# 
# $`4.62`
#            mpg cyl  disp hp drat   wt qsec vs am gear carb
# Merc 240D 24.4   4 146.7 62 3.69 3.19   20  1  0    4    2
# 
# $`4.65`
#                 mpg cyl disp hp drat    wt qsec vs am gear carb
# Toyota Corolla 33.9   4 71.1 65 4.22 1.835 19.9  1  1    4    1
# 
# $`4.66`
#            mpg cyl disp hp drat    wt  qsec vs am gear carb
# Fiat 128  32.4   4 78.7 66 4.08 2.200 19.47  1  1    4    1
# Fiat X1-9 27.3   4 79.0 66 4.08 1.935 18.90  1  1    4    1
# 
# $`4.91`
#               mpg cyl  disp hp drat   wt qsec vs am gear carb
# Porsche 914-2  26   4 120.3 91 4.43 2.14 16.7  0  1    5    2
# 
# $`4.93`
#             mpg cyl disp hp drat   wt  qsec vs am gear carb
# Datsun 710 22.8   4  108 93 3.85 2.32 18.61  1  1    4    1
# 
# $`4.95`
#           mpg cyl  disp hp drat   wt qsec vs am gear carb
# Merc 230 22.8   4 140.8 95 3.92 3.15 22.9  1  0    4    2
# 
# $`4.97`
#                mpg cyl  disp hp drat    wt  qsec vs am gear carb
# Toyota Corona 21.5   4 120.1 97  3.7 2.465 20.01  1  0    3    1
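
The same pattern syntax can also be handled without `eval(parse(text = ...))`, which runs whatever code the string happens to contain. The following is only a sketch under stated assumptions: `seek` is a hypothetical helper name, it accepts whole-number values only, and it returns the filtered rows as one data frame instead of a split list.

```r
# Hypothetical helper (not part of the answer above): same "cyl.hp" pattern,
# but each part is matched against a fixed set of comparison operators
seek <- function(df, pattern, cols = c("cyl", "hp")) {
  parts <- strsplit(pattern, ".", fixed = TRUE)[[1]]
  stopifnot(length(parts) == length(cols))
  keep <- rep(TRUE, nrow(df))
  for (i in seq_along(cols)) {
    # m is c(full match, operator or "", number); empty if the part is malformed
    m <- regmatches(parts[i], regexec("^(==|<=|>=|<|>)?([0-9]+)$", parts[i]))[[1]]
    stopifnot(length(m) == 3)
    # A bare number means equality
    op <- if (nzchar(m[2])) m[2] else "=="
    # match.fun() looks up the comparison function by name, e.g. `<=`
    keep <- keep & match.fun(op)(df[[cols[i]]], as.numeric(m[3]))
  }
  df[keep, , drop = FALSE]
}

data(mtcars)
nrow(seek(mtcars, "8.245"))     # 2
nrow(seek(mtcars, "4.<=100"))   # 9
```

Because the operator is taken from a fixed list, a malformed pattern stops with an error instead of being evaluated as code.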
chinsoon12
  • I'm reading Norman Matloff's "The Art of R Programming" and I'm just trying to go a little further into topics such as lists, classes, etc. Just for research and learning. Thank you for your awesome explanation. – Alejandro Carrera Feb 26 '18 at 18:39
  • Your code is quite complex and interesting, so I'm reading about all the functions involved, but I can't understand how the "seeker" argument gets its value if it is never assigned inside or outside the function. – Alejandro Carrera Feb 26 '18 at 19:21
  • A binary operator can be written between its arguments, e.g. `+`(2,3) == 2 + 3 (put backticks around +; I can't get backticks to show in a comment here). The same goes for `%$%` – chinsoon12 Feb 27 '18 at 00:29
  • So basically, the argument "seeker" points to the binary operator %$%? – Alejandro Carrera Feb 27 '18 at 00:33
  • `%$%` is a function with 2 arguments, you can call it using either `%$%`(df, seekerString) or df `%$%` seekerString – chinsoon12 Feb 27 '18 at 00:34
  • Now it seems pretty clear to me. Could you please suggest a reference for learning more about binary operators? Thank you again! – Alejandro Carrera Feb 27 '18 at 00:36
  • haha good question: maybe start here? https://stackoverflow.com/questions/25179457/r-what-are-operators-like-in-called-and-how-can-i-learn-about-them – chinsoon12 Feb 27 '18 at 00:41
  • Thank you again. Norman Matloff's book is quite brief on binary operators, although they're very useful for building things on your own. I appreciate your time and patience. – Alejandro Carrera Feb 27 '18 at 00:46
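
The point made in the comments about binary operators can be sketched with a toy example. `%+%` here is a hypothetical operator defined only for illustration. Any function whose name has the form `%something%` can be called either between its two arguments or like an ordinary function, and the left and right operands become its first and second arguments.

```r
# Hypothetical operator, defined only for illustration
`%+%` <- function(lhs, rhs) paste(lhs, rhs)

"a" %+% "b"        # infix form: lhs = "a", rhs = "b"
`%+%`("a", "b")    # prefix form, the same call

# Built-in operators work the same way
`+`(2, 3) == 2 + 3   # TRUE
```

This is how `seeker` gets its value in `mtcars %$% "8.245"`: the string on the right-hand side is passed as the second argument.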