Create subset with %like% operator

Question

I'm looking for some help creating subsets with the %like% operator in R.

I have a table called 'pruebas1', which contains this information:

scenario_name     | land_consumption | land_consumption_pct
Contención al 30% |           692.00 |      11.081468525813
Contención al 50% |           221.23 |       3.542703786613
Contención al 70% |            94.98 |       1.520975451494
Contención al 95% |            69.29 |       1.109583760966

There are more rows. They share a pattern: each name contains a percentage value such as '30%' or '50%'.

I want to create a subset for each percentage value, and I tried to do it with this code:

for (i in 1:33){
  if (prueba1$scenario_name %like% '%30%'){
    esc_30[[i]]<-prueba1$scenario_name[[i]]
  }
}

The result is an object with no data. A friend and I built this, and we are both new to R. As you can see, we need help first with using the %like% operator correctly, and then with writing a loop that creates a subset for each percentage value.

You can help us with specific links or help with the code directly.

I'm adding the data.table tag since that's the only place I've seen %like%. Feel free to revert the edit and clarify if using something else. — Frank, Oct 12 '18 at 19:17
Related, possible duplicate of: https://stackoverflow.com/questions/35822899/how-to-str-extract-percentages-in-r — zx8754, Oct 12 '18 at 19:19
`like` uses a pattern passed to `grepl`, so when you use a pattern of `'%30%'`, the percent sign is an unescaped delimiter. Try changing it to `'30\\%'`. The `like` is also vectorized, so the first element determines if the `if` evaluates to `TRUE` or `FALSE`. Perhaps what you want to do is `esc_30[prueba1$scenario_name %like% '30\\%'] <- prueba1$scenario_name[prueba1$scenario_name %like% '30\\%']` — Kerry Jackson, Oct 12 '18 at 19:26
Sounds like you just want `split(pruebas, pruebas$scenario_name)`, or do you have different strings that contain `30%` that you want grouped together? — Gregor Thomas, Oct 12 '18 at 19:34
@KerryJackson I used your code and it creates a list with the total number of rows, but only the rows with '30\\%' have data. Instead, I ran a modification of your code, `esc_30<- prueba1$scenario_name[prueba1$scenario_name %like% '30\\%']`, and that gives me a list with just the rows that contain '30\\%' (no empty rows). I want that, but with the rest of the columns — Judá García, Oct 12 '18 at 20:54

score 2 · Answer 1 · answered Oct 12 '18 at 19:47

You're probably thinking of the SQL LIKE operator, where x LIKE '%foo%' means any values that contain 'foo' in any position.

The equivalent for the data.table %like% would be x %like% ".*foo.*", or simply x %like% "foo", since a regular expression matches anywhere in the string unless it is anchored. This is because %like% works with regular expressions. In a regular expression, the string .* means "any character repeated 0, 1 or multiple times".

In R, see ?regex for how R handles regular expressions.
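As a minimal sketch of the difference, assuming the data.table package is installed. The vector below is made up to resemble the question's scenario_name column, and the names sql_style, regex_style and matched are only illustrative:

```r
library(data.table)

# Made-up values modeled on the question's scenario_name column
scenario_name <- c("Contención al 30%", "Contención al 50%",
                   "Contención al 70%", "Contención al 95%")

# SQL-style wildcards are not special in a regular expression:
# "%30%" looks for the literal text "%30%", which never occurs here
sql_style <- scenario_name %like% "%30%"

# "%" is an ordinary character in a regex, so "30%" is found anywhere
regex_style <- scenario_name %like% "30%"

# %like% returns one TRUE/FALSE per element, so it is used for
# indexing rather than as the condition of an if()
matched <- scenario_name[regex_style]
```

Here sql_style is FALSE for every element, which would explain the empty result in the question, while matched keeps only the 30% scenario.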

Vladimir Volokhonsky · Answer 2 · 2018-10-15T15:01:09.777

If you want to avoid regular expressions, use the "fixed" argument of grepl. The %like% operator in data.table is a wrapper around grepl.

So, you could try something like:

esc30 <- prueba1$scenario_name[grepl("30%", prueba1$scenario_name, fixed = TRUE)]

If you want to get all the columns:

esc30 <- prueba1[grepl("30%", prueba1$scenario_name, fixed = TRUE), ]

However, if you want to exclude items where "30%" appears in the middle of the text (for example, to match it only at the end), you will need regular expressions.
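A rough sketch of that regular-expression route in base R, which also covers the "one subset per percentage" part of the question without a loop. The data frame is made up to resemble the question's table; the last row and its value are hypothetical, added only to show a "30%" that is not at the end of the text. The sketch assumes every scenario name contains a percentage:

```r
# Made-up data modeled on the question's table; the last row is hypothetical
prueba1 <- data.frame(
  scenario_name = c("Contención al 30%", "Contención al 50%",
                    "Contención al 70%", "Contención al 95%",
                    "30% de contención"),
  land_consumption = c(692.00, 221.23, 94.98, 69.29, 100.00),
  stringsAsFactors = FALSE
)

# "$" anchors the match to the end of the string, so the hypothetical
# last row (where "30%" comes first) is not selected
esc30 <- prueba1[grepl("30%$", prueba1$scenario_name), ]

# Extract the first "<digits>%" from each name. regmatches() drops
# non-matching elements, hence the assumption that every name has one.
pct <- regmatches(prueba1$scenario_name,
                  regexpr("[0-9]+%", prueba1$scenario_name))

# One data frame per percentage value, with all columns kept
by_pct <- split(prueba1, pct)
```

Each subset is then available by name, for example by_pct[["30%"]], which in this made-up data also includes the hypothetical last row because the extraction is not anchored.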
