I'm looking for help creating subsets with the %like% operator in R.

I have a table called 'prueba1', which contains this information:

scenario_name       | land_consumption | land_consumption_pct
Contención al 30%   |           692.00 |      11.081468525813
Contención al 50%   |           221.23 |       3.542703786613
Contención al 70%   |            94.98 |       1.520975451494
Contención al 95%   |            69.29 |       1.109583760966

There are more rows. The scenario names share a pattern: each contains a percentage value such as '30%' or '50%'.

I want to create a subset for each percentage value, and I tried to do it with this code:

for (i in 1:33){
  if (prueba1$scenario_name %like% '%30%'){
    esc_30[[i]]<-prueba1$scenario_name[[i]]
  }
} 

The result is an object with no data. I built this with a friend, and we are both new to R. As you can see, we need help first with using the %like% operator correctly, and then with writing a loop that creates a subset for each percentage value.

You can help us with specific links or help with the code directly.

Brian Tompsett - 汤莱恩
  • I'm adding the data.table tag since that's the only place I've seen %like%. Feel free to revert the edit and clarify if using something else. – Frank Oct 12 '18 at 19:17
  • Related, possible duplicate of: https://stackoverflow.com/questions/35822899/how-to-str-extract-percentages-in-r – zx8754 Oct 12 '18 at 19:19
  • `like` uses a pattern passed to `grepl`, so when you use a pattern of `'%30%'`, the percent sign is an unescaped delimiter. Try changing it to `'30\\%'`. The `like` is also vectorized, so the first element determines if the `if` evaluates to `TRUE` or `FALSE`. Perhaps what you want to do is `esc_30[prueba1$scenario_name %like% '30\\%'] <- prueba1$scenario_name[prueba1$scenario_name %like% '30\\%']` – Kerry Jackson Oct 12 '18 at 19:26
  • Sounds like you just want `split(pruebas, pruebas$scenario_name)`, or do you have different strings that contain `30%` that you want grouped together? – Gregor Thomas Oct 12 '18 at 19:34
  • @KerryJackson I used your code, and it creates a list with the total number of rows, but only the rows with '30\\%' have data. Instead, I ran a modification of your code, `esc_30<- prueba1$scenario_name[prueba1$scenario_name %like% '30\\%']`, which gives me a list with just the rows that contain '30\\%' (no empty rows). I want that, but with the rest of the columns. – Judá García Oct 12 '18 at 20:54
  • @Gregor yeah, that's what I want – Judá García Oct 12 '18 at 20:57

2 Answers

You're probably thinking of the SQL LIKE operator, where x LIKE '%foo%' means any values that contain 'foo' in any position.

The equivalent for the data.table %like% would be x %like% ".*foo.*". This is because %like% works with regular expressions. In a regular expression, .* means "any character, repeated zero or more times". Because a regular expression is searched for anywhere in the string, plain x %like% "foo" matches the same values.

In R, see ?regex for how R handles regular expressions.
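
As a minimal sketch of the difference, the snippet below uses base R's grepl() rather than %like% so that it runs without data.table; the second answer on this page describes %like% as a wrapper around grepl(). The scenario_name vector is an illustrative copy of the question's column, not the real data.

```r
# Illustrative vector modelled on the question's scenario_name column
scenario_name <- c("Contención al 30%", "Contención al 50%",
                   "Contención al 70%", "Contención al 95%")

# In a regular expression "%" is an ordinary character, not a wildcard,
# so this looks for the literal text "%30%" and matches nothing here.
sql_style <- grepl("%30%", scenario_name)

# A regular expression is searched for anywhere in the string,
# so no wildcards are needed around the text of interest.
plain <- grepl("30%", scenario_name)

# The explicit ".*" form from this answer selects the same elements.
explicit <- grepl(".*30%.*", scenario_name)

print(scenario_name[plain])
```

With data.table loaded, scenario_name %like% "30%" should behave like the plain call above.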

Hong Ooi

If you want to avoid regular expressions, use the fixed argument of grepl. The %like% operator in data.table is a wrapper for grepl.

So, you could try something like:

esc30 <- prueba1$scenario_name[grepl("30%", prueba1$scenario_name, fixed = TRUE)]

If you want to get all the columns:

esc30 <- prueba1[grepl("30%", prueba1$scenario_name, fixed = TRUE), ]

However, if you want to exclude items that contain "30%" in the middle of the text, you will need regular expressions.
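
One possible way to build a subset for every percentage at once, without a loop, is to extract the trailing percentage with an anchored regular expression and pass it to split(). This is only a sketch: the data frame is an illustrative stand-in for prueba1, the two "Expansión" rows and their values are made up so that a group holds more than one row, and it assumes every scenario_name ends in a percentage.

```r
# Illustrative stand-in for prueba1. The last two rows are hypothetical,
# added only so that some groups contain more than one row.
prueba1 <- data.frame(
  scenario_name = c("Contención al 30%", "Contención al 50%",
                    "Contención al 70%", "Contención al 95%",
                    "Expansión al 30%", "Expansión al 50%"),
  land_consumption = c(692.00, 221.23, 94.98, 69.29, 700.00, 230.00),
  stringsAsFactors = FALSE
)

# "[0-9]+%$" matches a run of digits followed by "%" only at the end of
# the string, so a percentage in the middle of a name is ignored.
# Assumption: every name matches; regmatches() drops names that do not.
pct <- regmatches(prueba1$scenario_name,
                  regexpr("[0-9]+%$", prueba1$scenario_name))

# One data frame per percentage value, with all columns kept.
subsets <- split(prueba1, pct)

# Individual subsets are then available by name.
esc_30 <- subsets[["30%"]]
print(esc_30)
```

If some names might lack a percentage, check length(pct) == nrow(prueba1) before calling split().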