3

I'm trying to remove the value of the 'artist' cell from the current cell (a song name), ignoring case. I know that replace() can take a regex as its argument (https://github.com/OpenRefine/OpenRefine/wiki/GREL-String-Functions#replacestring-s-string-f-string-r) and that I can use (?i) for case-insensitive mode.

But how does replace() know whether its argument is a regex or a plain string? All the examples I've seen use /.../ to denote a regex, but I need to build a "dynamic" regex by concatenating the value of the artist cell. So these don't work:

 value.replace('(?i)'+cells['artist'].value,"")
 value.replace('((?i)'+cells['artist'].value+')',"")
 value.replace('/(?i)'+cells['artist'].value+'/',"")

I'd prefer to do this with GREL, but a Python/Jython solution would also work. Thanks!

dvalexieva
  • I'm not sure how, but you need to find some way to construct a regex object, since the pattern has to include data from the row. What you are doing above is a plain string replacement, not a regex match and replace. – nhahtdh Dec 25 '19 at 10:42

2 Answers

1

If you can work with Python, then you should be able to do something like this:

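import re
# (?i) at the start of the pattern turns on case-insensitive matching
regex = re.compile('(?i)'+cells['artist'].value)
# replace every match in the current cell with the empty string
return regex.sub('', value)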

(I haven't checked if it actually works!)
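One caveat with building the pattern by concatenation: an artist name that contains regex metacharacters (parentheses, a plus sign, and so on) would be read as pattern syntax rather than as text. Below is a minimal sketch of a guarded variant using the standard library's re.escape. The function name remove_artist and the sample titles are hypothetical, chosen only for illustration; in OpenRefine's Jython editor, title would be value and artist would be cells['artist'].value.

```python
import re

def remove_artist(title, artist):
    """Remove every case-insensitive occurrence of `artist` from `title`.

    `remove_artist` is a hypothetical helper for illustration,
    not part of OpenRefine.
    """
    # re.escape turns metacharacters such as ")" or "+" into literals,
    # so the artist name is matched as plain text.
    pattern = re.compile(re.escape(artist), re.IGNORECASE)
    return pattern.sub('', title)

print(remove_artist("MICHAEL JACKSON - Thriller", "Michael Jackson"))
print(remove_artist("Sunn O))) - Aghartha", "sunn o)))"))
```

Without the escape step, the second call would fail at compile time because of the unbalanced parentheses in the name.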

pintoch
1

A more convenient way to turn on case-insensitive mode is to append the i flag after the regex literal: value.replace(/Michael Jackson/i, "") (I'm not sure this feature is documented).

But this won't work with a variable like cells.artist.value, and I don't know why. As pintoch said, the easiest way is to go through Python/Jython with a script like:

import re
regex = re.compile(cells['artist'].value, re.I)  # re.I: case-insensitive
return regex.sub('', value)
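Rows where the artist cell is blank are worth a thought too, since re.compile raises a TypeError when handed None. Here is a rough sketch of one way to handle that, assuming a blank cell arrives as None or an empty string (this is an assumption, not something confirmed above). The helper name strip_artist is hypothetical, and the final strip() is an optional extra that trims whitespace left behind by the removal.

```python
import re

def strip_artist(title, artist):
    # Hypothetical helper: `title` stands in for value and `artist`
    # for cells['artist'].value. A blank cell is assumed to be None or ''.
    if not artist:
        return title  # nothing to remove, leave the title untouched
    # Escape the name so it is matched literally, ignoring case.
    cleaned = re.sub(re.escape(artist), '', title, flags=re.I)
    # Optional: trim whitespace left at either end after the removal.
    return cleaned.strip()

print(strip_artist("Thriller michael jackson", "Michael Jackson"))
print(strip_artist("Thriller", None))
```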
Ettore Rizza