2

I have a table in OpenRefine with columns A, B, and C like this:

A | B | C | D ---|---|---|--- a | 1 | b | 2 b | 2 | | c | 3 | a | 1 d | 4 | c | 3 I want to create a column D by fetching the values from B corresponding to those in C, using A as an index. Hope that makes sense? I'm not having much luck figuring out how to do this in GREL.

Will Hanley
  • 457
  • 3
  • 16

1 Answers1

4

You can use the 'cross' function to look up values across the project. Cross is usually used to look up values in a different OpenRefine project/file, but actually it works the same if you point it back at the same project you are already in.

So - from Col C, you can use "Add new column based on this column" with the GREL:

cell.cross("Your project name","Col A")

You'll get back an array of 'rows' - and if the same value appears in Column A multiple times you could get multiple rows.

To extract a value from the array you can use something like:

forEach(cell.cross("Your project name","Col A"),r,r.cells["Col B"].value).join("|")

The final 'join' is necessary to convert the array into a string which is required to be able to store the result (arrays can't be stored directly)

Owen Stephens
  • 1,550
  • 1
  • 8
  • 10
  • Thanks @owen-stephens, this is just what I needed. One minor edit: it should be `cross(cells,"Your project name","Col A")`. I used the second formula you gave, and cleaned up the results with `substring(value,0,37)`, because my 37-character result was repeated many times. – Will Hanley Oct 30 '17 at 20:41
  • Great. Generally I’d recommend using .unique() to remove duplicates from an array- you can insert it before the join() statement. But if it works it works :) – Owen Stephens Oct 30 '17 at 21:45
  • @OwenStephens Very smart solution ! – Ettore Rizza Oct 31 '17 at 10:08
  • Thanks both - updated answer to include the missing 'cell' reference noted by @WillHanley – Owen Stephens Oct 31 '17 at 10:50