3

My data includes multiple columns that--for my purposes--are the same. In these places, I need to combine the values in multiple selected columns into a single column. For example, combine columns names1, names2, and names3 into a single column names.

Following guidance here, I'm attempting to create a new column which includes values from multiple columns, as shown in the images below:

Column options

Add column based on column X dialog box

I believe that my GREL here should combine the values in dc.contributor.author (simply value as this is the column from which I have selected Edit column > Add column based on this column), dc.contributor.authorEN_us (cells["dc.contributor.authorEN_us"].value), dc.contributor.author1 (cells["dc.contributor.author1"].value), and dc.contributor.authoren_US (cells["dc.contributor.authoren_US"].value).

However, my new column does not include values from these second, third, or fourth columns, even though I know that values exist.

Is there an error in my GREL syntax? Should I be using a different method to merge columns?

Thanks in advance for any assistance.

2 Answers2

5

I'm afraid the tutorial you've been looking at is both outdated and incomplete. The problem is that you have the value null in some columns. In OpenRefine, concatenation (i.e. joining two strings by a +) returns null if one of the values is null.

So:

"Hello" + " " + "World" gives "Hello World".

but

"Hello" + null + " " + "World" returns null.

There have been endless discussions about the problem. Until recently, I would have advised you to use:

either:

coalesce(value,cells['dc.contributor.author1'].value, cells['dc.contributor.authorEN_us'].value, cells['dc.contributor.authoren_US'].value)

(the coalesce function, introduced in Openrefine 3, returns the first non-null value in a series)

or:

value.toString() + cells['dc.contributor.authorEN_us'].value.toString() + cells['dc.contributor.authoren_US'].value.toString()

(Turning null into empty strings '')

But in the future OpenRefine 3.3 (now in bêta), thanks to @mathieu-saby, OpenRefine will have a menu allowing you to merge columns easily. You'll just have to update your version. :)

Screenshot of the Column Menu, with the submenu "Edit column" opened and the entry "Join columns..." highlighted.

Ettore Rizza
  • 2,800
  • 2
  • 11
  • 23
  • Thanks for this. Interestingly, My installed version is 3.2, yet I don't see the "Join columns" option. For the time being, I'll try using `toString()` as in your second example and see what results I can get. – BRiesenberg Jan 23 '20 at 17:47
  • 1
    Oops, you are right: this feature was introduced in 3.3, sorry : https://github.com/OpenRefine/OpenRefine/wiki/Changes-for-3.3 I'll edit my answer. – Ettore Rizza Jan 23 '20 at 17:55
0

I believe you are currently only getting the first value because you start the GREL with just "value". You would need to use the same syntax you use with the other columns you want to join. So your expression should look like this:

cells["dc.contributor.author"].value + cells["dc.contributor.authorEN_us"].value 
  + cells["dc.contributor.author1"].value + cells["dc.contributor.authoren_US"].value
showdev
  • 28,454
  • 37
  • 55
  • 73