I have a large data set of biological journals that was compiled over a long time by different people, so the data are not in a single format. For example, in the column "AUTHOR" I can find John Smith, Smith John, Smith J and so…
I have a transposition that I'd like to apply to multiple columns. The generated GREL shows the columnName or base name, but that means I have to edit the code for each column. I thought there was a way to find the column index and have code that…
OpenRefine (http://openrefine.org/) allows URL generation using GREL as tokens. I want to connect to an API which only supports the POST method. Can I format the URL so it calls the REST API using POST?
Ref:…
I am unable to replace null values in cells. I created a facet to display only cells that have null values. I then went to Edit cells > Transform and tried to use the replace function, but it does not seem to work.
Different…
I am trying to extract a sequence of numbers from a column in Google Refine. Here is my code for doing it:
value.match(/[\d]+/)[0]
The data in my column is in the format of
abcababcabc 1234566 abcabcbacdf
The result is "null". I have no idea…
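For comparison, here is a minimal Python sketch of pulling the first run of digits out of a value shaped like the sample above. It relies on Python's `re.search`, which looks for a match anywhere in the string; as far as I know, GREL's `match()` has to match the whole value, so this is not a drop-in translation. The function name `first_number` is made up for illustration.

```python
import re

def first_number(value):
    """Return the first run of digits in value, or None if there is none."""
    found = re.search(r"\d+", value)  # search anywhere, not only at the start
    return found.group(0) if found else None

print(first_number("abcababcabc 1234566 abcabcbacdf"))
```

The result is a string; wrap it in `int()` if a number is needed.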
I'm trying to create a new column which contains true or false. Basically, column A has a number in it between 1 and 6; if it's higher than 3 I want the new column 'match' to contain true, otherwise false. Using the add column based on…
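The rule described above can be sketched in Python as a plain comparison. This assumes the cell value is (or can be converted to) an integer; `is_match` and the sample list `column_a` are hypothetical names used only for this example.

```python
def is_match(value):
    """True when the number from column A is higher than 3, else False."""
    return int(value) > 3  # int() also accepts numeric strings such as "4"

# Sample values standing in for column A (numbers between 1 and 6).
column_a = [1, 2, 3, 4, 5, 6]
match = [is_match(v) for v in column_a]
print(match)
```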
I have been cleaning a table in OpenRefine. I now have it like this:
REF               Handle    Size     Price
2002, 2003        t-shirt1  M, L     23
3001, 3002, 3003  t-shirt2  S, M, L  24
I need to split those multivalued…
I know the question is asked already but somehow I can't find any convincing solution after googling for about an hour.
I am using apache-jena to load an RDF model from a URL, and I am getting an IncompatibleClassChangeError with the following message
Class…
I'm using OpenRefine to clean up an Excel data set. I have about 70 operations and I've been cutting and pasting on different data sets. I maintain a record id and export to a new Excel sheet. Then I reload the sheet using the record id.
It works…
How to use the opencorp API?
For instance
According to the website:
The Open Refine Reconciliation API allows OpenRefine users to match company names to legal corporate entities. This is especially useful when you have an existing spreadsheet or…
There is not much to add to the title. It's what I'm trying to do. Any suggestions?
I reviewed the docs on GitHub and googled extensively.
The best I got is:
value.parseHtml().select('p[contains('xyz')]')
It results in a syntax error.
I'd like to search and replace multiple values in a column with a single function, using GREL (or anything else) in Google Refine.
For example:
1. replace(value, "Buch", "bibo:Book")
2. replace(value, "Zeitschrift", "bibo:Journal")
3. replace(value,…
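One way to fold several replacements into a single function, sketched in Python rather than GREL, is to loop over a lookup table. The table below holds only the two pairs visible above; the third replacement is cut off in the question, so it is deliberately left out. `REPLACEMENTS` and `replace_all` are names invented for this sketch.

```python
# Only the two replacements visible in the question; the third is elided
# there and is not guessed here.
REPLACEMENTS = {
    "Buch": "bibo:Book",
    "Zeitschrift": "bibo:Journal",
}

def replace_all(value, replacements=REPLACEMENTS):
    """Apply every old -> new substitution to value, one after another."""
    for old, new in replacements.items():
        value = value.replace(old, new)
    return value

print(replace_all("Zeitschrift"))
```

Because the substitutions run in sequence, a replacement text that itself contains one of the search terms would be rewritten again; with the two pairs shown that does not happen.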
I've got some JSON within Google Refine - http://mapit.mysociety.org/point/4326/0.1293497,51.5464828 for the full version, but abbreviated it's like this:
{1234: {'name': 'Barking', 'type': 'WMC'},
5678: {'name': 'England', 'type': 'EUR'} }
I only…
I have a data set with 30 columns and multiple rows (some cells have no data). I would like to be able to facet the columns in groups.
      1  2  3  4...
Row1  A  B  C  D
Row2  E  A  D  F
Row3  Q  A  B  H
Given the above data I would like the facet to return…
I have a large corpus of text data that I'm pre-processing with OpenRefine for document classification with MALLET.
Some of the cells are long (>150,000 characters) and I'm trying to split them into <1,000 word/token segments.
I'm able to split…
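As a rough illustration of the segmenting goal described above, here is a Python sketch that cuts a long text into pieces of fewer than 1,000 tokens. It assumes a token is anything separated by whitespace, which may not agree with how MALLET tokenizes; `split_into_segments` and its `max_tokens` parameter are hypothetical.

```python
def split_into_segments(text, max_tokens=999):
    """Split text into segments of at most max_tokens whitespace tokens."""
    tokens = text.split()  # assumption: whitespace-delimited tokens
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]

segments = split_into_segments("word " * 2500)
print(len(segments))
```

Joining with a single space discards the original line breaks and spacing, which is usually acceptable for bag-of-words classification but worth knowing about.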