Skip to Main Content

Cleaning Spreadsheet Data with OpenRefine

This guide accompanies the Galter Health Sciences Library class of the same name, or can be used on its own to learn the basic functions of OpenRefine. The class and guide are adapted from Library Carpentry OpenRefine, Copyright 2016-2019

Writing Transformations and GREL

Although common transformations are built in to OpenRefine, you can write a wide variety of different transformations using GREL, or General Refine Expression Language. Transformations can range from simple to very complex. It is not necessary to memorize them as long as you have access to a sample of GREL 'recipe books.' Links can be found at the bottom of this page.

Writing a Simple Transformation

Below is the GREL recipe version of the Title case transformation (changing the text of a value so that each word starts with a capital letter)

Screenshot showing Titlecase transformation in OpenRefine

  • From the top of a column choose Edit Cells, then Transform. A dialog box like that seen above will pop up.
  • In the expression box type value.toTitlecase()  The expressions are case-sensitive, and the word "value" should appear only once.
  • Notice that in the Preview window, you can see the effect that your transformation will have. If it looks correct, click OK.

Writing a Complex Transformation

In the Product_1 field of the sample dataset, we see product numbers with the prefix "NEISS" before every entry. To work with those numbers, we might want to strip away the prefix first.

  • From the top of the Product_1 column, choose Edit Cells, then Transform
  • In the dialog box, within the Expression box, type the function seen below: value.partition(smartSplit(value,"-")[1])[1]

Screenshot showing a GREL formula for splitting in OpenRefine

As the preview box shows, this split transformation works by using the hyphen in the value as the separator on which to split, and strips away everything before that separator, plus the separator itself. It is not necessary to memorize such transformations - users can search for specific transformation recipes online, or look through some of the recipe collections linked below.


OpenRefine Recipes

GREL Functions