Cleaning Spreadsheet Data with OpenRefine

This guide accompanies the Galter Health Sciences Library class of the same name, or can be used on its own to learn the basic functions of OpenRefine. The class and guide are adapted from Library Carpentry OpenRefine, Copyright 2016-2019

Undo/Redo Function in OpenRefine

The Undo/Redo tab, next to the Facet tab in OpenRefine, lists every change and transformation made to the dataset since the project was created. If a step in the data wrangling process should not have been done, click the step immediately above it in the list. The unwanted step, and every step below the one you clicked, will be undone. Undone steps appear greyed out and can be restored by clicking them again, but keep in mind that they are erased for good as soon as you make a new change to the data.

Screenshot showing Undo Redo Tab and JSON encoded operation history in OpenRefine

Just below the Undo/Redo tab are the Extract and Apply options. Clicking Extract displays JSON code representing the steps you took as you cleaned your spreadsheet. You can copy this code, save it in a plain text editor (like Notepad), and reuse it in other OpenRefine projects by clicking Apply and pasting it in. This is very helpful if you have a collection of similar data files that all require the same cleaning steps. Because the steps refer to columns by name, the other files need to use the same column names for the steps to work.
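
Before reusing a saved history, it can help to check which steps it contains. The Python sketch below is one possible way to do that outside of OpenRefine. It assumes the saved text is a JSON list in which each step has an "op" field and a "description" field. The sample history, the column name "Title", and the function name summarize_history are made up for illustration, and the exact fields in your own export may differ between OpenRefine versions.

```python
import json

# Illustrative operation history in the general shape shown by Extract.
# The two steps and the column name "Title" are hypothetical examples.
SAMPLE_HISTORY = """
[
  {
    "op": "core/text-transform",
    "engineConfig": {"facets": [], "mode": "row-based"},
    "columnName": "Title",
    "expression": "value.trim()",
    "onError": "keep-original",
    "repeat": false,
    "repeatCount": 10,
    "description": "Text transform on cells in column Title using expression value.trim()"
  },
  {
    "op": "core/column-rename",
    "oldColumnName": "Title",
    "newColumnName": "Article Title",
    "description": "Rename column Title to Article Title"
  }
]
"""


def summarize_history(history_json):
    """Return one (step number, operation id, description) tuple per step."""
    operations = json.loads(history_json)
    if not isinstance(operations, list):
        raise ValueError("Expected a JSON list of operations")
    return [
        (number, step.get("op", "unknown"), step.get("description", ""))
        for number, step in enumerate(operations, start=1)
    ]


if __name__ == "__main__":
    # Print a numbered summary of the steps in the saved history.
    for number, op, description in summarize_history(SAMPLE_HISTORY):
        print(f"{number}. {op}: {description}")
```

To check your own file, replace SAMPLE_HISTORY with the text you saved from Extract. If the text was not copied completely, json.loads will raise an error, which is a quick way to catch a bad paste before using Apply.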