Cleaning Spreadsheet Data with OpenRefine

This guide accompanies the Galter Health Sciences Library class of the same name and can also be used on its own to learn the basic functions of OpenRefine. The class and guide are adapted from Library Carpentry OpenRefine, Copyright 2016-2019

Quick Tips

Most operations in OpenRefine start from the menu that opens when you click the drop-down arrow at the top of a column. In general, you transform data one column at a time. However, the drop-down arrow next to 'All' on the far left of the screen lets you perform operations that affect all rows or all columns.

Screenshot showing the options from the OpenRefine All menu

One of the most helpful features in the All menu is found under Edit Columns and is called 'Reorder/remove columns.' Selecting it presents a list of all the columns in the spreadsheet, ordered from top to bottom in the same order that they appear from left to right on the screen. Drag a column toward the top of the list to move it farther left on the screen, which is useful because columns on the far right can be difficult to view.

Screenshot of the OpenRefine Reorder and Remove Columns menu
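
For readers who think more easily in code, the effect of 'Reorder/remove columns' can be sketched in a few lines of Python. This is only an illustration of the idea, not how OpenRefine works internally: the function name reorder_columns and the sample column names and values are hypothetical. Columns named in new_order are kept, in that order from left to right, and any column left out is removed.

```python
def reorder_columns(header, rows, new_order):
    """Keep only the columns named in new_order, arranged in that order.

    header: list of column names, left to right
    rows: list of rows, each a list of cell values aligned with header
    new_order: column names to keep, in the desired left-to-right order
    """
    # Map each column name to its current position.
    index = {name: i for i, name in enumerate(header)}

    # Fail early if a requested column does not exist.
    missing = [name for name in new_order if name not in index]
    if missing:
        raise KeyError(f"Unknown column(s): {missing}")

    # Pick the cells for each row in the new order; omitted columns are dropped.
    positions = [index[name] for name in new_order]
    new_rows = [[row[i] for i in positions] for row in rows]
    return list(new_order), new_rows


# Hypothetical sample data: move 'date' to the far left and drop 'notes'.
header = ["id", "title", "notes", "date"]
rows = [
    ["1", "Alpha", "n/a", "2019-01-01"],
    ["2", "Beta", "", "2019-02-01"],
]
new_header, new_rows = reorder_columns(header, rows, ["date", "title", "id"])
print(new_header)
print(new_rows)
```

As in the OpenRefine dialog, the position of a name in the list decides how far left the column appears, and leaving a name out removes that column from the result.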