Essential Spreadsheet Data Cleaning with OpenRefine

This guide accompanies the Galter Health Sciences Library class of the same name, or can be used on its own to learn a few essential data cleaning functions of the open source application OpenRefine.

Open a New Project

Once OpenRefine is downloaded, click the blue diamond icon to launch it. A small Java window appears first; you can minimize it, but leave it running in the background. The program itself then loads in your default browser.
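If the browser tab does not open on its own, it can help to check whether the background process is accepting connections. The following is a minimal Python sketch, not part of OpenRefine itself. It assumes OpenRefine's default local address of 127.0.0.1 on port 3333, which may differ on your machine; the function names `openrefine_url` and `is_listening` are hypothetical helpers made up for this illustration.

```python
import socket

# Assumption: OpenRefine's default local address. Adjust if yours differs.
DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 3333


def openrefine_url(host=DEFAULT_HOST, port=DEFAULT_PORT):
    """Build the address to paste into the browser."""
    return f"http://{host}:{port}/"


def is_listening(host=DEFAULT_HOST, port=DEFAULT_PORT, timeout=1.0):
    """Return True if something accepts TCP connections at host:port.

    This only shows that a program is listening there; it does not
    confirm that the program is OpenRefine.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_listening()` while the Java window is running should return `True`; if it does, opening the address from `openrefine_url()` in your browser should bring up the program.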

Screenshot showing how to open an OpenRefine project from a file

You can open files stored on your computer in many different formats, including CSV, Excel, JSON, and XML. If you've downloaded the CSV file for these exercises from Google Drive, you can browse and open that file here.
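Before opening a CSV file in OpenRefine, you may want a quick look at its column headers and first few rows to confirm you have the right file. The sketch below is one optional way to do that with Python's standard library; it is not a step OpenRefine requires. The function name `preview_csv` and the `utf-8-sig` encoding default are assumptions made for this illustration.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, rows=5, encoding="utf-8-sig"):
    """Return (header, sample): the first row and up to `rows` rows after it.

    Assumption: the file is comma-delimited and its first row holds
    the column names.
    """
    with Path(path).open(newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # empty list if the file is empty
        sample = list(islice(reader, rows))  # stop after `rows` data rows
    return header, sample
```

For example, `header, sample = preview_csv("my-file.csv")` (with a placeholder file name) returns the column names as a list of strings and the sample rows as a list of lists.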

You can also enter the web addresses (URLs) of files. Once you have selected your files or entered their addresses, click Next.