Essential Spreadsheet Data Cleaning with OpenRefine

This guide accompanies the Galter Health Sciences Library class of the same name. It can also be used on its own to learn a few essential data cleaning functions of the open-source application OpenRefine.

Install the RDF Extension

Steps to Install the RDF Extension

Screenshot showing the Extensions page on the OpenRefine website

  • On the RDF extension GitHub page, download the latest release of the extension: RDF Extension v1.4.0
  • Open the folder on your machine where you stored the downloaded OpenRefine files
  • Inside it, navigate to openrefine-3.7.7/webapp/extensions (the version number in the folder name will match the OpenRefine version you downloaded)
  • In the extensions folder, create a new folder named rdf-extension

Screenshot showing the subfolder that must be created for the RDF extension

  • Find the RDF extension zip file you downloaded and choose Extract All. When asked for a destination folder, browse to the new rdf-extension folder you created inside the OpenRefine extensions folder, and extract the files there.

Screenshot showing the location of the downloaded RDF extension

  • Close OpenRefine and restart it so that the new extension loads.
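
If you prefer to script the folder creation and extraction rather than doing them by hand, the steps above can be sketched in Python roughly as follows. This is a minimal sketch, not part of the class materials: the function name install_extension and the example paths in the comments are hypothetical, and it assumes the zip file has already been downloaded and that your OpenRefine folder contains webapp/extensions as described above.

```python
import zipfile
from pathlib import Path


def install_extension(zip_path, openrefine_dir, folder_name="rdf-extension"):
    """Extract an extension zip into <openrefine_dir>/webapp/extensions/<folder_name>.

    Hypothetical helper that mirrors the manual steps; returns the target folder.
    """
    extensions_dir = Path(openrefine_dir) / "webapp" / "extensions"
    if not extensions_dir.is_dir():
        # Stop early if this does not look like an OpenRefine folder.
        raise FileNotFoundError(f"No extensions folder at {extensions_dir}")

    # Create the rdf-extension subfolder (no error if it already exists).
    target = extensions_dir / folder_name
    target.mkdir(exist_ok=True)

    # Equivalent of "Extract All" with the new folder as the destination.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


# Example call (both paths are assumptions; adjust them to your machine):
# install_extension("Downloads/rdf-extension.zip", "openrefine-3.7.7")
```

As with the manual steps, OpenRefine still needs to be closed and restarted afterwards so that it picks up the extension.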