Essential Spreadsheet Data Cleaning with OpenRefine

This guide accompanies the Galter Health Sciences Library class of the same name. It can also be used on its own to learn a few essential data cleaning functions of the open source application OpenRefine.

Removing Duplicate and Blank Rows: a Three Part Exercise

Duplicate or blank rows of data can cause errors when spreadsheets are fed into analysis programs. The next three sections show how to use OpenRefine to identify duplicate rows, identify blank rows, and facet by blank in order to remove the blank rows.
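
To make concrete what counts as a duplicate row and what counts as a blank row, here is a minimal sketch in Python rather than OpenRefine. It is not part of the exercise and does not use OpenRefine's tools; the sample data, column names, and the `clean_rows` helper are all hypothetical. It assumes a row is a duplicate when every cell matches an earlier row after trimming whitespace, and blank when every cell is empty.

```python
import csv
import io

# Hypothetical sample data: one blank row and one exact duplicate row.
SAMPLE = """id,name,dept
1,Ada,Surgery
2,Ben,Pediatrics
,,
1,Ada,Surgery
3,Cy,Oncology
"""


def clean_rows(rows):
    """Split rows into (kept, duplicates, blanks).

    Assumption: cells are compared after trimming surrounding whitespace,
    and the first occurrence of a repeated row is the one that is kept.
    """
    seen = set()
    kept, duplicates, blanks = [], [], []
    for row in rows:
        key = tuple(cell.strip() for cell in row)
        if not any(key):
            # Every cell is empty (or the row has no cells at all).
            blanks.append(row)
        elif key in seen:
            # An identical row was already kept earlier.
            duplicates.append(row)
        else:
            seen.add(key)
            kept.append(row)
    return kept, duplicates, blanks


reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)  # set the header row aside before checking data rows
kept, duplicates, blanks = clean_rows(reader)

print("kept:", len(kept))
print("duplicates:", len(duplicates))
print("blanks:", len(blanks))
```

Reviewing the `duplicates` and `blanks` lists before discarding them mirrors the identify-first, remove-second order of the exercise.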

Last Updated: Oct 9, 2024 11:25 AM
URL: https://libguides.galter.northwestern.edu/c.php?g=1075027

Subjects: Data & Data Management

Northwestern University
Feinberg School of Medicine

Galter Health Sciences Library & Learning Center: 320 E. Superior Street, Chicago, IL 60611; 312-503-8126
