Skip to Main Content

Data Cleaning with Excel

Through this guide you will learn tricks and tips to better clean and organize tabular data using Microsoft Excel.

Simple Trims, Sorts, and Filters

Screen Clipping showing a column with the Excel TRIM formula, which will trim the extra white space from the characters that have been entered in the column to the left.

 

Trim Instructions:

  • Create a new column next to the one where whitespace needs to be trimmed
  • Either for the whole range, or cell-by-cell, enter =TRIM() with the cells you'd like to trim inside the parentheses

Screen Clipping showing the results of the Excel TRIM function for the cell selected in the previous screen clipping. An extra space was removed between the words pain and hip.

Trim will not fix space-related spelling errors. It will only look for instances of two spaces next to each other, or leading or trailing whitespace on strings of characters.

In smaller datasets, simple sorts will quickly show data entry errors in data points consisting of only a few words. Simple sorts will also bring entries beginning with unusual characters to the top.

Screen Clipping showing the result of a simple alphabetical sort of the spreadsheet on the Other Diagnosis column.

 

Simple Sort Instructions:

  • Select All by clicking the triangle in the upper left corner of the screen
  • Choose the Data Tab in the Ribbon, then Sort
  • In the dialog box that pops up, click "My data has headers"
  • From the "Sort By" drop-down menu, choose the column on which you'd like to sort the spreadsheet
  • Click OK

While a sort rearranges all the data in your spreadsheet based on the order imposed from your chosen column, a filter uses the criteria you provide to filter out all rows from the spreadsheet that don't meet the criteria. The rows remain hidden, and will re-appear once the filter is removed. Filters can be applied temporarily on one or various columns in order to hone in on specific slices of your data. If you create a slice using filters that you'd like to examine separately from the rest of the data, you can copy the filtered slice to another worksheet or to a new spreadsheet.

Simple filters are helpful for finding and fixing data entry errors and typos, making quick changes to subsets of your data, and spot-checking data.

Simple Filter Instructions:

  • In the Ribbon, go to the Data Tab and select "Filter"
  • Notice that a drop-down arrow has appeared at the top of each column

Screen clipping showing the funnel-shaped filter icon in the middle of the Data Tab on the Excel ribbon.

  • Click the drop-down arrow at the top of the column you'd like to filter by
  • In the dialog box that pops up, notice that each distinct value (string of characters) that has been entered in this column is a filter option
  • You can click the individual choices on and off, and you can also search for a string of characters
  • If all choices are selected by default, you can choose 'Unselect All'
  • After you have chosen filter terms and clicked OK, Excel will display only the rows with the terms you chose

Screen clipping showing the simple filter options on an Excel column. Each unique value in the column is available as an option for filtering, and either single or multiple values can be used for filtering.