Skip to Main Content

Data Cleaning with Excel

Through this guide you will learn tricks and tips to better clean and organize tabular data using Microsoft Excel.

COUNTIF Formula

The example scenario in this section of the guide is drawn from the class 10 Simple Rules for Data Cleaning with Excel, and utilizes a small subset of the National Electronic Injury Surveillance System dataset from 2022. This dataset compiles information from patient visits for emergency treatment for injuries involving commercial products. In the dataset's "Narrative" column, the patient's experience is recorded. In this column, the shorthand "C/O" is used in place of the words "complaining of". Let's imagine for this scenario that we would like to replace each instance of "C/O" with "complaining of" for greater clarity.

As a first step in this multi-part cleaning task, we'll find out how many rows in the spreadsheet (the small sample version used in the class) use the shorthand "C/O".

COUNTIF Instructions

  • Select a column (Narrative_1 in this example), then choose an empty cell beneath it
  • From the Formulas tab, choose "Insert Function" and search for "COUNTIF"
  • Enter a range of cells over which to count (this can also by done by clicking first in the Range box and then in the corresponding cells in the spreadsheet, but be careful of unintended autofills)
  • Enter the criteria that you would like Excel to count over (Excel will count a cell if the criteria is true). Use the format "*text*" to search for a string of characters no matter where it occurs in a cell. In our example, the criteria is "*c/o*"
  • A total for your count appears in the cell where you placed the formula

Screen image of application of the COUNTIF function in Excel

In our COUNTIF exercise on a small subset of the NEISS data, the total numbers of rows that use the "C/O" shorthand in the Narrative field is 634. In the next exercise, we'll use advanced filters to filter for only the rows that use this shorthand, so we can carry out data cleaning just on these rows.


References:

1. US Consumer Product Safety Commission. NEISS Highlights, Data and Query Builder. Available at: https://www.cpsc.gov/cgibin/NEISSQuery/home.aspx