The example scenario in this section of the guide is drawn from the class 10 Simple Rules for Data Cleaning with Excel, and utilizes a small subset of the National Electronic Injury Surveillance System dataset from 2022. This dataset compiles information from patient visits for emergency treatment for injuries involving commercial products. In the dataset's "Narrative" column, the patient's experience is recorded. In this column, the shorthand "C/O" is used in place of the words "complaining of". Let's imagine for this scenario that we would like to replace each instance of "C/O" with "complaining of" for greater clarity.
As a first step in this multi-part cleaning task, we'll find out how many rows in the spreadsheet (the small sample version used in the class) use the shorthand "C/O".
In our COUNTIF exercise on a small subset of the NEISS data, the total numbers of rows that use the "C/O" shorthand in the Narrative field is 634. In the next exercise, we'll use advanced filters to filter for only the rows that use this shorthand, so we can carry out data cleaning just on these rows.
References:
1. US Consumer Product Safety Commission. NEISS Highlights, Data and Query Builder. Available at: https://www.cpsc.gov/cgibin/NEISSQuery/home.aspx