After blanking down the duplicate CPSC_Case_Number values from the sample dataset, even if we take off the Duplicates facet, we know that within this table, the corresponding 8 records still have blank values for CPSC_Case_Number. Using OpenRefine's Facet by Blank feature, we can quickly identify the records that have blank values in the CPSC_Case_Number field and delete them.
Facet by Blank Steps
- From the CPSC_Case_Number column drop-down choose Facet – Customized facets – Facet by blank (null or empty string)
- In the resulting facet window on the left, you’ll see that there are 8 records that are “true” for having a blank value in the column CPSC_Case_Number. Choose “true” in this facet to isolate the 8 rows.
- From the “All” column drop-down towards the left, choose Edit rows – Remove matching rows.
- With the Facet by Blank still in place, you can see that OpenRefine just updated to show that now there are 0 rows that meet the criteria for being blank in the field CPSC_Case_Number. These rows have been removed, and our total number of rows has changed from 1511 to 1503.