Steps for employing facets to identify and clean data-entry errors are outlined in Galter Library's introductory OpenRefine class, Cleaning Spreadsheet Data with OpenRefine. While much data cleaning can be done by faceting on one column at a time, it’s also possible to facet on two columns or more at once. You can apply these multiple facets manually to hone in on highly specified subsets of your data. You can even combine text-based and numeric facets at the same time. Using a GREL formula, you can achieve this in one step.
Facet on Multiple columns with a GREL Expression
- From the drop down arrow of a column in the sample dataset, choose Facet - Custom text facet. In this example we will use the Sex column and combine with a facet on Age, to facet the dataset on sex and age at once.
- In the Expression dialog box that pops up, enter the formula: join([ cells[“Col 1"].value, cells[“Col 2"].value ], ",") in the Expression box. Substitute the columns you want to facet together in the Col 1 and Col 2 spaces - in this case, Sex and Age. Inside the last quote marks after value, use any separator you would like to keep the two columns’ values visually separated.
- As the results show, using this custom facet formula allows us to hone in on male or female patients of particular ages.