Reconciliation services in OpenRefine let you look up values from your data in online sources that maintain controlled lists of terms, such as VIAF (the Virtual International Authority File) or Wikidata. You can then update your data with the controlled terms from these sources, and the reconciled values can link back to the matching authority records for better interoperability. This example shows how to reconcile values from the Other_Diagnosis column against Wikidata.
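To make the lookup step more concrete, here is a minimal Python sketch of the kind of query batch a reconciliation client might send. It assumes the service follows the Reconciliation Service API convention of a JSON object that maps query IDs to search strings. The function name, the `limit` default, and the sample values other than "asthma" are hypothetical, and this is not a description of how OpenRefine builds its requests internally.

```python
import json


def build_reconciliation_queries(values, limit=3):
    """Build a JSON batch of reconciliation queries, one per distinct value.

    Assumption: the service accepts an object keyed by arbitrary query IDs
    ("q0", "q1", ...), each holding a "query" string and a candidate "limit".
    """
    # Deduplicate while preserving order so each distinct value is sent once.
    unique_values = list(dict.fromkeys(values))
    batch = {
        f"q{i}": {"query": value, "limit": limit}
        for i, value in enumerate(unique_values)
    }
    return json.dumps(batch)


# Hypothetical sample values from an Other_Diagnosis column.
payload = build_reconciliation_queries(["asthma", "asthma", "diabetes"])
```

Sending each distinct value only once keeps the batch small, which matters when a column repeats the same diagnosis across many rows.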
After reconciling, two new facets appear: Other_Diagnosis: judgment and Other_Diagnosis: best candidate’s score. The judgment facet shows which values have been matched. Next to “matched,” there may also be a value, “none,” if you ran the reconciliation over multiple rows and some items failed to match. As you match your values to the reconciled terms, the count for “none” goes down and the count for “matched” goes up. The best candidate’s score facet shows how closely each value matched the candidates from the online authority file, based on fuzzy matching.
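The two facets can be pictured with a short Python sketch. It assumes each cell carries a judgment of "matched" or "none", and it uses `difflib.SequenceMatcher` from the standard library as a stand-in for fuzzy matching. The reconciliation service computes its own scores, so the numbers here are only illustrative, and the cell structure and function names are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity_score(value, candidate):
    """Return a 0-100 similarity score (a stand-in, not the service's scoring)."""
    ratio = SequenceMatcher(None, value.lower(), candidate.lower()).ratio()
    return round(100 * ratio, 1)


def judgment_facet(cells):
    """Count cells per judgment, like the 'judgment' facet does."""
    return Counter(cell["judgment"] for cell in cells)


# Hypothetical cells from the Other_Diagnosis column after reconciling.
cells = [
    {"value": "asthma", "judgment": "matched"},
    {"value": "asthma", "judgment": "none"},
    {"value": "asthma", "judgment": "none"},
]
before = judgment_facet(cells)

# Accepting a match for one more cell moves it from "none" to "matched".
cells[1]["judgment"] = "matched"
after = judgment_facet(cells)
```

Comparing `before` and `after` shows the same shift described above: one fewer cell under "none" and one more under "matched".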
Notice the checkmarks next to the candidate matches back in the original Other_Diagnosis column. Two related terms besides “asthma” were found in Wikidata through the reconciliation process. To complete the reconciliation, select any of these terms. Clicking the single-check box accepts the term for that one cell. Clicking the multi-check box accepts the term for all cells that contain the same original value.
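The difference between the two check boxes can be sketched in Python. This is only an illustration of the behavior described above, not OpenRefine's implementation; the row structure, function names, sample rows, and the placeholder term are all hypothetical.

```python
def accept_for_cell(rows, row_index, column, term):
    """Single check: accept the term for one cell only."""
    updated = [dict(row) for row in rows]  # copy so the input is untouched
    updated[row_index][column] = term
    return updated


def accept_for_all_identical(rows, row_index, column, term):
    """Multi check: accept the term for every cell sharing the original value."""
    original = rows[row_index][column]
    return [
        {**row, column: term} if row[column] == original else dict(row)
        for row in rows
    ]


# Hypothetical rows; the term stands in for whichever candidate you pick.
COLUMN = "Other_Diagnosis"
TERM = "asthma (reconciled)"
rows = [{COLUMN: "asthma"}, {COLUMN: "asthma"}, {COLUMN: "eczema"}]

one_cell = accept_for_cell(rows, 0, COLUMN, TERM)
all_cells = accept_for_all_identical(rows, 0, COLUMN, TERM)
```

In this sketch `one_cell` changes only the first row, while `all_cells` changes both rows that held "asthma" and leaves the "eczema" row alone.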
Keep in mind that reconciliation takes much longer when it is run over the entire spreadsheet than over a single row of data. For more on OpenRefine reconciliation services, see the guidance on Reconcilable Data Sources on the official OpenRefine GitHub page.