A 'value' in a spreadsheet file is simply the contents of any given cell. A frequent need in data cleaning is to take a group of cells in which multiple values, or types of content, have been added, and break them out into new columns for each type of data. We'll demonstrate how to do this using a date field in the NEISS small subset spreadsheet. A full date consists of three combined values, day, month, and year, which could all be broken out.
Step 1: Convert a Date-formatted Cell to Text
- Treatment Date is an existing, date-formatted column in the NEISS small subset dataset. Next to it create a new column called 'Reformatted Date.' In this example, Treatment Date is column B and Reformatted Date is column C.
- In the second cell of this new column (directly under the header) type the formula: =TEXT(B2, "dd mm yyyy")
- A new, text-formatted version of the date has populated in cell C2. Notice that month and day have changed places, so now day is first.
- Double-click the plus sign at the bottom of cell C2 to copy this formula down the entire spreadsheet
- If you click into any of the cells in Column C, you'll see in the formula bar that Excel still thinks of the values of column C's cells as formulas. To replace the formulas with the values themselves, click on C2, then do Ctrl (Cmd in Mac) + Shift + down arrow. This selects all the values in column C. Click Ctrl (or Cmd) + c to copy all the values, then right click on C2, scroll down to Paste Special, and choose the Values option.
- Scroll back over some of column C's cells, and check the formula bar. They should now all appear as values, not formulas.