Occasionally in data wrangling a column must be redefined or reconfigured in order to store data in a more logical way. In this exercise we’ll demonstrate how to do that by transposing columns. In the sample dataset there are three dimension-based columns, P_Height, P_Width, and P_Depth. They could be reconfigured to result in two columns, Dimension and Measurement, with the specific dimension referred to spread out over new rows. Although this may seem confusing, the new rows will remain associated with their original records.
Steps to Transpose Columns into Rows
- From the All drop down arrow choose Edit Columns – Reorder/remove columns and drag P_Height, P_Width, and P_Depth to the top
- In the OpenRefine main screen, toggle over to records mode so that OpenRefine understands that we will be applying these changes to entire records, not individual rows.
- Make sure that the P_Height, P_Width, and P_Depth columns are all next to each other, as this is necessary for Transposition to work.
- From the drop down arrow at the top of the left-most column (in this case, P_Height), choose Transpose, then Transpose cells across columns into rows.
- The measurement-related columns in the project have now been reduced, from three to two.
- Although new rows have been added, they are still associated with their original records, as can be seen by the shading cues OpenRefine provides in record mode. The record numbers that OpenRefine assigns are also a cue that no new records have been created.
- The results can also be viewed in rows mode. However be aware that this will make the dataset appear to expand, as rows are created and all the non-dimensional data is repeated to go with each instance of height, width, and depth. When transposing, it’s best to keep the project in record mode.
- Note: Transposing in the opposite direction, from rows into columns, is also possible. See the questions, answers, and examples available on Stack Overflow: https://stackoverflow.com/questions/49677016/transposing-rows-in-openrefine