[PART 1] Cleaning Data with OpenRefine
Data sets often contain problems, such as inconsistent formatting and incomplete entries, that spreadsheet programs like Excel cannot easily handle. OpenRefine is a powerful tool for working with messy data. It can help you standardize date formatting, split cells that list multiple authors into separate cells, match local data to other data sets, and enhance a data set with data from other sources.
Part 1 of this workshop will focus on the basic functions of OpenRefine.
Part 2 will focus on advanced features such as fetching data from APIs, reconciling data against external data sets, and exploring extensions to OpenRefine that perform other tasks.