ECO 4300 Data Cleaning in STATA
Data cleaning is an essential step in the research process. In the real world, data must be checked for accuracy and consistency before econometric analysis can be conducted. With increasing emphasis on the replicability and transparency of economic research in academia, government and private industry, strong data cleaning skills are critical for professional development. This course will introduce you to best practices in data cleaning and coding. Together we will work through every step needed to move from the initial import of raw data to the point where those data are ready for econometric analysis. The class is entirely project-based, with grades determined by the accuracy of your STATA program.
Prerequisite
Prior completion of, or concurrent enrollment in, ECO 3300
Learning Outcomes
1. file organization and principles of replicability and transparency
2. creating names and labels
3. creating dichotomous variables from numeric and textual data
4. creating, describing and transforming numeric data
5. merging and reshaping datasets
6. dealing with dates and time
7. writing efficient do loops