Course for international guest/part-time students
- Faculty
- Faculty of Economics
- Organization
- Faculty of Economics
- Code
- GTI10AN002EN
- Title
- Data Analysis in the Social Sciences
- Usual semester
- Spring
- Published semester
- 2024/25/2
- ECTS
- 3
- Language
- en
- Learning outcomes
- This course covers how regression analysis can be used to identify causal relationships. Much of the presentation follows a policy evaluation perspective. We will cover the implicit assumptions underlying each research design, read published papers that implement each design, and see examples of how to execute each strategy using the freely available statistical software package R. Prerequisite: Econometrics (Linear Regression).
  The primary research designs we will learn in this course, together with the other objectives, are:
  1. Random Assignment
  2. Fixed Effects, First Difference
  3. Differences in Differences (and Triple Difference)
  4. Instrumental Variables
  5. Regression Discontinuity
  6. Learn how to critically evaluate empirical research
  7. Be introduced to the basics of R
  This course will be conducted 100% in English. You will be required to read excerpts of academic journal articles published in English. Although we will use R, you will leave the class with only basic skills. You will, however, have been exposed to the capabilities of R and will be ready to continue learning it if you wish.
- Course content
- Prerequisite: This course builds heavily on Econometrics. To successfully complete this course you must have already passed Econometrics and have a strong understanding of the mechanics of OLS regression. I will assume knowledge equivalent to roughly the first eight chapters of Introductory Econometrics: A Modern Approach by Wooldridge. You will struggle in this course without a strong foundation.
  In this course you will be introduced to the R programming language for statistics. R is very powerful and very flexible. At times you will be required to run R on your laptop during scheduled class meetings. For this reason, you must bring a laptop with R and RStudio installed to each class. Due to facility limitations, you cannot rely on being able to plug in your computer in the classroom. You are therefore expected to bring your laptop to class fully charged.
  Course Content:
  - Material from the lectures and seminars.
  - Video sessions, which will be uploaded to Panopto and can be accessed via the ELTE Moodle platform.
  - Extra reading posted on Moodle.
  Preliminary Course Schedule (you should periodically check the lecture timetable):
  - Topic 1 – Potential Outcomes Framework and Random Assignment: What is the fundamental challenge when conducting causal inference? Rubin’s Causal Model. Random Assignment as the gold standard.
  - Topic 2 – OLS and Fixed Effects: OLS as a weighted average of matched comparisons. Fixed Effects estimation. Clean vs. dirty variation.
  - Topic 3 – Difference in Differences (DD) and Triple Difference (DDD): Implicit assumptions of the DD strategy. Tests for validity. Examples.
  - Topic 4 – Instrumental Variables (IV): Assumptions of the IV estimator. Fundamental untestability of instrument validity. External validity. Examples.
  - Topic 5 – Regression Discontinuity (RD): Assumptions of the RD design. External validity limitations. Examples.
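To give a flavor of Topic 3, here is a minimal sketch of the simplest two-group, two-period DD comparison. The course itself works in R; this sketch uses Python purely for illustration. The data and the names `observations`, `cell_mean`, and `diff_in_diff` are hypothetical and do not come from the course material. The estimate is only meaningful under the usual parallel-trends assumption, which the code cannot check.

```python
from statistics import mean

# Hypothetical data, made up for illustration: (group, period, outcome).
observations = [
    ("treated", "pre", 10.0), ("treated", "pre", 12.0),
    ("treated", "post", 18.0), ("treated", "post", 20.0),
    ("control", "pre", 8.0), ("control", "pre", 10.0),
    ("control", "post", 11.0), ("control", "post", 13.0),
]


def cell_mean(group, period):
    """Average outcome for one group in one period."""
    return mean(y for g, p, y in observations if g == group and p == period)


def diff_in_diff():
    """Change for the treated group minus change for the control group."""
    treated_change = cell_mean("treated", "post") - cell_mean("treated", "pre")
    control_change = cell_mean("control", "post") - cell_mean("control", "pre")
    return treated_change - control_change


dd_estimate = diff_in_diff()
print(dd_estimate)
```

The control group's change stands in for what would have happened to the treated group without treatment, so subtracting it removes any trend common to both groups.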
- Assessment method
- The two 45-point midterm tests will be scheduled centrally by GTK and administered in the exam center. I hope for these to be long midterms where you have ample time to critically evaluate the questions when constructing your answers. There will be 2-4 opportunities to earn up to 10 additional points through quizzes/activities during class meeting times (these may be unannounced). Midterm points and quiz points will be added together and capped at 100 to determine whether you will be offered a preliminary course grade and whether you will be offered an opportunity to take the final exam. I will follow the GTK standard grade cut-offs when determining grades and final exam eligibility. No additional extra credit will be offered.
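The point arithmetic described above can be sketched as follows. This is an illustration of one reading of the rules, not an official calculator: the function name `course_points` is hypothetical, the sketch assumes quiz points are limited to 10 in total, and it leaves out the GTK grade cut-offs because they are not stated here.

```python
MIDTERM_MAX = 45   # each of the two midterms is worth 45 points
QUIZ_MAX = 10      # assumed total limit on quiz/activity points
TOTAL_CAP = 100    # the combined total is capped at 100


def course_points(midterm_1, midterm_2, quiz_points):
    """Add midterm and quiz points, then apply the 100-point cap."""
    for score in (midterm_1, midterm_2):
        if not 0 <= score <= MIDTERM_MAX:
            raise ValueError("each midterm score must be between 0 and 45")
    if quiz_points < 0:
        raise ValueError("quiz points cannot be negative")
    quiz = min(quiz_points, QUIZ_MAX)
    return min(midterm_1 + midterm_2 + quiz, TOTAL_CAP)


print(course_points(40, 38, 7))
```

For example, midterm scores of 40 and 38 with 7 quiz points give a total of 85.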
- Bibliography
- Textbooks: I will not be following one specific book per se, but the following free book will be considered our required text, as it complements the lecture material nicely:
  - Causal Inference: The Mixtape, by Scott Cunningham. This book is freely available online and includes detailed example Stata and R code for many of the topics we will be discussing.
  The books below are not free, but they can be acquired cheaply online. They are not required at all, but may serve as a useful reference for the interested student:
  - Mostly Harmless Econometrics (MHE), by Angrist & Pischke. This book is very commonly used by advanced undergraduates and graduate students around the world.
  - Mastering ‘Metrics, by Angrist & Pischke. This is an undergraduate version of MHE, and the exposition may help in understanding the intuition.
  - Econometric Analysis of Cross Section and Panel Data, by Wooldridge. This is more of a reference and is used in most PhD econometrics courses that cover panel methods. It is a much more advanced book than we need; do not buy it unless you are wealthy and enjoy having fancy-looking textbooks on your bookcase.
  One of the strengths of R is the enormous amount of user-created scripts and content devoted to learning and using R. I will leverage this content so we can focus more on the econometrics. Here are some excellent software tutorials to get you started in R:
  - Installation: https://www.youtube.com/watch?v=TFGYlKvQEQ4
  - General Tutorials: https://campus.datacamp.com/courses/free-introduction-to-r/ and https://app.datacamp.com/learn/courses/introduction-to-regression-in-r
  - Tutorials: https://rpubs.com/phle and https://www.econometrics-with-r.org/index.html