Plan of the Course^
In this course, we will begin with a review of linear regression models, hypothesis testing, and maximum likelihood estimation, followed by four sections focusing on specific topics.
0. Revision^
0.1 Revision of Linear Regression Models
We start by revising the least squares method, which estimates the coefficients by minimizing the sum of squared residuals (the differences between observed and predicted values).
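As a minimal sketch of the least squares idea, the snippet below fits a line with one regressor using the closed-form solution. The data values and the function names (`ols_simple`, `ssr`) are illustrative assumptions, not course material.

```python
def ols_simple(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def ssr(x, y, intercept, slope):
    """Sum of squared residuals for a candidate line."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = ols_simple(x, y)
best = ssr(x, y, a, b)
# Any other line should give a larger sum of squared residuals.
worse = ssr(x, y, a, b + 0.1)
```

Moving the slope away from the fitted value (here by 0.1) increases the sum of squared residuals, which is the property that defines the estimator.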
0.2 Revision of Hypothesis Testing
After the coefficients are estimated, statistical inference tests hypotheses about the relationships between variables, typically using t-tests, confidence intervals, and model fit checks, with the variance of the estimates usually obtained from analytical formulas. Alternatively, the bootstrap, a resampling technique that approximates the sampling distribution of an estimator, can be used to assess the variability of the estimates, especially when the traditional assumptions are not met.
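The two routes to a standard error can be compared in a short sketch. It computes the analytical standard error of a slope (assuming homoskedastic errors) and a pairs-bootstrap standard error on simulated data. The data-generating process, the seeds, and the number of replications are assumptions made for illustration.

```python
import math
import random


def fit_line(x, y):
    """Return (intercept, slope, sxx) for a one-regressor least squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope, sxx


def analytic_se(x, y):
    """Standard error of the slope under homoskedastic errors."""
    a, b, sxx = fit_line(x, y)
    resid_ss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = resid_ss / (len(x) - 2)
    return math.sqrt(sigma2 / sxx)


def bootstrap_se(x, y, reps=1000, seed=0):
    """Pairs bootstrap: resample (x, y) rows with replacement and refit."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < reps:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # skip degenerate resamples with no variation in x
            continue
        ys = [y[i] for i in idx]
        slopes.append(fit_line(xs, ys)[1])
    mean = sum(slopes) / reps
    return math.sqrt(sum((s - mean) ** 2 for s in slopes) / (reps - 1))


# Simulated data (assumed true slope of 0.5, standard normal errors).
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(50)]
y = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

_, b_hat, _ = fit_line(x, y)
se_formula = analytic_se(x, y)
se_boot = bootstrap_se(x, y)
t_stat = b_hat / se_formula  # t-statistic for H0: slope = 0
```

When the classical assumptions hold, as in this simulation, the two standard errors should be close; the bootstrap becomes more informative when they do not hold.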
0.3 Maximum Likelihood Estimation (MLE)
Maximum likelihood estimation is a method used to estimate the parameters of a statistical model. It finds the parameter values that maximize the likelihood function, which measures how well the model explains the observed data.
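To make the idea of maximizing a likelihood concrete, the sketch below estimates the rate of a Poisson distribution by numerically maximizing its log-likelihood. The count data, the search interval, and the ternary-search optimizer are assumptions chosen for simplicity. For this model the maximizer is known to be the sample mean, which gives a check on the numerical result.

```python
import math


def poisson_loglik(lam, data):
    """Log-likelihood of i.i.d. Poisson(lam) observations."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in data)


def maximize(f, lo, hi, tol=1e-8):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return (lo + hi) / 2


# Hypothetical count data.
counts = [2, 3, 0, 4, 1, 5, 2, 3]

lam_hat = maximize(lambda lam: poisson_loglik(lam, counts), 0.01, 20.0)
sample_mean = sum(counts) / len(counts)
```

Here `lam_hat` and `sample_mean` should agree up to numerical tolerance. In models without a closed-form solution, only the numerical route is available.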
After this general introduction, the course is divided into four sections:
- Qualitative Dependent Variable Models (QDVM)
- Limited Dependent Variable Models (LDVM)
- Panel Data
- Structural Equation Modeling (SEM)
1. Qualitative Dependent Variable Models^
These models are used when the dependent variable is categorical rather than continuous.
1.2 Polychotomous Ordered Models: For ordered categorical outcomes (e.g., ordered probit or logit models).
1.3 Polychotomous Unordered Models: For outcomes with multiple categories without a natural order (e.g., multinomial logit models).
1.4 Count Models: For dependent variables that represent counts (e.g., Poisson regression).
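The ordered and unordered models listed above differ in how they map a linear index into category probabilities. The sketch below computes these probabilities for an ordered logit and a multinomial logit. The cutpoints, index values, and function names are hypothetical, and no estimation is performed.

```python
import math


def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(xb, cutpoints):
    """P(y = j) for ordered categories, given index xb and increasing cutpoints."""
    # P(y <= j) = logistic_cdf(cutpoint_j - xb); the last category closes at 1.
    cdf = [logistic_cdf(c - xb) for c in cutpoints] + [1.0]
    probs = []
    prev = 0.0
    for value in cdf:
        probs.append(value - prev)
        prev = value
    return probs


def multinomial_logit_probs(utilities):
    """P(y = j) = exp(v_j) / sum_k exp(v_k) for unordered categories."""
    top = max(utilities)  # subtract the maximum for numerical stability
    expv = [math.exp(v - top) for v in utilities]
    total = sum(expv)
    return [e / total for e in expv]


# Three ordered categories with hypothetical cutpoints -1 and 1, index 0.
p_ordered = ordered_logit_probs(0.0, [-1.0, 1.0])

# Three unordered alternatives with hypothetical indices 0, 0 and log(2).
p_unordered = multinomial_logit_probs([0.0, 0.0, math.log(2.0)])
```

In the ordered model a single index shifts probability mass along the ordering, whereas in the multinomial model each alternative has its own index.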
2. Limited Dependent Variable Regression^
Limited dependent variable regression models are used when the range of the dependent variable is constrained or limited.
2.2 Selection: Selection models address biases that arise when the sample is not randomly selected from the population (e.g., Heckman selection model).
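The bias that selection models address can be seen in a small simulation. It illustrates the problem only, not the Heckman correction. The data-generating process (true slope of 2) and the selection rule (an observation is kept only when the outcome exceeds 1) are assumptions made for the example.

```python
import random


def ols_slope(x, y):
    """Least squares slope of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(1)
n = 5000

# Full population sample: y = 1 + 2x + e, with standard normal x and e.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]
slope_full = ols_slope(x, y)

# Non-random sample: only observations with y > 1 are kept.
kept = [(xi, yi) for xi, yi in zip(x, y) if yi > 1.0]
slope_selected = ols_slope([p[0] for p in kept], [p[1] for p in kept])
```

On the full sample the slope is close to the assumed value of 2, while on the selected sample it is noticeably smaller, because selection on the outcome makes the error term correlated with the regressor.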
3. Panel Data^
Panel data consist of observations on multiple entities (such as individuals, firms, or countries) over time. This structure supports analyses that account for both cross-sectional and temporal variation.
3.1 Pooled Models: Combine data across entities and time, ignoring individual effects.
3.2 Fixed-Effects Models: Control for time-invariant characteristics of the entities, allowing for entity-specific intercepts.
3.3 Random Effects Models: Assume that entity-specific effects are random and uncorrelated with the independent variables.
3.4 Conditional Fixed-Effects Logit: Suitable for binary outcomes, controlling for entity-specific effects.
3.5 Difference-in-Differences: A method used to estimate causal effects by comparing changes over time between a treatment group and a control group.
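The contrast between pooled and fixed-effects estimation, and the difference-in-differences calculation, can be sketched on a tiny noise-free example. The panel, the entity intercepts, and the group means passed to `did` are all hypothetical. The intercepts are built to be correlated with the regressor so that the pooled slope is visibly off.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Least squares slope of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def within_transform(ids, values):
    """Subtract each entity's own mean from its observations."""
    groups = defaultdict(list)
    for i, v in zip(ids, values):
        groups[i].append(v)
    means = {i: sum(vs) / len(vs) for i, vs in groups.items()}
    return [v - means[i] for i, v in zip(ids, values)]


# Hypothetical panel: 3 entities, 3 periods, y = alpha_i + 2 * x, no noise.
ids = [0, 0, 0, 1, 1, 1, 2, 2, 2]
x = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 7.0]
alpha = {0: 0.0, 1: 5.0, 2: 10.0}  # entity effects, correlated with x
y = [alpha[i] + 2.0 * xi for i, xi in zip(ids, x)]

b_pooled = ols_slope(x, y)  # ignores the entity effects
b_fe = ols_slope(within_transform(ids, x), within_transform(ids, y))


def did(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Change in the treatment group minus change in the control group."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)


# Hypothetical group means before and after treatment.
did_effect = did(10.0, 15.0, 8.0, 10.0)
```

The within transformation removes the entity-specific intercepts, so the fixed-effects slope recovers the assumed coefficient of 2, while the pooled slope absorbs part of the entity effects. The `did` function subtracts the control group's change from the treatment group's change to net out the common time trend.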
4. Introduction to Structural Equation Modeling (SEM)^
Structural Equation Modeling (SEM) is a statistical technique that combines factor analysis and multiple regression. It allows for the modeling of complex relationships between observed and latent (unobserved) variables.
4.1 Latent Variables: Unobserved variables inferred from observed data, typically used to represent abstract concepts like intelligence or socioeconomic status.
4.2 Multilevel Modeling: Extends SEM to account for data that is nested or hierarchical, such as students within schools or employees within firms, allowing for both within-group and between-group variations.
- Instructor: Verardi Vincenzo