Support #2083

Creating a tailored longitudinal weight for unbalanced panel data

Added by Isabelle Munier about 1 month ago. Updated about 1 month ago.

Status: In Progress
Priority: Normal
Category: -
Start date: 04/10/2024
% Done: 0%


Description

Hello,

I am fitting a multilevel growth curve model with Stata's mixed command, in which repeated measures of wages (level 1) are nested within individuals (level 2). I am using waves 6, 7, 8, 9, 10, 11, 12 and 13, since I'm particularly interested in migrant groups. The model is flexible in the sense that it allows for partially missing data and unbalanced panels. My aim is to estimate the effect of over-education on wages over time, and how this differs by migrant status and gender. It is not a pooled cross-sectional analysis, since I'm interested in individual trajectories over time.

I understand that I must use weights to correct for unequal selection probabilities, non-response and attrition. However, the longitudinal weights provided by Understanding Society require balanced panels. Since the mixed command in Stata is not compatible with the svyset command, and pweights must be positive for an observation to enter the analysis, I see no option other than dropping everyone who did not participate in all waves. In my case, if I use the appropriate longitudinal weight m_indinui_lw, my sample is reduced from approximately 8000 respondents to 3000 respondents. As my model includes three-way interactions, it requires a sufficiently large sample in each group.

The estimates from the weighted analysis (using m_indinui_lw) and the unweighted analysis are fairly similar, but they become statistically insignificant when the weights are applied, plausibly because of the small sample size in each group.

I have looked through previous inquiries about weights and unbalanced panels here in the support forum, and through the training material on how to create your own tailored weights, but I have not found a solution to my specific problem. The example in the Open Essex course, Module 5, shows how to create a longitudinal weight for responses at waves a, d and g specifically. In that case, one can predict the probability of participating in waves a, d and g conditional on having a non-zero weight in wave a (adding relevant covariates).

In my case, the only requirement is that individuals have participated in at least three waves, since growth curve modelling requires a minimum of three repeated measures to estimate growth curves. When I have attempted to predict the probability of participating in all my included waves, conditional on having a non-zero weight in wave 6, I still end up with many zero weights. I would greatly appreciate any recommendations on how to proceed!
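
To make the intended adjustment concrete, here is a minimal sketch of an inverse-probability weight where "response" means taking part in at least three of waves 6 to 13, rather than in all of them. It is written in Python purely for illustration, not in Stata, and it is not the Understanding Society procedure. As a simplification, the response probability is estimated as the weighted response rate within a covariate cell (a weighting-class adjustment) instead of from a logit model with covariates. The record layout (`base_weight` for the wave 6 weight, `cell` for the covariate group, `waves` for the set of waves with a response) and the toy data are assumptions made up for the example.

```python
from collections import defaultdict

WAVES = set(range(6, 14))   # analysis waves 6 to 13, as in the post
MIN_WAVES = 3               # minimum number of repeated measures required


def is_respondent(person):
    """True if the person took part in at least MIN_WAVES of the analysis waves."""
    return len(person["waves"] & WAVES) >= MIN_WAVES


def tailored_weights(people):
    """Return {pidp: tailored weight} for respondents with a positive base weight.

    Hypothetical record layout: each person is a dict with keys
    'pidp', 'base_weight' (wave 6 weight), 'cell' and 'waves'.
    """
    # Only people with a positive wave 6 base weight can receive a tailored weight.
    eligible = [p for p in people if p["base_weight"] > 0]

    # Weighted totals per covariate cell: everyone eligible vs. those who respond.
    total = defaultdict(float)
    responding = defaultdict(float)
    for p in eligible:
        total[p["cell"]] += p["base_weight"]
        if is_respondent(p):
            responding[p["cell"]] += p["base_weight"]

    weights = {}
    for p in eligible:
        if is_respondent(p):
            # Estimated probability of responding in at least MIN_WAVES waves.
            prob = responding[p["cell"]] / total[p["cell"]]
            # Tailored weight = base weight times inverse response probability.
            weights[p["pidp"]] = p["base_weight"] / prob
    return weights


# Made-up toy data, for illustration only.
toy = [
    {"pidp": 1, "base_weight": 1.0, "cell": "A", "waves": {6, 7, 8}},
    {"pidp": 2, "base_weight": 1.0, "cell": "A", "waves": {6}},
    {"pidp": 3, "base_weight": 2.0, "cell": "B", "waves": {6, 9, 12, 13}},
    {"pidp": 4, "base_weight": 0.0, "cell": "A", "waves": {6, 7, 8}},
]
print(tailored_weights(toy))
```

With this response definition, everyone who has a positive wave 6 weight and at least three waves of data receives a positive tailored weight, and within each cell the tailored weights sum to the base-weight total of the eligible group. People with a zero base weight at wave 6 still receive no weight in this sketch. In a real analysis the cell rates would presumably be replaced by predicted probabilities from a response model with relevant covariates.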
