Creating a weight variable for fixed effects analysis
Dear UKHLS support,
Hope you are well.
May I ask for your advice on whether you think weighting should be applied in a fixed-effects analysis?
I would like to use data from Waves 2, 4, 6, 8, and 10. From my understanding, if we use the longitudinal weight provided in the Wave 10 data (j_indscub_lw), the analysis will retain only respondents who took part in every wave from Wave 2 to Wave 10, which could lead to a substantial drop in sample size.
I have read the paper "Weighting and Sample Representation: Frequently Asked Questions", which was really useful and clear, and learnt that we can derive our own weight variable. However, I am not sure how to do so with a longitudinal dataset (in a long Stata format).
I would be very grateful to have your advice on (i) whether or not weighting is necessary for a fixed-effects model, and (ii) if so, what would be the best way to derive a weight variable for this study.
Updated by Understanding Society User Support Team 8 months ago
- Status changed from New to Feedback
- % Done changed from 0 to 50
- Private changed from Yes to No
Updated by Olena Kaminska 8 months ago
Yes, weighting is necessary in fixed-effects models.
Look for similar questions about pooled analysis on this forum; they may answer your question.
If you need more help, please give us details on what you want to estimate (e.g. people or events).
Thanks so much for your prompt reply and for your help.
I tried to look for similar questions on this forum, but couldn't seem to find any solutions.
I am interested in how changes in volunteering behaviours are associated with changes in wellbeing. Questions on volunteering were asked in alternate waves (i.e. Waves 2, 4, 6, 8 and 10). I have now appended all relevant waves into one dataset in long format.
(1) I'd like to ask whether it would be more appropriate to use the Wave 10 longitudinal weight (j_indscub_lw), or to create a specific weight for the analysis?
(2) If creating a new weight is more appropriate here, what would be the steps to create one from data in long format? Do I need to reshape the data to wide format, then create a binary variable "response" (1 = completed all 5 waves, 0 = only completed Wave 2), then run a logistic regression predicting "response" using predictors (e.g. age, gender, marital status), restricted to participants who are not known to have died or emigrated and who have a Wave 2 weight greater than 0, and finally generate the new weight as gen weightW25 = (1/p)*b_indscus_lw, where p is the predicted probability of response?
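For reference, the steps proposed in (2) might be sketched in Stata as follows. This only illustrates the question and is not a procedure confirmed by the support team. It assumes a wide file with one row per person. The names resp_w4 to resp_w10 (1 = completed that wave), eligible (1 = not known to have died or emigrated), b_age, b_sex and b_mastat are hypothetical placeholders; only b_indscus_lw and weightW25 come from the post. Whether the response model itself should be weighted is left open here.

```stata
* Illustrative sketch only; every name except b_indscus_lw and weightW25 is hypothetical.
* Assumes wide format: one row per person.

* 1 = completed Waves 4, 6, 8 and 10 as well as Wave 2; 0 = did not complete all of them.
* Defined only for eligible people with a positive, non-missing Wave 2 weight.
gen byte response = (resp_w4 == 1 & resp_w6 == 1 & resp_w8 == 1 & resp_w10 == 1) ///
    if eligible == 1 & b_indscus_lw > 0 & !missing(b_indscus_lw)

* Model the probability of responding at all five waves.
logit response c.b_age i.b_sex i.b_mastat

* Predicted response probability, restricted to the estimation sample.
predict double p if e(sample), pr

* Inverse-probability adjustment of the Wave 2 weight, for full responders only.
gen double weightW25 = (1 / p) * b_indscus_lw if response == 1
```

Note that response is coded 0 here for anyone who missed at least one later wave, which is slightly broader than "only completed Wave 2".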
Thank you and best wishes,
Updated by Olena Kaminska 7 months ago
In this situation, use the Wave 4 longitudinal weight for the Waves 2 to 4 outcome, the Wave 6 longitudinal weight for the Waves 4 to 6 outcome, and so on. In long format, create a new weight variable and assign it the respective value for each set of waves.
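A minimal Stata sketch of this assignment, under stated assumptions: the file is in long format with one row per person-wave, a hypothetical variable wave takes the values 2, 4, 6, 8 and 10, and each wave's longitudinal weight has already been merged onto the person's rows under the hypothetical names lw_w4, lw_w6, lw_w8 and lw_w10. The name fe_weight is also made up for illustration.

```stata
* Illustrative sketch only; wave, lw_w4 ... lw_w10 and fe_weight are hypothetical names.
* Assumes long format: one row per person-wave.

* Start with an empty weight and fill it wave by wave.
gen double fe_weight = .

* Each outcome wave takes the longitudinal weight released with that wave.
foreach w in 4 6 8 10 {
    replace fe_weight = lw_w`w' if wave == `w'
}
```

Rows for Wave 2 are left with a missing weight in this sketch; how they should be weighted depends on how each set of waves enters the model.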
Hope this helps,
Thanks so much for your reply and incredible help!
I think I might have misunderstood your question earlier. I just wanted to confirm: if I want to explore changes from Wave 2 to Wave 10, should I use the Wave 10 longitudinal weight?