Support #1590
Creating a weight variable for fixed effects analysis
100%
Description
Dear UKHLS support,
Hope you are well.
May I ask for your advice on whether you think weighting should be applied in a fixed-effects analysis?
I would like to use data from Waves 2, 4, 6, 8, and 10. From my understanding, if we use the longitudinal weight provided in the Wave 10 data (j_indscub_lw), it will retain only those sample members who responded at every wave from Wave 2 to Wave 10, which could lead to a substantial drop in the sample size.
I have read the paper "Weighting and Sample Representation: Frequently Asked Questions", which was really useful and clear, and learnt that we can derive our own weight variable. However, I am not sure how to do so with a longitudinal dataset (in a long Stata format).
I would be very grateful to have your advice on (i) whether or not weighting is necessary for a fixed-effects model, and (ii) if so, what would be the best way to derive a weight variable for this study.
Best wishes,
Karen
Updated by Understanding Society User Support Team 8 months ago
- Status changed from New to Feedback
- % Done changed from 0 to 50
- Private changed from Yes to No
(1) This is an analysis question. The following is a good discussion of why and when we should weight: http://jhr.uwpress.org/content/50/2/301.refs
(2) Please email us at usersupport@understandingsociety.ac.uk and we will send you the guidance for producing your own weights.
Updated by Olena Kaminska 8 months ago
Karen,
Yes, weighting is necessary in fixed-effects models.
Look for similar questions about pooled analysis on this forum - they may answer your question.
If you need more help, please give us details on what you want to estimate (e.g. people or events etc.)
Thanks,
Olena
Updated by Karen Mak 8 months ago
Hi Olena,
Thanks so much for your prompt reply and for your help.
I tried to look for similar questions on this forum, but couldn't seem to find any solutions.
I am interested in how changes in volunteering behaviours are associated with changes in wellbeing. Questions on volunteering were asked in alternate waves (i.e. Waves 2, 4, 6, 8 & 10). I have now merged/appended all relevant waves into one dataset in long format.
(1) I'd like to ask whether it would be more appropriate to use the wave 10 longitudinal weight (j_indscub_lw), or to create a specific weight for the analysis?
(2) If creating a new weight is more appropriate here, what would be the steps to create one in a long data format? Do I need to: reshape the data to wide format; create a binary variable "response" (1 = completed all 5 waves, 0 = only completed Wave 2); run a logistic regression predicting "response" from covariates (e.g. age, gender, marital status), restricted to participants who are not known to have died or emigrated and who have a Wave 2 weight greater than 0; and then generate the new weight as gen weightW25 = (1/p)*b_indscus_lw, where p is the predicted probability of response?
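As a rough illustration of the last step in that list only, here is a minimal Python sketch of the inverse-probability adjustment. It assumes the response probabilities p have already been predicted by some model; the function name `adjusted_weight`, the field names, and all the numbers are hypothetical and are not taken from the UKHLS files or guidance.

```python
def adjusted_weight(base_weight, p_response, responded):
    """Scale a base weight by 1/p for respondents; give everyone else a zero weight.

    base_weight: the starting (e.g. Wave 2) weight for the person.
    p_response:  predicted probability of responding at all required waves.
    responded:   True if the person actually responded at all required waves.
    """
    # Non-respondents and people with no positive base weight drop out of the analysis.
    if not responded or base_weight <= 0:
        return 0.0
    if not 0 < p_response <= 1:
        raise ValueError("p_response must lie in (0, 1]")
    return base_weight / p_response


# Toy records (hypothetical values, for illustration only).
people = [
    {"pidp": 1, "base_weight": 1.0, "p": 0.5, "responded": True},
    {"pidp": 2, "base_weight": 0.8, "p": 0.25, "responded": True},
    {"pidp": 3, "base_weight": 1.1, "p": 0.4, "responded": False},
]

for person in people:
    person["weightW25"] = adjusted_weight(
        person["base_weight"], person["p"], person["responded"]
    )
```

People with a lower predicted probability of responding receive a proportionally larger weight, while non-respondents get a weight of zero.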
Thank you and best wishes,
Karen
Updated by Olena Kaminska 7 months ago
Karen,
There is no particular reason to create your own weight.
To help you select a weight, please explain how you analyse the data: are you looking at a change between 2 waves, or 3 waves at a time?
Thanks,
Olena
Updated by Olena Kaminska 7 months ago
Karen,
Do you mean you study change between wave 2 and wave 10?
Or do you mean you study change between waves 2 and 4, and separately between waves 4 and 6, etc.?
What is your outcome variable?
Thanks,
Olena
Updated by Olena Kaminska 7 months ago
Karen,
In this situation, use the wave 4 longitudinal weight for the set of waves 2 to 4, the wave 6 longitudinal (lw) weight for the set of waves 4 to 6, and so on. In the long format, create a new weight variable and assign it the respective value for each set of waves.
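One possible reading of this advice is sketched below in plain Python. It assumes the long file can be represented as one record per person and wave, stacks the records into two-wave sets, and gives both records in a set the longitudinal weight from the set's later wave. The function `build_pair_rows`, the field names (`volunteer`, `ghq`, `pair`, `pair_weight`) and the weight values are all hypothetical; the real weight variable names should be checked against the data documentation.

```python
# Assumed mapping: each two-wave set -> wave whose longitudinal weight it uses.
PAIR_WEIGHT_WAVE = {(2, 4): 4, (4, 6): 6, (6, 8): 8, (8, 10): 10}


def build_pair_rows(rows, lw_by_wave):
    """Stack person-wave records into two-wave sets with one weight per set.

    rows:       list of dicts with keys 'pidp' and 'wave' plus analysis variables.
    lw_by_wave: dict mapping (pidp, wave) -> longitudinal weight at that wave.
    """
    by_person_wave = {(r["pidp"], r["wave"]): r for r in rows}
    person_ids = sorted({r["pidp"] for r in rows})
    stacked_rows = []
    for (start, end), weight_wave in PAIR_WEIGHT_WAVE.items():
        for pidp in person_ids:
            first = by_person_wave.get((pidp, start))
            second = by_person_wave.get((pidp, end))
            weight = lw_by_wave.get((pidp, weight_wave), 0.0)
            # Keep a set only if both waves are observed and the weight is positive.
            if first is None or second is None or weight <= 0:
                continue
            for record in (first, second):
                stacked_rows.append(
                    {**record, "pair": f"{start}-{end}", "pair_weight": weight}
                )
    return stacked_rows


# Toy data (hypothetical values, for illustration only).
rows = [
    {"pidp": 1, "wave": 2, "volunteer": 0, "ghq": 11},
    {"pidp": 1, "wave": 4, "volunteer": 1, "ghq": 9},
    {"pidp": 1, "wave": 6, "volunteer": 1, "ghq": 10},
    {"pidp": 2, "wave": 2, "volunteer": 0, "ghq": 14},
    {"pidp": 2, "wave": 4, "volunteer": 0, "ghq": 13},
]
lw = {(1, 4): 1.2, (1, 6): 1.5, (2, 4): 0.8}

stacked = build_pair_rows(rows, lw)
```

In this layout a wave 4 record can appear twice, once in the 2-4 set and once in the 4-6 set, each time with that set's weight. The weight is constant within each person and set, which some fixed-effects routines require when the person-set combination is used as the panel unit.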
Hope this helps,
Olena
Updated by Understanding Society User Support Team 5 months ago
- Status changed from Feedback to Resolved
- Assignee deleted (Olena Kaminska)
- % Done changed from 90 to 100