Support #2281
Weights for longitudinal analysis - high zero-weighted numbers; and selecting weights
Description
Hi,
I'm carrying out some longitudinal analysis and have two questions about the longitudinal weights. For context: I'm analysing employment transitions using consecutive wave pairs (e.g., job characteristics at Wave 4 predicting economic (in)activity at Wave 5; job characteristics at Wave 6 predicting economic (in)activity at Wave 7, etc.). At this initial stage I'm going to analyse each pair independently rather than as a continuous panel, partly to boost sample size.
Firstly, for each pair (4-5, 6-7, 8-9, 10-11, 12-13), should I use the longitudinal weight for that specific wave combination (e.g., indscub_lw* for pairs involving waves 2-5, and indscui_lw for later pairs)? The weighting documentation might be read as implying that I should use one consistent weight throughout, but as I'm analysing the pairs separately I don't think that would make sense. Instead, I propose to use a different weight for each pair, renaming the appropriate weight for each wave pair to a consistent combined name so that I can pass it to the survey design.
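To make the proposed renaming concrete, here is a minimal sketch in Python. The wave-letter prefixes on the weight names (e.g. `e_indscub_lw`), the identifier column `pidp`, the combined name `pair_lw`, and the data rows are all assumptions for illustration, not values taken from the data or the weighting documentation.

```python
# Hypothetical mapping from each wave pair to the longitudinal weight taken
# from the later wave of the pair. The prefixed names are assumed.
PAIR_WEIGHT = {
    (4, 5): "e_indscub_lw",
    (6, 7): "g_indscui_lw",
    (8, 9): "i_indscui_lw",
    (10, 11): "k_indscui_lw",
    (12, 13): "m_indscui_lw",
}


def add_combined_weight(rows, pair, combined_name="pair_lw"):
    """Return copies of rows with the pair's weight copied to combined_name."""
    source = PAIR_WEIGHT[pair]
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        new_row["wave_pair"] = f"{pair[0]}-{pair[1]}"
        new_row[combined_name] = row[source]
        out.append(new_row)
    return out


# Made-up rows, for illustration only.
pair_45 = [{"pidp": 1, "e_indscub_lw": 1.2}, {"pidp": 2, "e_indscub_lw": 0.0}]
pair_67 = [{"pidp": 1, "g_indscui_lw": 0.9}]

# Each pair now exposes its weight under the same column name.
stacked = add_combined_weight(pair_45, (4, 5)) + add_combined_weight(pair_67, (6, 7))
```

Each pair can still be analysed on its own by filtering on `wave_pair`; the shared column name only means the same survey design specification can be reused for every pair.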
Secondly, I notice there are quite a lot of zero-weighted individuals in my wave pairs. When I filter to those in paid work at the even-numbered wave who responded to both that wave and the following one, about 40% of cases have a zero weight in each pair. This contrasts with a considerably smaller number of zero-weighted cases for the cross-sectional version of the weight (i.e. indscub/ui_xw). Is this correct? I wanted to clarify this before proceeding, as that seems like a lot of cases to lose, so perhaps it would be more appropriate for me to create my own bespoke weight so as not to lose them, though I'm unsure whether that process would be too complex and time-consuming.
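For reference, the zero-weight check described above can be sketched as follows. This is an illustrative sketch only: the column names `lw` and `xw` and the rows are made up, and the filtering to people in paid work who responded at both waves is assumed to have been done already.

```python
def zero_weight_share(rows, weight_col):
    """Fraction of rows whose weight is exactly zero; missing weights are skipped."""
    weights = [r[weight_col] for r in rows if r.get(weight_col) is not None]
    if not weights:
        return float("nan")
    return sum(1 for w in weights if w == 0) / len(weights)


# Made-up rows standing in for one already-filtered wave pair.
sample = [
    {"pidp": 1, "lw": 0.0, "xw": 0.8},
    {"pidp": 2, "lw": 1.1, "xw": 1.0},
    {"pidp": 3, "lw": 0.0, "xw": 0.0},
    {"pidp": 4, "lw": 0.7, "xw": 0.9},
    {"pidp": 5, "lw": 1.3, "xw": 1.2},
]

lw_share = zero_weight_share(sample, "lw")  # longitudinal weight
xw_share = zero_weight_share(sample, "xw")  # cross-sectional weight
print(f"zero-weighted: longitudinal {lw_share:.0%}, cross-sectional {xw_share:.0%}")
```

Running the same function on the longitudinal and cross-sectional weights for each pair gives the comparison behind the 40% figure.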
Best wishes,
Tom
- Note I'm also using self-completion health questionnaire data for some analysis, so am using the self-completion weight. I don't think this is the cause of the large number of zero-weighted values.