High number of 0 weights
Dear support team,
I am conducting an analysis in which I want to compare life-satisfaction means (n_sclfsato) between particular groups from waves 2 and 3. The selected respondents are all classified as unemployed in wave 3 (c_jbstat). I compare the results for different groups depending on their economic activity in wave 2 (c_ff_jbstat). "Don't know" responses etc. are excluded.
When I compute my analyses unweighted I get just over 2000 valid respondents (2026) who are unemployed in wave 3 and for whom I have data on economic activity from wave 2. When taking into account my key variables (b_sclfsato from wave 2 and c_sclfsato from wave 3) I still retain 1669 respondents for whom data is available at both waves.
However, when I apply weights the number of respondents drops substantially. As I am dealing with data from both waves, I assume I need to use a longitudinal weight. As the life-satisfaction questions stem from the self-completion section, I assume I should use the respective weight, i.e. c_indscus_lw. When I do this, however, the estimated number of cases stands at 1237. I wanted to check whether this is actually accurate (i.e. a reflection that the unemployed are oversampled in the dataset) or whether there might be some mistake in my thinking here. If it is the former, could you advise me on an appropriate way forward? An over-representation of the unemployed would not be a problem for my analysis, as I am focussing only on them. Would it be possible to adjust the weights for other factors without changing the figures so substantially? The change has a massive impact on the results, including the direction of the means comparisons.
Thank you for your advice.
Updated by Olena Kaminska over 6 years ago
Thank you for your question. The drop in cases you observe occurs because with c_indscus_lw you lose the cases from the BHPS part of the sample. If your analysis includes only wave 2 and wave 3, we recommend using the weight for the combined BHPS+UKHLS samples (c_indscUB_lw). If you use waves 1 to 3, however, you will have to use the c_indscUS_lw weight.
Hope this helps,
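
For readers who want to check this on their own extract, here is a minimal Python sketch of the comparison described in the reply. It is an illustration only: the records are invented toy data, and the keys `w_ukhls_only` and `w_combined` are hypothetical stand-ins for weights such as c_indscus_lw and c_indscub_lw. It assumes that cases dropped by a weight show up with a weight of zero, as the thread title suggests.

```python
# Toy records: all values are invented for illustration, not real survey data.
# "w_ukhls_only" is a hypothetical stand-in for a weight like c_indscus_lw,
# "w_combined" for a weight like c_indscub_lw.
respondents = [
    {"origin": "UKHLS", "lifesat": 5, "w_ukhls_only": 1.2, "w_combined": 1.1},
    {"origin": "UKHLS", "lifesat": 3, "w_ukhls_only": 0.8, "w_combined": 0.7},
    {"origin": "BHPS", "lifesat": 6, "w_ukhls_only": 0.0, "w_combined": 0.9},
    {"origin": "BHPS", "lifesat": 4, "w_ukhls_only": 0.0, "w_combined": 1.3},
    {"origin": "UKHLS", "lifesat": 2, "w_ukhls_only": 1.0, "w_combined": 1.0},
]


def weight_summary(rows, weight_key, value_key="lifesat"):
    """Count zero-weight cases and compute the weighted mean for one weight."""
    # Only cases with a positive weight contribute to a weighted estimate.
    positive = [r for r in rows if r[weight_key] > 0]
    total_weight = sum(r[weight_key] for r in positive)
    weighted_sum = sum(r[weight_key] * r[value_key] for r in positive)
    return {
        "n_total": len(rows),
        "n_positive": len(positive),
        "n_zero": len(rows) - len(positive),
        "sum_weights": total_weight,
        "weighted_mean": weighted_sum / total_weight if total_weight > 0 else None,
    }


if __name__ == "__main__":
    for key in ("w_ukhls_only", "w_combined"):
        print(key, weight_summary(respondents, key))
```

Comparing `n_zero` across the two weights, and tabulating it by sample origin, shows whether the lost cases are concentrated in one part of the sample rather than spread across the unemployed group.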