Support #258
closed
High number of 0 weights
100%
Description
Dear support team,
I am conducting an analysis in which I want to compare life-satisfaction means (n_sclfsato) between particular groups from waves 2 and 3. The selected respondents are all classified as unemployed in wave 3 (c_jbstat). I compare the results for different groups depending on their economic activity in wave 2 (c_ff_jbstat). "Don't know" responses etc. are excluded.
When I compute my analyses unweighted, I get just over 2000 valid respondents (2026) who are unemployed in wave 3 and for whom I have data on economic activity from wave 2. When taking into account my key variables (b_sclfsato from wave 2 and c_sclfsato from wave 3), I still retain 1669 respondents for whom data are available at both time points.
However, when I apply weights, the number of respondents drops substantially. As I am dealing with data from both waves, I assume I need to use longitudinal weights. As the life-satisfaction questions stem from the self-completion section, I assume I should use the respective weight, i.e. c_indscus_lw. When I do this, however, the estimated number of cases stands at 1237. I wanted to check whether this is actually accurate (i.e. a reflection that the unemployed are oversampled in the dataset) or whether there is some mistake in my thinking. If it is the former, could you advise me of an appropriate way forward? An over-representation of the unemployed would not be a problem for my analysis, as I am focussing only on them. Would it be possible to adjust the weights for other factors without changing the figures so substantially? The change has a massive impact on the results, including the direction of the means comparisons.
Thank you for your advice.
Updated by Jan Eichhorn over 10 years ago
Dear support team - just a short note: the title is misleading (I first misread the problem, so please ignore the title).
Thank you and best,
Jan
Updated by Redmine Admin over 10 years ago
As for #259, we would like a bit of time to get the best possible answer for you.
Jakob
Updated by Olena Kaminska over 10 years ago
Jan,
Thank you for your question. The drop in cases you observe occurs because with c_indscus_lw you lose the cases from the BHPS part of the sample. If your analysis includes only waves 2 and 3, we recommend that you use the weight for the combined BHPS+UKHLS samples (c_indscUB_lw). If you use waves 1 to 3, you will have to use the c_indscUS_lw weight though.
Hope this helps,
Olena
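For readers who want to see the effect of the weight choice, here is a minimal Python sketch, not the support team's own procedure. It assumes the weight variables are stored under lowercase names (c_indscus_lw, c_indscub_lw) and that BHPS-origin cases carry a zero or missing value on the UKHLS-only weight. The helper names and the three toy rows are invented purely for illustration and are not real survey data.

```python
from math import fsum


def pick_longitudinal_weight(first_wave):
    """Pick the wave 3 self-completion longitudinal weight named in the reply.

    Assumption: analyses starting at wave 1 need the UKHLS-only weight,
    while analyses starting at wave 2 can use the BHPS+UKHLS combined weight.
    """
    return "c_indscus_lw" if first_wave == 1 else "c_indscub_lw"


def weighted_mean(rows, value_key, weight_key):
    """Return (weighted mean, number of cases used).

    Cases with a zero or missing weight, or a missing value, are skipped.
    """
    used = [
        (r[value_key], r[weight_key])
        for r in rows
        if r.get(weight_key) and r[weight_key] > 0 and r.get(value_key) is not None
    ]
    total_weight = fsum(w for _, w in used)
    if total_weight == 0:
        raise ValueError("no cases with a positive weight")
    return fsum(v * w for v, w in used) / total_weight, len(used)


# Hypothetical toy rows (invented values, not survey data).
rows = [
    {"c_sclfsato": 5, "c_indscus_lw": 1.2, "c_indscub_lw": 1.0},
    # Assumed BHPS-origin case: zero on the UKHLS-only weight.
    {"c_sclfsato": 3, "c_indscus_lw": 0.0, "c_indscub_lw": 0.5},
    {"c_sclfsato": 6, "c_indscus_lw": 0.8, "c_indscub_lw": 1.5},
]

for first_wave in (1, 2):
    weight = pick_longitudinal_weight(first_wave)
    mean, n_used = weighted_mean(rows, "c_sclfsato", weight)
    print(f"first wave {first_wave}: {weight}, cases used = {n_used}, mean = {mean:.2f}")
```

In the toy data the second case drops out under c_indscus_lw but is retained under c_indscub_lw, which mirrors the kind of loss of cases described in the reply. For real analyses, the survey design (strata and clustering) would also need to be taken into account, which this sketch does not do.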
Updated by Redmine Admin over 10 years ago
- Target version set to X M
- % Done changed from 0 to 50
Updated by Redmine Admin over 10 years ago
- Status changed from New to Closed
- % Done changed from 50 to 100