Support #258

High number of 0 weights

Added by Jan Eichhorn about 10 years ago. Updated about 10 years ago.

Dear support team,

I am conducting an analysis in which I want to compare mean life satisfaction (n_sclfsato) between particular groups across waves 2 and 3. The selected respondents are all classified as unemployed in wave 3 (c_jbstat). I compare the results for different groups depending on their economic activity in wave 2 (c_ff_jbstat). "Don't know" responses etc. are excluded.

When I run my analyses unweighted I get just over 2000 valid respondents (2026) who are unemployed in wave 3 and for whom I have data on economic activity from wave 2. When taking into account my key variables (b_sclfsato from wave 2 and c_sclfsato from wave 3) I still retain 1669 respondents for whom data are available at both waves.

However, when I now apply weights the number of respondents drops substantially. As I am dealing with data from both waves I assume I need to use a longitudinal weight. As the life-satisfaction questions stem from the self-completion section I assume I should use the respective weight, i.e. c_indscus_lw. When I do this, however, the estimated number of cases stands at 1237. I wanted to check whether this is actually accurate (i.e. a reflection that the unemployed are oversampled in the dataset) or whether there might be some mistake in my thinking here. If it is the former, could you advise me of an appropriate way forward? An over-representation of the unemployed would not be a problem for my analysis, as I am only focussing on them. Would there be a possibility to adjust the weights for other factors without changing the figures so substantially? The change has a massive impact on the results, including the direction of the means comparisons.

Thank you for your advice.


Updated by Jan Eichhorn about 10 years ago

Dear support team - just a short note: the title is misleading (I first misread the problem), so please ignore it.
Thank you and best,


Updated by Redmine Admin about 10 years ago

As for #259, we would like a bit of time to get the best possible answer for you.


Updated by Olena Kaminska about 10 years ago


Thank you for your question. The drop in cases you observe occurs because with c_indscus_lw you lose the cases from the BHPS part of the sample, which only joined the study at wave 2. If your analysis includes only wave 2 and wave 3, then we recommend you use the weight for the combined BHPS+UKHLS samples (c_indscUB_lw). If you use waves 1 to 3, however, you will have to use the c_indscUS_lw weight.
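To see the effect of the weight choice on your own extract, a check along the following lines may help. This is only a minimal sketch in plain Python: the records are invented toy values, the "sample" field is a hypothetical label added for readability, and the weight names are written in lowercase on the assumption that this is how they appear in the data files. It counts the cases that carry a positive weight under each weight and compares the resulting weighted means.

```python
# Toy records only: every value below is invented for illustration.
# Variable names follow the thread; "sample" is a hypothetical helper field.
rows = [
    {"sample": "UKHLS", "c_sclfsato": 5, "c_indscus_lw": 1.2, "c_indscub_lw": 1.1},
    {"sample": "UKHLS", "c_sclfsato": 3, "c_indscus_lw": 0.8, "c_indscub_lw": 0.7},
    {"sample": "BHPS",  "c_sclfsato": 6, "c_indscus_lw": 0.0, "c_indscub_lw": 1.5},
    {"sample": "BHPS",  "c_sclfsato": 2, "c_indscus_lw": 0.0, "c_indscub_lw": 0.9},
]


def weight_summary(rows, weight):
    """Return (number of cases with a positive weight, sum of those weights)."""
    positive = [r for r in rows if r[weight] > 0]
    return len(positive), sum(r[weight] for r in positive)


def weighted_mean(rows, value, weight):
    """Weighted mean of `value`; zero-weight cases contribute nothing."""
    total_weight = sum(r[weight] for r in rows)
    return sum(r[value] * r[weight] for r in rows) / total_weight


for w in ("c_indscus_lw", "c_indscub_lw"):
    n_cases, sum_w = weight_summary(rows, w)
    mean = weighted_mean(rows, "c_sclfsato", w)
    print(f"{w}: cases={n_cases}, sum of weights={sum_w:.2f}, mean={mean:.3f}")
```

In this toy example the BHPS rows have a zero weight under c_indscus_lw and therefore drop out of the weighted analysis, while all four rows contribute under c_indscub_lw. If the same pattern appears in the real data, the loss of cases comes from the weight choice rather than from oversampling of the unemployed.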

Hope this helps,


Updated by Redmine Admin about 10 years ago

  • Target version set to X M
  • % Done changed from 0 to 50

Updated by Redmine Admin about 10 years ago

  • Status changed from New to Closed
  • % Done changed from 50 to 100
