Support #258

High number of 0 weights

Added by Jan Eichhorn over 9 years ago. Updated over 9 years ago.

Dear support team,

I am conducting an analysis in which I want to compare life-satisfaction means (n_sclfsato) between particular groups from waves 2 and 3. The selected respondents are all classified as unemployed in wave 3 (c_jbstat). I compare the results for different groups depending on their economic activity in wave 2 (c_ff_jbstat). Responses such as "don't know" are excluded.

When I run my analyses unweighted, I get just over 2000 valid respondents (2026) who are unemployed in wave 3 and for whom I have data on economic activity from wave 2. When I take into account my key variables (b_sclfsato from wave 2 and c_sclfsato from wave 3), I still retain 1669 respondents for whom data is available at both waves.

However, when I apply weights the number of respondents drops substantially. As I am dealing with data from both waves, I assume I need to use a longitudinal weight. As the life-satisfaction questions stem from the self-completion section, I assume I should use the respective weight, i.e. c_indscus_lw. When I do this, however, the estimated number of cases stands at 1237. I wanted to check whether this is actually accurate (i.e. a reflection that the unemployed are oversampled in the dataset) or whether there might be some mistake in my thinking. If it is the former, could you advise me of an appropriate way forward? An over-representation of the unemployed would not be a problem for my analysis, as I am focussing only on them. Would it be possible to adjust the weights for other factors without changing the figures so substantially? The change has a massive impact on the results, including the direction of the means comparisons.

Thank you for your advice.


Updated by Jan Eichhorn over 9 years ago

Dear support team - just a short note: the title is misleading (I first misread the problem), so please ignore it.
Thank you and best,


Updated by Redmine Admin over 9 years ago

As with #259, we would like a bit of time to get the best possible answer for you.


Updated by Olena Kaminska over 9 years ago


Thank you for your question. The drop in cases you observe occurs because with c_indscus_lw you lose the cases from the BHPS part of the sample. If your analysis includes only wave 2 and wave 3, we recommend that you use the weight for the combined BHPS+UKHLS samples (c_indscUB_lw). If you use waves 1 to 3, however, you will have to use the c_indscUS_lw weight.
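As a rough illustration of the point above, the following Python sketch shows how the number of usable cases and a weighted mean can differ between two weights. All records are invented toy values, not Understanding Society data. The sketch assumes that BHPS sample members carry a zero value on the UKHLS-only weight, and the helper names (`usable_cases`, `weighted_mean`) are hypothetical.

```python
# Toy records: every value below is invented for illustration only.
# Columns: (b_sclfsato, c_sclfsato, c_indscus_lw, c_indscub_lw)
rows = [
    (5, 4, 1.2, 1.1),  # UKHLS sample member: positive on both weights
    (6, 6, 0.8, 0.9),
    (3, 2, 0.0, 1.3),  # assumed BHPS member: zero on the UKHLS-only weight
    (4, 5, 0.0, 0.7),
    (2, 3, 1.0, 1.0),
]


def usable_cases(rows, weight_index):
    """Count rows whose weight in the given column is strictly positive."""
    return sum(1 for r in rows if r[weight_index] > 0)


def weighted_mean(rows, value_index, weight_index):
    """Weighted mean of one column; zero-weight rows contribute nothing."""
    total_w = sum(r[weight_index] for r in rows)
    if total_w == 0:
        raise ValueError("all weights are zero")
    return sum(r[value_index] * r[weight_index] for r in rows) / total_w


# Compare the two candidate longitudinal weights on wave 3 life satisfaction.
n_us = usable_cases(rows, 2)        # cases kept under c_indscus_lw
n_ub = usable_cases(rows, 3)        # cases kept under c_indscub_lw
mean_us = weighted_mean(rows, 1, 2)
mean_ub = weighted_mean(rows, 1, 3)
print(n_us, n_ub)
```

In this toy data, the rows with a zero UKHLS-only weight drop out of the first estimate but are retained under the combined weight, which is the kind of difference in case numbers described above. In a real analysis the standard errors would also need to account for the survey design, which this sketch does not attempt.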

Hope this helps,


Updated by Redmine Admin over 9 years ago

  • Target version set to X M
  • % Done changed from 0 to 50

Updated by Redmine Admin over 9 years ago

  • Status changed from New to Closed
  • % Done changed from 50 to 100
