Support #2110

Weights invalid stset

Added by Giovanni Greco 20 days ago. Updated 2 days ago.




Hi all,
I am analysing survival data and using the indinui_lw weights for people aged between 16 and 35. However, when I stset my data, most of my weights turn out to be invalid and Stata drops those observations (the weights are already constant within pidp). How can I solve this issue? Thank you very much in advance.


Updated by Giovanni Greco 19 days ago

I just managed to solve the invalidity issue by adding one conventional unit. Basically, I generate a new variable that equals my weight + 1, so I no longer have zeros, and stset now works. However, I am unsure whether adding that 1 unit is allowed, or whether it distorts the proportionality and function of the weights. Thank you.


Updated by Understanding Society User Support Team 16 days ago

  • Category set to Weights
  • Status changed from New to In Progress
  • Assignee changed from Understanding Society User Support Team to Olena Kaminska
  • % Done changed from 0 to 10
  • Private changed from Yes to No

Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can. We aim to respond to simple queries within 48 hours and more complex issues within 7 working days.


Updated by Olena Kaminska 10 days ago


Survival analysis is unique because it allows for nonresponse correction within the analysis (through truncation). Therefore you should provide the base weight (the weight at the start of your analysis), which should be non-zero for everyone in the analysis, and let the survival analysis account for attrition. Please tell the analysis when each person dropped out (through truncation).
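As a rough illustration of the data layout this implies, here is a minimal Python sketch (not Stata, and not an official Understanding Society procedure). It assumes a hypothetical per-person list of wave responses and a hypothetical `build_spell` helper; each person keeps one positive base weight, and dropout is recorded as the last wave observed rather than through a zero weight.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Spell:
    pidp: int          # person identifier
    exit_wave: int     # last wave at which the person is under observation
    event: bool        # True if the event was observed, False if the person dropped out first
    weight: float      # base weight from the starting wave, constant over the spell


def build_spell(pidp: int, base_weight: float, responded: List[bool],
                event_wave: Optional[int] = None) -> Spell:
    """Build one survival record from wave-by-wave response flags.

    `responded[i]` is True if the person responded at wave i + 1
    (a hypothetical layout chosen for this sketch).
    """
    if base_weight <= 0:
        raise ValueError("base weight must be positive for everyone in the analysis")

    # Observation ends at the last wave before the first nonresponse.
    last_observed = 0
    for wave, present in enumerate(responded, start=1):
        if not present:
            break
        last_observed = wave

    # The event only counts if it happened while the person was still observed.
    if event_wave is not None and event_wave <= last_observed:
        return Spell(pidp, event_wave, True, base_weight)
    return Spell(pidp, last_observed, False, base_weight)


dropout = build_spell(1, 1.5, [True, True, False, True])
failure = build_spell(2, 0.8, [True, True, True], event_wave=2)
```

In this sketch the person who stops responding after wave 2 keeps the base weight of 1.5 and simply leaves observation at wave 2.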

And yes, adding +1 to the weights is wrong and will make them unrepresentative. You can multiply weights by any (positive) number, but you can't add a constant, as this changes their relative values (which is how they work).
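A small numerical check of this point, as a sketch with made-up values and weights (none of these numbers come from the survey): multiplying every weight by the same positive constant leaves a weighted mean unchanged, while adding 1 to every weight shifts it.

```python
def weighted_mean(values, weights):
    """Weighted mean; only the relative sizes of the weights matter."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Illustrative data only.
values = [10.0, 20.0, 30.0]
weights = [0.0, 1.0, 3.0]

original = weighted_mean(values, weights)
scaled = weighted_mean(values, [w * 5 for w in weights])   # multiply by a positive constant
shifted = weighted_mean(values, [w + 1 for w in weights])  # add a constant

print(original, scaled, shifted)
```

Here the scaled weights give the same estimate as the original ones, whereas the shifted weights give the zero-weight case a real contribution and pull the estimate towards it.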

Hope this helps,


Updated by Giovanni Greco 6 days ago

Good morning Olena,
thank you very much. Since my panel is unbalanced, would it make sense to use the three available base weights (f_indinui_xw, b_indinub_xw, a_indinus_xw)? In this case, I would use, for each individual, the earliest weight available for them. So if an individual enters the panel in wave 2, I would apply b_indinub_xw, and if the individual enters the panel in wave 6, I would apply f_indinui_xw. And if I understood correctly, if an individual enters the panel in wave 7, I won't have a weight for that person and will have to drop him/her. Right?

Thank you once again,


Updated by Olena Kaminska 2 days ago


No, you can't use _xw weights split for just some of the samples as they are created to represent the population conditional on full use of samples. Theoretically you could start with issue weights, but just a warning that UKHLS design is very complex and you would need to fully understand it before you can combine it correctly.
A shortcut could be: use wave 1 BHPS xw weight (1991 from 1991), with 1999 and 2001, add GPS+EMB wave 1 xw weight for GPS and EMB samples. Make sure to post-stratify by country as distributions will be wrong. Depending on how you do it, you would also need to post-stratify by new immigrants (the proportion of those that immigrated between 1991 and 2009 and their children born in this country should be correct, and separately those between 1999 and 2009, and 2001 and 2009 should also be correct). This is a simple approach.
It would be much more complicated to add the IEMB boost, as it was not designed to be used on its own, only in combination with other samples (the wave 6 weights combine them). But if you are not interested in ethnic groups and immigrants, you can just drop this sample. Or go through the complicated route of combining them, for which I would recommend reading sample design here as a starting point:

Hope this helps,
