Support #1075

Weights in an unbalanced panel

Added by Liat Raz-Yurovich over 5 years ago. Updated over 5 years ago.

We have a question regarding the appropriate weights to apply when analyzing combined files from BHPS and Understanding Society (US). We aim to run a linear fixed-effects model on an unbalanced panel, meaning that we are following individuals over time, but we do not have the same number of observations (waves) for each individual, and also individuals may enter and exit at different waves. For example, we would like to include individuals who entered during different waves of BHPS (some of whom may continue to US), as well as individuals who entered in US.

We read through the correspondence in Issue #414 from 3 years ago. However, it is not clear to us that the solutions suggested in that issue can be applied in our case, because we would like to have as long a time frame as possible and as many individuals as possible (i.e. we do not want to limit ourselves to US waves, or to take only individuals who entered in BHPS).

We understand that since the panel is unbalanced, it is not appropriate to apply the longitudinal weights. Is that correct?

If not the longitudinal weights, then which weights would you suggest?

Are there any other related issues that we should consider (e.g. scaling)?

Thank you.


Updated by Stephanie Auty over 5 years ago

  • Category set to Weights
  • Status changed from New to In Progress
  • Assignee set to Olena Kaminska
  • Target version set to X M
  • Private changed from Yes to No

Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.

Best wishes,
Stephanie Auty - Understanding Society User Support Officer


Updated by Olena Kaminska over 5 years ago

Are you thinking to represent people or events?


Updated by Liat Raz-Yurovich over 5 years ago

Olena Kaminska wrote:

Are you thinking to represent people or events?

We want to represent people.


Updated by Olena Kaminska over 5 years ago

My suggestion would be to use the W_XXXXXub_lw weight from the last wave for which you have an observation for any person. So if even one person in your data has information from wave 7, you should use the wave 7 weight. This will limit whom you can include: for example, you would not be able to look at BHPS people who dropped out before UKHLS. Note that a sizable proportion of people who participated in BHPS but dropped out before UKHLS have since died or moved out of the country. You may not want to include these people in the population that you study.
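As a rough illustration of the suggestion above, here is a minimal Python sketch, assuming a long-format panel (one row per person per wave) and a separate table of longitudinal weights taken from the last wave present in the data. The column names (pidp, wave, income, lw_weight) and all values are hypothetical placeholders, not the actual variable names or data in the released files.

```python
import pandas as pd

# Hypothetical unbalanced panel: one row per person per wave.
# Column names and values are illustrative assumptions only.
panel = pd.DataFrame({
    "pidp":   [1, 1, 1, 2, 2, 3, 3],
    "wave":   [5, 6, 7, 5, 6, 6, 7],
    "income": [100.0, 110.0, 120.0, 90.0, 95.0, 200.0, 210.0],
})

# The last wave in which any person in the data is observed.
last_wave = int(panel["wave"].max())

# Hypothetical longitudinal weights from that last wave, one row per person.
# Person 2 dropped out before the last wave, so has no weight here.
last_wave_weights = pd.DataFrame({
    "pidp":      [1, 3],
    "lw_weight": [1.2, 0.8],
})

# Attach the last-wave weight to every row of each person and keep only
# people with a positive weight; everyone else leaves the analysis sample.
weighted = panel.merge(last_wave_weights, on="pidp", how="inner")
weighted = weighted[weighted["lw_weight"] > 0]
```

In this toy data person 2 is dropped, which mirrors the point that this choice limits whom you can include.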

You have to be careful about the subgroups you study. You can look at any substantive subgroup, e.g. immigrant females, or people who lost a job and got another one within 12 months. What you can't do is select subgroups based on survey design criteria, such as people who responded in waves 4 to 6. In other words, if you use weights but restrict the sample based on the waves in which people responded, your analysis will not be representative, simply because there is no such population group as 'people who responded in waves 4-6'.
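The subgroup point can be sketched in the same way. The example below is only an illustration with made-up data and hypothetical column names (female, immigrant, outcome, lw_weight): the selection uses substantive characteristics, and the weight is then applied within that subgroup. A filter on which waves a person responded to would be the kind of design-based selection to avoid.

```python
import numpy as np
import pandas as pd

# Hypothetical person-level data with a longitudinal weight attached.
people = pd.DataFrame({
    "pidp":      [1, 2, 3, 4],
    "female":    [True, True, False, True],
    "immigrant": [True, False, True, True],
    "outcome":   [10.0, 20.0, 30.0, 40.0],
    "lw_weight": [1.0, 2.0, 1.5, 3.0],
})

# Substantive subgroup: defined by characteristics of the people themselves,
# not by their pattern of survey response.
subgroup = people[people["female"] & people["immigrant"]]

# Weighted mean of the outcome within the subgroup.
weighted_mean = np.average(subgroup["outcome"], weights=subgroup["lw_weight"])
```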

Hope this helps,


Updated by Stephanie Auty over 5 years ago

  • Status changed from In Progress to Feedback
  • Assignee changed from Olena Kaminska to Liat Raz-Yurovich
  • % Done changed from 0 to 70

Updated by Stephanie Auty over 5 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 70 to 100
