Support #2281

open

Weights for longitudinal analysis - high zero-weighted numbers; and selecting weights

Added by Thomas Stephens 2 months ago. Updated 24 days ago.

Status: Feedback
Priority: Normal
Category: Weights
Start date: 09/18/2025
% Done: 90%


Description

Hi,

I'm carrying out some longitudinal analysis and have two questions about the longitudinal weights. For context: I'm analysing employment transitions using consecutive wave pairs (e.g., job characteristics at Wave 4 predicting economic (in)activity at Wave 5; job characteristics at Wave 6 predicting economic (in)activity at Wave 7, etc.). At this initial stage I plan to analyse each pair independently rather than as a continuous panel, partly to boost sample size.

Firstly, for each pair (4-5, 6-7, 8-9, 10-11, 12-13), should I use the longitudinal weight for that specific wave combination (e.g., indscub_lw* for waves involving 2-5, and indscui_lw for later pairs)? The weighting documentation might be read as implying that I should use one consistent weight throughout, but as I'm analysing each pair separately I don't think that would make sense. Instead, I propose to use the weight appropriate to each pair, renaming it to a common variable name so that I can pass it to the survey design.
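
As a minimal sketch of the renaming step described above (in Python, purely for illustration): each wave pair is mapped to one weight column, which is then copied into a common column. The wave-letter prefixes on the column names, the names `PAIR_WEIGHTS`, `add_pair_weight`, `pair_lw` and `person_id`, and the toy records are all assumptions made for this example, not taken from the documentation.

```python
# Hypothetical mapping from (predictor wave, outcome wave) to the weight
# column used for that pair. The wave-letter prefixes are an assumption.
PAIR_WEIGHTS = {
    (4, 5): "e_indscub_lw",
    (6, 7): "g_indscui_lw",
    (8, 9): "i_indscui_lw",
    (10, 11): "k_indscui_lw",
    (12, 13): "m_indscui_lw",
}


def add_pair_weight(rows, pair, target="pair_lw"):
    """Return copies of the rows with the pair-specific weight under `target`."""
    source = PAIR_WEIGHTS[pair]
    return [{**row, target: row[source]} for row in rows]


# Toy records for the 4-5 pair (made-up values, not real data).
rows_4_5 = [
    {"person_id": 1, "e_indscub_lw": 0.0},
    {"person_id": 2, "e_indscub_lw": 1.3},
]
weighted_4_5 = add_pair_weight(rows_4_5, (4, 5))
```

Each pair's data set then carries the same `pair_lw` column, so a single survey design specification can be reused across pairs.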

Secondly, I notice there are quite a lot of zero-weighted individuals in my wave pairs. When I filter to those in paid work at the even-numbered wave who respond to both that wave and the wave after, about 40% of cases in each pair appear to have a zero weight. This contrasts with a considerably smaller number of zero-weighted cases for the cross-sectional version of the weight (i.e. indscub/ui_xw). Is this expected? I wanted to clarify this before proceeding, as that seems like a lot of cases to lose. Perhaps it would be more appropriate for me to create my own bespoke weight so as not to lose them, though I'm unsure whether that process would be too complex and time-consuming.
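
The comparison above amounts to computing the share of exactly-zero weights in the filtered sample, once for the longitudinal weight and once for the cross-sectional one. A minimal Python sketch, assuming hypothetical column names (`pair_lw`, `pair_xw`) and made-up toy values rather than real data:

```python
def zero_weight_share(rows, weight):
    """Fraction of rows whose weight is exactly zero; missing weights are skipped."""
    values = [row[weight] for row in rows if row.get(weight) is not None]
    if not values:
        return float("nan")
    return sum(1 for value in values if value == 0) / len(values)


# Toy sample already filtered to people in paid work who responded at both
# waves of the pair (made-up values for illustration only).
sample = [
    {"pair_lw": 0.0, "pair_xw": 0.9},
    {"pair_lw": 1.2, "pair_xw": 1.1},
    {"pair_lw": 0.0, "pair_xw": 0.0},
    {"pair_lw": 0.8, "pair_xw": 1.4},
]

lw_share = zero_weight_share(sample, "pair_lw")
xw_share = zero_weight_share(sample, "pair_xw")
```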

Best wishes,

Tom

  • Note I'm also using self-completion health questionnaire data for some analysis, so am using the self-completion weight. I don't think this is the cause of the large number of zero-weighted values.