Support #1652

Should using longitudinal weights lead to a balanced panel?

Added by Lucas Auer about 2 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: High
Category: Weights
Start date: 02/05/2022
% Done: 100%


Description

Dear Olena,

I have a question concerning the correct use of longitudinal weights. I want to perform a longitudinal analysis in Stata, starting in wave 6 and ending in wave 11, using information from the individual adult self-completion interview. I understand that I should therefore be using the weight k_indscui_lw.

When reading in my data, I create a panel data set in long format, removing the wave prefix from the variables and introducing a wave variable called UKHLSwave instead. I then create a new weight variable that carries each individual's wave 11 value of k_indscui_lw onto all of that individual's observations (for later use in regression analysis), in the following way:
gen weight11_temp = indscui_lw if UKHLSwave == 11
bysort pidp: egen weight11 = max(weight11_temp)

My understanding from the weighting guidance (which I found very helpful – thanks a lot!) was that only individuals who gave a full interview at every wave from 6 through 11 should have a non-zero value of k_indscui_lw. I therefore expected my panel to be balanced for waves 6 through 11 once I condition on my new variable weight11 being non-zero and non-missing. However, when I do:
gen nonzeroweight = (weight11 > 0 & weight11 != .)
tab nonzeroweight UKHLSwave if UKHLSwave >= 6
I get a strictly increasing number of observations with nonzeroweight equal to 1 from wave 6 through wave 11, rather than the same number at every wave.
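
A per-person count of waves may make the imbalance easier to inspect than the cross-tabulation alone. The following is only a sketch on invented toy data: the three pidp values and their weights are made up for illustration and are not UKHLS records, and nwaves is a hypothetical helper variable. It assumes that a row in the long file means the person took part at that wave, which may not hold if the file also contains proxy or partial interviews.

```stata
* Toy long-format panel (invented values, not UKHLS data).
clear
input long pidp byte UKHLSwave double indscui_lw
1  6 1.1
1  7 1.1
1  8 1.1
1  9 1.1
1 10 1.1
1 11 1.2
2  9 0
2 10 0
2 11 0.8
3  6 0.9
3  7 0.9
3  8 0.9
3  9 0.9
3 10 0.9
3 11 0
end

* Copy each person's wave 11 weight onto all of their rows.
gen double weight11_temp = indscui_lw if UKHLSwave == 11
bysort pidp: egen double weight11 = max(weight11_temp)
gen byte nonzeroweight = weight11 > 0 & !missing(weight11)

* Count the rows each person has in waves 6 through 11.
bysort pidp: egen byte nwaves = total(inrange(UKHLSwave, 6, 11))

* In a balanced panel, everyone with nonzeroweight == 1 would have nwaves == 6.
tab nwaves if nonzeroweight == 1 & UKHLSwave == 11

* List the people who have a positive wave 11 weight but fewer than six rows.
list pidp nwaves weight11 if nonzeroweight == 1 & UKHLSwave == 11 & nwaves < 6
```

Run against the real long file (without the toy input block), the final list would show which individuals carry a positive wave 11 weight despite having fewer than six rows in waves 6 through 11, which could help narrow down where the extra observations in the later waves come from.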

Could you please check whether you can replicate my findings, and advise where I am doing or misunderstanding something? Alternatively, if this pattern is to be expected, any insight into why it arises would be greatly appreciated.

Many thanks,
Lucas
