Weights and accounting for individual clustering
I am running a logistic regression to estimate the association between transitioning from unemployment to employment and health service use. My sample contains all individuals who were unemployed at UKHLS W7. Transition to employment is captured by their employment status at W8, and health outcomes are measured at W9.
I have created an equivalent cohort using individuals who were unemployed at W6, and pooled the two cohorts to increase my sample size. I therefore have some clustering by pidp, as individuals contribute twice to the analysis if they were unemployed at both W6 and W7.
In Stata, I am using svyset to specify the psu, cross-sectional weight (indscui_xw) and the strata. How can I also account for clustering by pidp? Normally I would use logistic regression with option vce(cluster pidp), but this is not possible when using the svy:logistic command.
Updated by Olena Kaminska over 1 year ago
You only need psu as your clustering variable as it is the highest level (pidps are nested within psu), and in a straightforward logistic regression only psu needs to be indicated.
But reading your analysis description, I wonder whether you need longitudinal weights. If you are using information from W6 to W9, make sure you use the W9 longitudinal (lw) weight.
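As a rough illustration of the two points above, the set-up might look something like the sketch below. It is not a definitive specification. It assumes the pooled file holds the design variables as psu and strata, and that the W9 longitudinal weight has been saved without its wave prefix as indscui_lw. The outcome hsuse and the exposure employed are hypothetical names. Because each pidp sits inside a single psu, declaring psu as the cluster already allows for correlation between repeated observations of the same person, so no separate cluster option for pidp is needed.

```stata
* Sketch only: all variable names are assumptions, check them against your own data.
* psu, strata  = sample design variables (assumed names in the pooled file)
* indscui_lw   = W9 longitudinal weight, wave prefix removed (assumed name)
svyset psu [pweight = indscui_lw], strata(strata) singleunit(scaled)

* hsuse (health service use, 0/1) and employed (0/1) are hypothetical variables.
* No vce(cluster pidp) is specified: pidp is nested within psu.
svy: logistic hsuse i.employed
```

The singleunit(scaled) option is one of several ways Stata can handle strata that end up with a single psu in a subsample. Whether it is the right choice depends on your data.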
Updated by Catherine Bunting over 1 year ago
Hi Olena - thanks so much for the speedy reply, that's very helpful.
Just to clarify - if I have a group of individuals and am using information about them from waves 7, 8 and 9, I should use the W9 longitudinal weight, not the W7 cross-sectional weight?