I know some people have already asked about this but I would like to clarify the best way to weight the data. I have a fairly basic understanding of weighting so I hope my question makes sense.

I am working on a project in which the sample is anyone who has taken part in waves a, b or c of the COVID study. Most of our outcomes look at changes between baseline (Jan-Feb) and any point across the first three COVID waves, e.g. did the participant lose their job or lose hours at any of waves a, b or c compared to baseline. They only need to have taken part in one of the first three waves to be included. We have also linked wave 8 and 9 responses from the main Understanding Society survey for additional demographic information, e.g. marital status, industry worked in pre-COVID and household earnings (pre-COVID). We used wave 9 information in the first instance and then used wave 8 for those who didn’t provide information in wave 9.

With the cross-sectional weights, am I right in thinking that I could use the latest weight provided, e.g. if a participant took part in waves a and c, I could use the wave c cross-sectional weight? Basically, could I create a weight variable that uses the latest weight provided for that individual? Some of our outcomes are based only on questions asked in waves a-b (e.g. most of the financial module questions). If I took the approach above, in some instances I could end up using the cross-sectional weight provided with wave c (if that was the latest wave an individual had participated in) even though the questions were asked in waves a-b. Would I need to create a different weighting variable for each of my questions? I am not sure if I am overcomplicating this, and perhaps it would be OK not to weight at all?
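
To make the "latest weight provided" idea concrete, here is a minimal Python sketch of the mechanics only. The participant IDs, weight values and the wave labels used as dictionary keys are made-up placeholders, not the actual variable names or data from the study, and the sketch says nothing about whether this approach is statistically appropriate.

```python
from typing import Dict, Optional, Sequence

# Wave order, earliest to latest (labels are placeholders for this sketch).
WAVES = ["a", "b", "c"]


def latest_weight(
    weights: Dict[str, Optional[float]],
    waves: Sequence[str] = WAVES,
) -> Optional[float]:
    """Return the weight from the latest wave in `waves` that has a usable
    (non-missing, positive) weight, or None if there is none."""
    for wave in reversed(waves):
        w = weights.get(wave)
        if w is not None and w > 0:
            return w
    return None


# Hypothetical participants: one cross-sectional weight per wave,
# with None meaning the person did not take part in that wave.
participants = {
    "p1": {"a": 1.2, "b": None, "c": 0.8},
    "p2": {"a": 0.9, "b": 1.1, "c": None},
    "p3": {"a": None, "b": None, "c": 1.5},
}

# One weight per person, taken from the latest wave they took part in.
overall = {pid: latest_weight(w) for pid, w in participants.items()}

# For outcomes only asked in waves a-b, restrict the search to those waves.
financial = {
    pid: latest_weight(w, waves=["a", "b"]) for pid, w in participants.items()
}
```

In this made-up example, p1 would get the wave c weight for the overall outcomes but the wave a weight for the a-b outcomes, and p3 would have no weight at all for the a-b outcomes, which is exactly the situation I am unsure how to handle.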

I would appreciate your guidance.

Best wishes,

