Support #1632

Correct weighting for mental health

Added by Joe Lillis about 1 year ago. Updated 9 months ago.




Hello all,

Thanks for taking the time to read my post.

I've recently carried out an analysis on adolescent mental health, across three waves.

The study design is as follows: information on 16-21 year olds at Wave 6 (n=1,748) from indresp.dta, and again at Wave 9.

Covariates on previous mental health and bullying were included from waves 1, 3 and 5 (n=1,073, or 59% of original sample) of the youth survey.

The outcome measure was GHQ-12 scores at waves 6 and 9. I have included the PDF of the paper as is for further information.

My question is, what weight to use? Do USoc weights account for attrition by mental health (GHQ-12)?

Happy to give further detail if needed!

All the best,


660052997_SHSM025.pdf (946 KB) Please keep private as this is unpublished work, which does not yet include USoc weights. Thanks. Joe Lillis, 01/17/2022 02:22 PM

Updated by Olena Kaminska about 1 year ago


Thank you for your question. The set of instruments you use is unusual and we don't have a specific weight for your analysis. A possibly suboptimal weight would be i_psnenus_lw, but in this situation I do recommend creating your own tailored weights. Please request the training material for this from usersupport. Depending on how you analyse your data, you may also be interested in our advice for pooled analysis (again, request this material from usersupport).

Hope this helps,


Updated by Understanding Society User Support Team about 1 year ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 50
  • Private changed from Yes to No

Updated by Joe Lillis about 1 year ago

Thanks Olena,

The USoc email team responded, asking me to make this request here:

I need the training manual for creating tailored weights, as well as the manual for pooled analysis.

Thanks again,



Updated by Understanding Society User Support Team about 1 year ago

Yes, we asked you to continue to post your questions here. Thanks for doing that.

But to clarify, you do need to email us to request the training material for creating tailored weights, as we need your email address to send you the material (that is what Olena had suggested). As you have already written to us, we will email you the material.


Updated by Joe Lillis about 1 year ago

Thanks very much for your help,

Will come back to this with further questions re. tailored weights if required.

All the best,


Updated by Understanding Society User Support Team about 1 year ago

  • % Done changed from 50 to 80

Updated by Joe Lillis about 1 year ago

Hi Olena,

I've been having a look at the data today and playing around with creating a tailored weight. To be clear again, this is how it looks:

1,748 individuals between age of 16-21 (wave 6) selected for study. (All 16-21 year olds).

1,708 have information on GHQ score at wave 6.

However, I am also including baseline mental health scores from the youth surveys (SDQ scores across 2 variables, emotional problems (ypsdqes_dv) and peer relation problems (ypsdqpp_dv)), collected at waves 1, 3 and 5.

Of the 1,748 (all 16-21 year olds at wave 6), 1,073 have data on both SDQ measures. I have a variable called 'cca' which codes those with data on SDQ scores (and all other covariates) as 1, and those missing SDQ scores as 0. So cca = 0 (missing) or 1 (included).
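For readers following along, a complete-case indicator like 'cca' could be built roughly as below. This is only a minimal sketch in Python, not the code used in the analysis: the records, the 'pidp' identifier and the helper names are hypothetical, the variable names are written without wave prefixes as in the post, and it assumes missing values appear either as None or as negative codes.

```python
# Hypothetical example records, one dict per person.
# Assumption: a value is missing if it is None or a negative code.
REQUIRED = ["ypsdqes_dv", "ypsdqpp_dv"]  # SDQ variables named in the post


def is_observed(value):
    """Return True if the value is a valid (non-missing) score."""
    return value is not None and value >= 0


def complete_case_flag(person, required):
    """Return 1 if every required variable is observed, otherwise 0."""
    return 1 if all(is_observed(person.get(var)) for var in required) else 0


people = [
    {"pidp": 1, "ypsdqes_dv": 3, "ypsdqpp_dv": 2},
    {"pidp": 2, "ypsdqes_dv": -9, "ypsdqpp_dv": 1},    # negative code: missing
    {"pidp": 3, "ypsdqes_dv": 0, "ypsdqpp_dv": None},  # a score of 0 is valid, None is not
    {"pidp": 4, "ypsdqes_dv": 5, "ypsdqpp_dv": 4},
]

for person in people:
    person["cca"] = complete_case_flag(person, REQUIRED)

cca_flags = [person["cca"] for person in people]
```

In a real analysis the list of required variables would also include the other covariates in the model.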

So my question is, firstly, can I use i_psnenus_lw as a baseweight? Or would I use a weight from wave 1, where I gather data on youth SDQ scores?

It seems I need a baseweight to calculate from to account for UKHLS anyway.

Thanks again for taking the time to read and help out.



Updated by Olena Kaminska about 1 year ago


Have you listened to the full training material yet? Some of the answers will be there as well as an example for the UKHLS data that would be very useful for you.



Updated by Joe Lillis about 1 year ago

Good afternoon Olena,

I watched the videos yesterday, but have spent today going over them again and trying to wrap my head round the tailored weights. I think that my particular design is quite unusual(?), and doesn't necessarily relate directly to your examples (although they are still helpful in trying to understand).

If I attempt to capture non-response across all selected waves of study (youth waves 1, 3, 5 and adult waves 6 and 9) as the training suggests, I end up with <10% of my sample responding (n=143 out of a total 1,708).

This is because I have individuals moving from the youth questionnaire to the adult questionnaire, so it doesn't make sense to get a response (0/1) at all youth waves. So in terms of generating a resp variable, I'm quite stuck! Bear in mind that the youth data is not the outcome variable - it is a covariate associated with the outcome.

Similarly, I have been using a_ythscus_xw as a base weight - as it is technically the first time point at which some of my sample is captured. Again, I'm not sure this is correct despite being the lowest level of analysis. Because my outcome measure is GHQ-score at wave 9, and adjusting for GHQ-score at wave 6, I think I may be able to use the wave 6 weight (e.g. f_indinus_lw).

In some ways I feel I'm making progress in generally understanding, but with this particular study I'm getting a bit confused!

Thanks again,


Updated by Olena Kaminska about 1 year ago


Thank you for the details. I suggest you start with either i_psnenus_lw or i_indinus_lw as your base weights (assuming your last wave of analysis is wave 9). This will help you as mortality has already been taken into account - so use this as a start_wgt.
Then simply follow the example, excluding mortality. Your response is 1 for those who are in your model and 0 for everyone else. When you weight your nonresponse model it will automatically condition on the correct people - those who have a positive start_wgt.

You should be fine going from there. Don't worry if your nonresponse is large - your model should correct for this. If you start with i_psnenus_lw your predictors can come from any indall files before and including wave i. If you start with i_indinus_lw your predictors can come from any indall or indresp files.

Starting with a_ythscus_xw wouldn't be wrong but would be more complicated and less efficient in terms of predictors.
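To make the steps above concrete, here is a rough sketch in Python of the general idea: start from a base weight, flag who is in the model, estimate response among those with a positive start_wgt, and inflate respondents' weights by the inverse of the estimated response rate. Please note the assumptions: this uses a simple weighting-class (cell) adjustment as a stand-in for the nonresponse model, it is not the procedure from the training material, and the data, the 'sex' predictor and the function name are hypothetical.

```python
from collections import defaultdict


def tailored_weights(rows, cell_key):
    """Weighting-class adjustment (a stand-in for a nonresponse model).

    rows: list of dicts with 'start_wgt', 'resp' (0/1) and the cell variable.
    Returns one tailored weight per row; nonrespondents and rows with a
    zero start_wgt get 0.
    """
    total = defaultdict(float)       # weighted count of eligible people per cell
    responding = defaultdict(float)  # weighted count of respondents per cell
    for r in rows:
        if r["start_wgt"] > 0:       # condition on a positive start weight
            total[r[cell_key]] += r["start_wgt"]
            if r["resp"] == 1:
                responding[r[cell_key]] += r["start_wgt"]

    weights = []
    for r in rows:
        if r["start_wgt"] > 0 and r["resp"] == 1:
            rate = responding[r[cell_key]] / total[r[cell_key]]
            weights.append(r["start_wgt"] / rate)
        else:
            weights.append(0.0)
    return weights


# Hypothetical data: start_wgt plays the role of i_psnenus_lw or i_indinus_lw,
# resp is 1 for people in the analysis model and 0 for everyone else.
rows = [
    {"start_wgt": 1.0, "resp": 1, "sex": "f"},
    {"start_wgt": 1.0, "resp": 0, "sex": "f"},
    {"start_wgt": 2.0, "resp": 1, "sex": "m"},
    {"start_wgt": 1.0, "resp": 1, "sex": "m"},
    {"start_wgt": 1.0, "resp": 0, "sex": "m"},
    {"start_wgt": 0.0, "resp": 1, "sex": "m"},  # zero base weight: excluded
]

new_wgt = tailored_weights(rows, "sex")
```

In this toy example the tailored weights of the respondents add up to the same total as the start weights of everyone eligible, which is the property a nonresponse adjustment of this kind is meant to preserve within each cell.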

Hope this helps,


Updated by Understanding Society User Support Team 9 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 80 to 100
