Support #1305

Weighting for youth questionnaire

Added by Charlotte Edney over 3 years ago. Updated almost 2 years ago.

I have a question about sample weighting. I'm using a sub-sample of children who have experienced their parents separating or divorcing, and this information is only available to me for waves 3, 5 and 7 of USoc. I want to run a regression where the outcome of interest is the child's response to a behavioural outcome question in the youth questionnaire. I am including the mother's characteristics as control variables, e.g. whether she receives child support, education level, marital status, etc. I also merge this with the chmain module, where the mother responds about each individual child identified by the childpno. My analysis does not exploit the panel aspect of the data, i.e. I treat each wave as a separate cross-section.

I am thinking of using the cross-sectional weight from the youth file for each wave, W_ythscub_xw. Please could you advise me whether this sounds right?



Updated by Alita Nandi over 3 years ago

  • Assignee set to Olena Kaminska
  • Private changed from Yes to No

Updated by Olena Kaminska over 3 years ago


Thank you for your question. I just want to clarify a few details of your analysis before answering the question.
Does the information about the mothers (whether she receives child support, education levels, marital status, etc.) and the information from chmain come from the same wave as the information from the youth questionnaire? If it does, and it is available for all young people in the youth questionnaire, then the W_ythscub_xw or W_ythscui_xw weight is the correct one to use.

If the information comes only from responding mothers, the W_ythscub_xw or W_ythscui_xw weight is still reasonable, but it is worth checking how many of the young people have missing information from their mothers. If the proportion is high, you may want to apply an additional nonresponse correction.
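As a rough illustration of that check, the sketch below computes the unweighted and weighted share of youth records with no mother information. The record layout and the field names (`weight`, `mother_missing`) are hypothetical stand-ins for the merged youth file and its cross-sectional weight; they are not actual Understanding Society variable names, and the numbers are made up.

```python
def missing_share(records, weight_key="weight", flag_key="mother_missing"):
    """Return (unweighted, weighted) share of records lacking mother data."""
    n = len(records)
    total_weight = sum(r[weight_key] for r in records)
    if n == 0 or total_weight <= 0:
        raise ValueError("need at least one record with positive total weight")
    # Unweighted: simple proportion of flagged records.
    unweighted = sum(1 for r in records if r[flag_key]) / n
    # Weighted: share of the total weight carried by flagged records.
    weighted = sum(r[weight_key] for r in records if r[flag_key]) / total_weight
    return unweighted, weighted


# Hypothetical merged youth records for one wave (illustrative values only).
youth = [
    {"pidp": 1, "weight": 1.0, "mother_missing": False},
    {"pidp": 2, "weight": 0.5, "mother_missing": True},
    {"pidp": 3, "weight": 1.5, "mother_missing": False},
    {"pidp": 4, "weight": 1.0, "mother_missing": True},
]

unweighted, weighted = missing_share(youth)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

Comparing the two figures gives a first impression of whether the young people with missing mother information differ in weight from the rest, which is one sign that a further correction may matter.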

Hope this helps,


Updated by Charlotte Edney over 3 years ago

Hi Olena,

Thanks for the prompt and helpful reply.

To clarify, the information about mothers does come from the same wave as the information from the youth questionnaires, but only if the mother responds to the questions (I have merged the indall, indresp, youth and chmain files). There is a considerable amount of non-response, particularly in the chmain files, so I suppose the additional non-response correction would be important. However, I'm not at all familiar with it. Could you suggest which variables I would need to use, or recommend where I could find out more?



Updated by Alita Nandi over 3 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 50

Updated by Olena Kaminska over 3 years ago


The best variables for nonresponse correction are the ones related both to your y variable and to nonresponse. They also need to be available for respondents and nonrespondents alike.
If only one or two variables have a lot of missing information, and there is otherwise little extra missingness beyond what our weights already account for, you may want to consider imputing values for these variables.
Otherwise you can correct for all the missingness (people not responding and missingness in answers) through tailored weighting.
I will be happy to share with you an example of how one can create their own weight. If you are interested, could you email your request to this email please: and mention the reference number of this discussion: num1305.
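For readers unfamiliar with the idea, one common form of tailored weighting is a weighting-class adjustment: within each class defined by variables known for respondents and nonrespondents, the base weight of each respondent is scaled up so that respondents carry the total weight of the whole class. The sketch below is only an illustration of that general technique, not the example referred to above and not necessarily the method the Understanding Society team would recommend. All field names (`pidp`, `weight_class`, `weight`, `responded`) and values are hypothetical.

```python
from collections import defaultdict


def adjust_weights(records, class_key="weight_class",
                   weight_key="weight", resp_key="responded"):
    """Weighting-class nonresponse adjustment.

    Returns {pidp: adjusted weight} for respondents only, where
    adjusted = base weight * (class total weight / class respondent weight).
    """
    total = defaultdict(float)      # sum of base weights, everyone in class
    responding = defaultdict(float)  # sum of base weights, respondents only
    for r in records:
        total[r[class_key]] += r[weight_key]
        if r[resp_key]:
            responding[r[class_key]] += r[weight_key]

    adjusted = {}
    for r in records:
        if not r[resp_key]:
            continue  # nonrespondents get no analysis weight
        cls = r[class_key]
        if responding[cls] <= 0:
            adjusted[r["pidp"]] = 0.0  # only zero-weight respondents in class
        else:
            adjusted[r["pidp"]] = r[weight_key] * total[cls] / responding[cls]
    # Note: a class with no respondents at all loses its weight entirely;
    # in practice such classes would need to be merged with similar ones.
    return adjusted


# Hypothetical data: "responded" means the mother's information is available.
sample = [
    {"pidp": 1, "weight_class": "A", "weight": 1.0, "responded": True},
    {"pidp": 2, "weight_class": "A", "weight": 1.0, "responded": False},
    {"pidp": 3, "weight_class": "B", "weight": 2.0, "responded": True},
    {"pidp": 4, "weight_class": "B", "weight": 1.0, "responded": True},
    {"pidp": 5, "weight_class": "B", "weight": 1.0, "responded": False},
]

new_weights = adjust_weights(sample)
print(new_weights)
```

In this toy data the adjusted weights of the three respondents sum to the total base weight of all five cases, which is the property the adjustment is designed to preserve. How well it reduces bias depends on choosing classes related to both the outcome and nonresponse, as described above.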

Finally, you can still use the suboptimal weight that was mentioned earlier in the meantime for your analysis.

Thank you,


Updated by Stephanie Auty over 3 years ago

  • Status changed from In Progress to Feedback
  • Assignee changed from Olena Kaminska to Charlotte Edney
  • % Done changed from 50 to 80

Updated by Understanding Society User Support Team almost 2 years ago

  • Status changed from Feedback to Resolved
  • Assignee deleted (Charlotte Edney)
  • % Done changed from 80 to 100
