Support #965
Sample weights
100%
Description
Dear USoc Team,
This is slightly more urgent than my previous question (which I have now resolved): which probability weights should I use?
I am conducting a cross-sectional analysis on wave G, investigating how alcohol consumption affects individual earnings.
My sample is restricted to 25-65 year olds. One model uses information from wave G only; another merges in information from wave A so that religious variables can be used as instruments.
I've been using pweight = g_indscui_lw.
However, I am not sure this is correct, and it substantially reduces my sample size in the regressions.
I'd be grateful for any urgent advice (dissertation due on Monday)!
Thanks!
Updated by jason morgan over 6 years ago
- Copied from Support #951: Deriving alcohol consumption units added
Updated by Stephanie Auty over 6 years ago
- Status changed from New to In Progress
- Assignee changed from Stephanie Auty to Olena Kaminska
- Private changed from Yes to No
Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.
Best wishes,
Stephanie Auty - Understanding Society User Support Officer
Updated by Stephanie Auty over 6 years ago
- % Done changed from 10 to 20
Dear Jason,
I'm not sure if Olena will see this in time to answer you before the weekend, so I wanted to provide a link to the User Guide which has information for helping you to select a weight in tables 31-40: https://www.understandingsociety.ac.uk/sites/default/files/downloads/documentation/mainstage/user-guides/mainstage-waves-1-7-user-guide.pdf
You will see that weights for cross sectional analysis end in _xw.
Best wishes,
Stephanie Auty - Understanding Society User Support Officer
Updated by jason morgan over 6 years ago
Thank you so much!!!
(This left me with just 2 options, and both produced almost identical results, so I am going with g_indinub_xw.)
Just one brief last question (no worries at all if you don't know): whenever I use sample weights, I lose about 13% of the observations when I run regressions. Do you know if this is normal? (I've been googling this a lot, but to no avail.)
Thanks again for your really quick reply!
Updated by Olena Kaminska over 6 years ago
Jason,
Thanks for your question. The best weights for you are as follows. For cross-sectional analysis, use g_indinui_xw. It gives a larger sample size than g_indinub_xw because it also incorporates the refreshment in wave 6; a large part of the 0-weights in g_indinub_xw is due to omitting this refreshment, but your analysis with it will still be correct. For longitudinal analysis (where you use waves A and G in the same regression), use g_indinus_lw; it will have many 0-weights because it excludes the BHPS and IEMB subsamples. Note that using an xw weight for the longitudinal analysis is incorrect.
It is correct for cases to be dropped when you use weights. This is specific to longitudinal studies, and largely specific to UKHLS, reflecting its very complex sample design.
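As a rough illustration of why observations disappear, the sketch below assumes that weighted estimation excludes any case whose weight is zero, so the share of zero weights is the share of cases lost. The records, their values, and the helper names are all hypothetical, not UKHLS data or an official procedure. The two weight columns stand in for a cross-sectional (xw) and a longitudinal (lw) weight.

```python
# Hypothetical toy records: (earnings, xw-style weight, lw-style weight).
# All values are invented for illustration; they are not UKHLS data.
records = [
    (21000.0, 1.2, 0.9),
    (34000.0, 0.8, 0.0),
    (27500.0, 1.0, 1.1),
    (40000.0, 0.0, 0.0),
    (18000.0, 1.5, 1.4),
    (52000.0, 0.7, 0.0),
    (31000.0, 1.1, 1.0),
    (26000.0, 0.9, 0.8),
]


def usable(rows, weight_index):
    """Keep only rows whose weight is strictly positive."""
    return [row for row in rows if row[weight_index] > 0]


def share_dropped(rows, weight_index):
    """Fraction of rows lost because their weight is zero."""
    return 1 - len(usable(rows, weight_index)) / len(rows)


def weighted_mean(rows, weight_index):
    """Weighted mean of earnings over the rows with positive weight."""
    kept = usable(rows, weight_index)
    total_weight = sum(row[weight_index] for row in kept)
    return sum(row[0] * row[weight_index] for row in kept) / total_weight


XW, LW = 1, 2  # column positions of the two hypothetical weights

print(f"dropped with xw-style weight: {share_dropped(records, XW):.1%}")
print(f"dropped with lw-style weight: {share_dropped(records, LW):.1%}")
print(f"weighted mean earnings (xw):  {weighted_mean(records, XW):.2f}")
```

Zero-weight cases contribute nothing to a weighted estimate, so dropping them leaves the estimate unchanged and only reduces the reported number of observations.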
Hope this helps and best of luck with your dissertation,
Olena
Updated by Stephanie Auty over 6 years ago
- Status changed from In Progress to Feedback
- Assignee changed from Olena Kaminska to jason morgan
- % Done changed from 20 to 70
Updated by Stephanie Auty over 6 years ago
- Status changed from Feedback to Resolved
- % Done changed from 70 to 100
Updated by Understanding Society User Support Team almost 4 years ago
- Assignee deleted (jason morgan)