Support #1141
Weighting
100%
Description
Hi,
I am currently working on a project on financial security. I am looking at the variable _finnow in every wave and want to calculate a mean that is representative of the UK population for each wave. However, I am struggling to work out which weight is the correct one to use. So far I have used _xewght for wave 1 of the BHPS and _indpxui_xw for wave 8 of the UKHLS, but I am unsure whether this is correct. Please could you advise which weight is most appropriate for each wave in order to gain an output representative of the UK population at the time that wave was carried out?
Thanks
Updated by Stephanie Auty almost 6 years ago
- Private changed from Yes to No
Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.
Best wishes,
Stephanie Auty - Understanding Society User Support Officer
Updated by Olena Kaminska almost 6 years ago
Dear Freya,
Thank you for your question. I am not sure how far back you are looking, but overall you are right to use the most inclusive cross-sectional weight. You can use the us weight for wave 1 of UKHLS, the ub weight for waves 2 to 5, and the ui weight from wave 6 onwards. If you are looking back at BHPS, you can use cross-sectional weights there as well. Please be aware, however, of differences in the population that the data represent. For example, the us weight represents people who have lived in the UK continuously since 2007/2008 (excluding immigrants who have arrived since), whereas the ui weight represents people who have lived in the UK continuously since 2012/2013. You may therefore see a jump in your estimates between wave 5 and wave 6 (which use the ub and ui weights respectively). If so, it may be better to use the us or ub weight consistently; this will exclude more recent immigrants. If you don't see any jump, then immigrants do not influence your estimates much, and you may continue to use the us, ub and ui weights as appropriate.
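As a rough illustration of the advice above, a weighted mean for a single wave might be computed along these lines. This is only a sketch: the file name h_indresp and the design variables h_psu and h_strata are assumptions that do not appear in this thread, and the recoding of negative values assumes they are missing-data codes. For other waves, the wave prefix and the us/ub/ui part of the weight name would need to be swapped accordingly.

```stata
* Sketch only: h_indresp, h_psu and h_strata are assumed names, not confirmed in this thread
use h_indresp, clear

* Assumption: negative codes are missing-data codes, so drop them from the mean
replace h_finnow = . if h_finnow < 0

* Declare the survey design with the wave 8 cross-sectional weight
svyset h_psu [pweight = h_indpxui_xw], strata(h_strata) singleunit(scaled)

* Weighted mean with design-based standard error
svy: mean h_finnow
```

Comparing the wave 5 and wave 6 estimates produced this way is one way to check for the jump described above.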
Hope this helps,
Olena
Updated by Freya Cook almost 6 years ago
Dear Olena
Many thanks for your very helpful email. I have a follow-on question about how to use the chosen weight in the logistic regression analyses I am carrying out in Stata. I am currently specifying the weight with the syntax:
logistic h_finnow ib1.h_age_dv [ pw= h_indpxui_xw ] , base vce(cluster h_hidp)
This allows for the weighting and clustering by household (h_hidp). However, I do not think it correctly estimates the standard errors. I am wondering if I need to use the svy commands instead? My understanding is that this will provide an appropriate (and larger) estimate of the standard errors. Thus I am proposing to use:
svyset h_ppid [pw=h_indpxui_xw] || h_hidp
svy: logistic h_finnow ib1.h_age_dv , base
Many thanks
Freya Cook
Updated by Olena Kaminska almost 6 years ago
Freya,
Yes, the svy commands are better suited to your purpose. The following form of the svyset command may work better, as it also includes strata in your analysis:
svyset psu [pweight=pw], strata(strata)
And then use logistic regression with svy:
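For illustration, the two steps might be put together for wave 8 roughly as follows. This is a sketch under stated assumptions rather than a definitive specification: h_psu and h_strata are assumed names for the design variables, fin_diff and agegrp are hypothetical derived variables, and the coding of h_finnow (4 and 5 taken to mean finding it quite or very difficult) and the age cut points are assumptions made only so that the example has a binary outcome and a categorical predictor.

```stata
* Sketch only: h_psu, h_strata, fin_diff and agegrp are assumed or hypothetical names
use h_indresp, clear

* Binary outcome for logistic: 1 if h_finnow is 4 or 5 (assumed coding), missing if negative
generate byte fin_diff = inlist(h_finnow, 4, 5) if h_finnow > 0

* Hypothetical age bands coded 0-4; ages outside 16-119 become missing
egen agegrp = cut(h_age_dv), at(16 30 45 60 75 120) icodes

* Declare PSU, strata and the cross-sectional weight once
svyset h_psu [pweight = h_indpxui_xw], strata(h_strata) singleunit(scaled)

* Odds ratios with design-based standard errors
svy: logistic fin_diff i.agegrp, baselevels
```

Once the design is declared with svyset, the same svy: prefix can be applied to other estimation commands without repeating the weight or clustering options.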
Does this help? There are more examples in UKHLS training course materials.
Olena
Updated by Stephanie Auty almost 6 years ago
- Status changed from New to Feedback
- Assignee changed from Olena Kaminska to Freya Cook
- % Done changed from 0 to 70
Updated by Understanding Society User Support Team over 2 years ago
- Status changed from Feedback to Resolved
- % Done changed from 70 to 100