longitudinal weighting adjustment for nonresponse since the last wave
I am not an expert in sampling weights. Please correct me if I am wrong. Thanks.
I am looking at the technical details of weighting adjustments. In the user guide (2018 version) p76 188.8.131.52, it says
"In the model for proxy and main interviews, covariates were taken from the Wave (n-1) proxy interview (or the equivalent items from the main interview), household grid and household questionnaire."
I suppose that the covariates should be some variables from the files. Could you please provide more info about this? For example, what covariates are taken into account, and why these covariates are selected?
Updated by Alita Nandi over 3 years ago
- Status changed from New to In Progress
- Assignee set to Olena Kaminska
- % Done changed from 0 to 10
- Private changed from Yes to No
The Understanding Society team is looking into it and we will get back to you as soon as we can.
Understanding Society User Support Team
Updated by Olena Kaminska over 3 years ago
I assume you are interested in the more general question of which covariates are used in a specific weight. Please be aware that a weight in wave n is a composite of models, starting from wave 1 and going through all the models between wave 1 and wave n, usually one or two (sometimes more) models per wave. The largest nonresponse occurred at wave 1, at the household level (so this accounts for the largest share of the total nonresponse that you are interested in). We used a number of different predictors at wave 1; notably, they differed for England, Wales and Scotland, as the predictors were taken from the Census and other available geo-linked statistics. After this we use predictors from the questionnaires at each wave, as described in the technical report.
In the models we try to include as many predictors as possible, as long as they are available for all of the relevant population (so not those that are asked of a subpopulation only). We avoid multicollinearity. We then use a statistical approach to select the predictors that are actually used in a model: a backward stepwise logistic regression with p=0.01 or p=0.05, depending on the total sample size in the model. This means that only variables that add to the prediction of nonresponse are used in the model.
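For readers who want to see the selection procedure described above in concrete terms, here is a minimal sketch in Python using only NumPy. It is an illustration under stated assumptions, not the Understanding Society team's actual code: the data are simulated, the predictor names (moved, age, noise) are hypothetical, the p-values are Wald p-values, and the conversion of the fitted propensity into an adjustment factor (its inverse, multiplied onto the previous wave's weight) is a common approach that is assumed here rather than confirmed in this thread.

```python
import math
import numpy as np

def fit_logit(X, y, iters=50):
    """Fit a logistic regression by Newton-Raphson; return coefficients and Wald p-values."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        step = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    z = beta / np.sqrt(np.diag(cov))
    # Two-sided p-value from the standard normal distribution.
    pvals = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return beta, pvals

def backward_select(X, y, names, alpha=0.05):
    """Drop the least significant predictor until all remaining ones have p <= alpha."""
    keep = list(range(X.shape[1]))  # column 0 is the intercept and is never dropped
    while True:
        beta, pvals = fit_logit(X[:, keep], y)
        if len(keep) == 1:
            break
        worst = max(range(1, len(keep)), key=lambda i: pvals[i])
        if pvals[worst] <= alpha:
            break
        del keep[worst]
    return [names[i] for i in keep], beta, keep

# Simulated data (hypothetical predictors, not real UKHLS variables).
rng = np.random.default_rng(0)
n = 2000
moved = rng.integers(0, 2, n).astype(float)  # moved since the previous wave
age = rng.normal(0.0, 1.0, n)                # standardised age
noise = rng.normal(0.0, 1.0, n)              # unrelated to response by construction
true_logit = 1.0 - 1.2 * moved + 0.5 * age
responded = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

X = np.column_stack([np.ones(n), moved, age, noise])
names = ["intercept", "moved", "age", "noise"]
selected, beta, keep = backward_select(X, responded, names, alpha=0.05)

# Assumed adjustment: inverse of the predicted response propensity,
# multiplied onto the previous wave's weight (set to 1 here for illustration).
propensity = 1.0 / (1.0 + np.exp(-X[:, keep] @ beta))
adjustment = 1.0 / propensity
prior_weight = np.ones(n)
new_weight = prior_weight * adjustment

print("Predictors retained:", selected)
```

In practice the resulting weight would be carried forward only for respondents, and repeating the same step wave after wave is what makes the wave n weight a composite of many models.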
We are aware of the interest in the specific predictors in our models. But unfortunately because of the size of the task and the number of such models we are unable to provide a full guide to the predictors used in the models yet.
I hope this helps, but please do not hesitate to ask me further questions.
Updated by Louise Luo over 3 years ago
Thank you very much for your reply. It is a very professional answer.
I am not sure if you are familiar with BHPS. Its user manual (vol. A) lists some variables that are used for nonresponse adjustment across waves:
"Whether moved from the previous wave address; individual characteristics such as age, sex, employment status, income total and composition, race, level of organisational membership, educational qualifications, etc. and household characteristics such as region, tenure, number of cars and ownership of consumer durables" (for details see A5-9)
I was wondering whether UKHLS uses similar techniques as well.
Many thanks for your time. Sorry for bothering you.