I am using the xwavedat file to match stable characteristics of individuals in the main survey, but I noticed that on some of the background characteristics there is a large proportion of 'missing' responses. I am using xwavedat from the 2009-2014 download package.

I am especially interested in the parental characteristics when the respondent was aged 14, and ethnicity.
I know that the parental background variables are only asked once, presumably when the respondent joins the adult questionnaire? There still seem to be a lot of missings even after accounting for this, e.g. overall, maedqf has 57.1% missing responses. When you disaggregate this by age, there is a high proportion of missings within those who are aged 16-21 at some point between Wave 2 and Wave 5 (birth years 1989 to 1999), with 76% missing responses for maedqf. Are you able to help me with why these variables have such high proportions of missing data?

Ethnicity also has a relatively high proportion of missings in xwavedat, with 45% of those in the birth years 1989-1999 missing data on this variable. Is this to be updated?




Hi Meg,

This file includes every person who has ever been part of a sampled and enumerated household, which means it includes children and non-responding adults. The questions you are referring to were asked only of adult respondents, so we will not have this information for children or adult non-respondents.

Best wishes,


Hi Alita,

Thanks for your quick response.

I understand that this question would not be asked of children 11-15. However, I am looking mostly at the young adults module, ages 16-21, and I think there are still a lot of missings even among those adults who are respondents. When I include parental characteristics, it reduces my sample size by about 2/3 and these are all individuals who have responded to other questions in the survey, but apparently have not responded to questions about parental characteristics. Do you know why this would be?




Hi Megan,

1. Parental education was asked of the first 6 months sample (that is, a quarter of the UKHLS sample) in Wave 1; the remaining sample was asked in Wave 2.

2. These questions are part of the initial conditions section and are generally asked when a person is interviewed as an adult for the first time.
So, it is missing for children and those adults who were never interviewed.
Also, those who gave a youth interview when they were 10-15 are treated as having been interviewed before, so when they turn 16 they are not asked the initial conditions questions. In most cases, you can recover some of this information from what they reported during the youth interview (ethnic group is asked in alternate waves) or from their parents' own responses (education: hiqual_dv).

3. Additionally, these questions are not part of the proxy questionnaire and so will be missing for those who were interviewed by proxy.

  • Proxy Interview: When an individual is unable to give an interview but their spouse or adult child volunteers to provide some information on their behalf, a proxy interview is conducted. This is a shorter questionnaire that mostly consists of factual information a partner or adult child is expected to know.
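
If it helps, the reasons above can be turned into a rough tally of why maedqf is missing for each person. This is only a sketch in plain Python, not an official derivation: the flags full_adult_interview, youth_interview and proxy_only are hypothetical and would need to be built from the individual wave files, the sample records are made up, and it assumes that negative values are missing-value codes.

```python
from collections import Counter

# Made-up example records, one per person in an xwavedat-like file.
# Only pidp and maedqf are real variable names; the three flags are
# hypothetical and would have to be derived from the wave-level files.
people = [
    {"pidp": 1, "maedqf": 3,  "full_adult_interview": True,  "youth_interview": False, "proxy_only": False},
    {"pidp": 2, "maedqf": -9, "full_adult_interview": False, "youth_interview": True,  "proxy_only": False},
    {"pidp": 3, "maedqf": -9, "full_adult_interview": True,  "youth_interview": True,  "proxy_only": False},
    {"pidp": 4, "maedqf": -9, "full_adult_interview": False, "youth_interview": False, "proxy_only": True},
    {"pidp": 5, "maedqf": -1, "full_adult_interview": True,  "youth_interview": False, "proxy_only": False},
]


def missing_reason(person):
    """Return a label for why maedqf is missing, or 'observed' if it is not."""
    value = person["maedqf"]
    # Assumption: negative values are missing-value codes.
    if value is not None and value >= 0:
        return "observed"
    if person["proxy_only"]:
        # Initial conditions are not part of the proxy questionnaire.
        return "proxy interview only"
    if not person["full_adult_interview"]:
        # Children and adults who never responded.
        return "never interviewed as an adult"
    if person["youth_interview"]:
        # Treated as previously interviewed, so not asked at age 16.
        return "youth interview before first adult interview"
    return "other (e.g. refusal or don't know)"


tally = Counter(missing_reason(p) for p in people)
for reason, count in tally.items():
    print(f"{reason}: {count}")
```

Anyone left in the "other" group after this kind of breakdown would be the cases worth looking at more closely.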

Best wishes,


