Support #1247

Share of EU-born respondents with UK citizenship only

Added by Marina Fernandez Reino over 4 years ago. Updated over 3 years ago.

I am writing to ask about the unusually high share of EU-born respondents reporting only British citizenship (that is, no dual UK-EU nationality) in waves a_ to e_ (around 30%). In wave h_ this share is 8%, which makes more sense. I am aware that respondents holding a UK passport in wave x are not asked about their citizenship again in wave x + 1 or in subsequent waves.

My objective is to identify the citizenship(s) of wave h_ respondents, including those with dual nationality. That means I need to take into account the answers given at previous waves by those respondents who are not asked the citizenship question in wave h_ (h_citzn1==-8).

I’ve merged the citizenship variables in indresp from waves a_ to h_ and created a file named citizen.dta. My new citizenship variable uses the wave h_ values, and falls back on the wave a_ values only for respondents who are not asked the question in wave h_. The final variable shows an unusually high share of EU-born respondents with UK citizenship only. This high share cannot be accounted for by naturalisations. Since 1990, there have been 250,000 naturalisations of EU citizens, which is about 8% of the population who arrived since that year.

I’ve attached a do file with the code I've used to calculate this (it starts at line 23).


Citizenship (14.8 KB) Citizenship Do file Marina Fernandez Reino, 09/24/2019 05:12 PM

Updated by Stephanie Auty over 4 years ago

  • Status changed from New to In Progress
  • Assignee changed from Alita Nandi to Stephanie Auty
  • % Done changed from 0 to 10
  • Private changed from Yes to No

Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.

Best wishes,
Stephanie Auty - Understanding Society User Support Officer


Updated by Stephanie Auty over 4 years ago

  • Category changed from Data analysis to Weights
  • Assignee changed from Stephanie Auty to Olena Kaminska
  • % Done changed from 10 to 50

Dear Marina,

Apologies for the delay in getting back to you.

Firstly, in calculating your final variable you will need to use data from all waves, not just Waves 1 and 8, because people who joined the survey between these waves will have answered the citizenship question in the wave they joined.
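As a rough illustration of the point about using all waves, here is a minimal Python sketch (not the attached do file, and not official Understanding Society code). It takes, for each respondent, the answer from the most recent wave in which the question was actually answered. The function name, the dictionary layout and the citizenship codes are hypothetical; the only thing taken from this thread is that -8 marks respondents who were not asked. Treating every negative code as "no answer" is an assumption, and the sketch handles a single citizenship variable only, so dual nationality would need the same logic applied to each citizenship variable.

```python
# Hypothetical sketch: pick the latest wave with a valid citizenship answer.
# Wave letters a..h correspond to waves 1..8.
WAVES = "abcdefgh"


def latest_valid_citizenship(answers):
    """Return (wave, code) for the most recent wave with a valid answer.

    `answers` maps a wave letter to the citizenship code recorded in that
    wave. Waves in which the person did not take part are simply absent.
    ASSUMPTION: any negative code (such as -8, not asked) means no answer.
    Returns None when no wave holds a valid answer.
    """
    for wave in reversed(WAVES):
        code = answers.get(wave)
        if code is not None and code >= 0:
            return wave, code
    return None


# Made-up respondents; the codes 1 and 2 are placeholders, not real labels.
original_sample = {"a": 1, "b": -8, "c": -8, "h": -8}
joined_at_wave_f = {"f": 2, "g": -8, "h": -8}

print(latest_valid_citizenship(original_sample))
print(latest_valid_citizenship(joined_at_wave_f))
```

The second respondent shows why waves 1 and 8 alone are not enough: their only valid answer sits in wave f, so a variable built from a_ and h_ only would leave them without a citizenship value.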

In your tabulation at the end of your syntax file, you are not weighting the data. In Wave 6 we introduced the immigrant and ethnic minority boost sample, so that is why you are seeing a large jump in types of answers. I will assign this issue to our survey statistician next to discuss which weights would make the results comparable.

Best wishes,


Updated by Olena Kaminska over 4 years ago


I agree with Stephanie that you need to use weights to account correctly for, among other things, the immigrant and ethnic minority boost at wave 6 as well as the ethnic minority boost that started at wave 1. Let me know if you need help with selecting the correct weights for your analysis.

If you haven't used weights in your analysis, this most likely explains the differences in your estimates. Please let us know if you still observe these differences once you use our weights.

Just to point out one important thing: UKHLS is a longitudinal study. Importantly, it does not represent recent immigrants in the years between boosts. In other words, it represents the cross-sectional population including all immigrants in 1991 (GB only), 2001 (NI only), 2009-10 (UK) and 2014-15 (UK). In the years between 2009-10 and 2014-15, for example, only immigrants who move in with people who were in the UK before 2009-10 are represented, and even these are represented in lower proportions than in the cross-sectional population. We recognize this, and it was one of the reasons we boosted our sample at wave 6, specifically concentrating on recent immigrants among other groups. Wave 6 onwards will also have a large sample size for immigrants, which will provide more precise estimates. So if you compare estimates relating to immigrants before and after wave 6, it is worth checking the confidence intervals (weighted, of course).
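To make the suggestion about weighted confidence intervals concrete, here is a small Python sketch of a weighted share with an approximate 95% interval. It is a simplification under stated assumptions: the function name and the toy data are hypothetical, and the interval uses Kish's effective sample size, which reflects unequal weights but ignores the clustering and stratification of the sample design. For real estimates, design-based survey estimation (for example Stata's svy commands) would be the better tool.

```python
import math


def weighted_share(flags, weights, z=1.96):
    """Weighted proportion with an approximate confidence interval.

    `flags` holds 1/0 (or True/False) per respondent, e.g. "UK citizenship
    only"; `weights` holds the matching survey weights.
    ASSUMPTION: the standard error uses Kish's effective sample size,
    (sum of w)^2 / sum of w^2, and ignores clustering and stratification.
    """
    total_w = sum(weights)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive number")
    share = sum(w for f, w in zip(flags, weights) if f) / total_w
    n_eff = total_w ** 2 / sum(w * w for w in weights)
    se = math.sqrt(share * (1 - share) / n_eff)
    return share, max(0.0, share - z * se), min(1.0, share + z * se)


# Toy data, made up for illustration only.
uk_only = [1, 0, 0, 1]
weights = [0.5, 2.0, 1.5, 1.0]

share, low, high = weighted_share(uk_only, weights)
print(f"weighted share = {share:.2f}, approx. 95% CI = ({low:.2f}, {high:.2f})")
```

In this toy example the unweighted share is 0.50 while the weighted share is 0.30, which is the kind of shift that weighting can produce when some groups are over-represented in the sample.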

Hope this helps,


Updated by Olena Kaminska over 4 years ago


Apologies. I just realized that you very helpfully included your syntax. I had a look at it and noticed that you are using longitudinal weights. I feel that cross-sectional weights may be more appropriate for you. Can you confirm that the only reason you use longitudinal weights is that you created the citizenship variable using all the previous waves? If so, and you are not analysing people over time, you can use xw weights. This is because, to be in your analysis, people didn't have to participate in every wave from 1 to 8, but only had to have a value from at least one wave. If this is correct, try using xw weights.



Updated by Marina Fernandez Reino over 4 years ago

Hi Stephanie and Olena

Thank you very much for your answers. I confirm that the only reason I was using longitudinal weights is that I've created the citizenship variables using all previous waves. I will use xw weights then. Thanks!



Updated by Understanding Society User Support Team over 3 years ago

  • Status changed from In Progress to Resolved
  • Assignee deleted (Olena Kaminska)
  • % Done changed from 50 to 100
