Citizenship UKHLS.do - Understanding Society User Support

Support #1247 » Citizenship UKHLS.do

Do file - Marina Fernandez Reino, 09/24/2019 05:12 PM

    
    // CITIZENSHIP ANALYSIS

    /* NOTE: 

    Respondents born in the UK are not asked the citizenship questions. 

    However, anyone born in the UK in or after 1983, neither of whose parents was a British citizen or settled in the UK, was NOT automatically granted UK citizenship, 

    so it is not possible to identify from these data whether such persons are UK citizens.

    Across six waves of data (and 81540 respondents), the number of individuals born in the UK in or after 1983 with 

    both parents born outside the UK is 1590. It is not possible to identify whether these 2% are UK citizens. 

    Note that if at least one of their parents had acquired UK citizenship or had become a permanent UK resident before their birth

    (there are a few other such clauses), then they would have UK citizenship.

    Also, foreign born respondents who are UK citizens are not asked again about their citizenship in subsequent waves.

    */

    cd "N:\DATA\Understanding Society\indresp files\"

    ******************************************************************************************************

    ** 1st step: I merge h_indresp_protect + xwavedat_protect + citizenship variables from previous waves: 

    ******************************************************************************************************

    foreach let in a  b  c  d  e  f  g   {

    use "`let'_indresp_protect.dta", clear

    sort pidp

    keep pidp `let'_pno `let'_citzn1 `let'_citzn2 `let'_citzn3 `let'_citzn_code `let'_citzn2_code   

    save ctz_`let', replace

    }

    use "h_indresp_protect.dta", clear

    keep pidp h_pno h_citzn1 h_citzn2 h_citzn3 h_citzn_code h_citzn2_code  ///

    h_indpxub_xw h_indinub_xw h_indscub_xw h_indpxui_xw h_indinui_xw h_indscui_xw h_ind5mus_xw h_strata h_psu   ///

    h_ind5mus_lw h_indbd91_lw h_indbdub_lw h_indin01_lw h_indin91_lw h_indinub_lw h_indinui_lw h_indinus_lw h_indns91_lw h_indnsub_lw h_indpxub_lw /// 

    h_indpxui_lw h_indpxus_lw h_indscub_lw h_indscui_lw h_indscus_lw h_xtra5minosm_dv

    sort pidp

    save ctz_h, replace

    	use "xwavedat_protect.dta", clear

    	sort pidp

    	// Merge the citizenship variables of waves a_ to g_, dropping _merge after each merge

    	foreach let in a b c d e f g {

    		merge 1:1 pidp using "ctz_`let'.dta"

    		sort pidp

    		drop _merge

    	}

    	merge 1:1 pidp using "ctz_h.dta"   // _merge from this last merge is kept in citizen.dta

    	sort pidp

    	save "citizen.dta", replace

    * This is the name of my dataset ( h_indresp_protect + xwavedat_protect + citizenship variables from previous waves)

    	use "citizen.dta", clear
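
    	// Optional sanity check, not part of the original do-file: after the 1:1 merges,

    	// pidp should uniquely identify every row. isid stops with an error if it does not.

    	isid pidp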

    ****************************************************************************************

    ** 2nd step: I generate a country of birth variable for respondents and their parents	

    ****************************************************************************************	

    	foreach cob in  plbornc_all  macob_all pacob_all {

    					gen `cob'_det=.

    					// UK born		

    						  replace `cob'_det=1 if `cob'==195 | `cob'==206 | ///

    						  `cob'==1 | `cob'== 2 | `cob'==3 | `cob'==4 

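    						   // When country of birth in plbornc_all = -9 (missing), it means that the person was born in the UK. 

    						   // This line only concerns the respondent's own variable, so repeating it on each pass of the loop is harmless. Therefore: 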

    						   replace plbornc_all_det=1 if bornuk_dv==1   

    					// Ireland

    						 replace `cob'_det=2 if `cob'==5 

    					// EU-13

    						replace `cob'_det=3 if  `cob'==6 |  `cob'==7 |  `cob'==8 |  `cob'==9 |  `cob'==353 |  ///

    						`cob'==123 | `cob'==132 | `cob'==312 | `cob'==346 | `cob'==179 | `cob'==198 | `cob'==209 | `cob'==296 | `cob'==317

    					// EU-8  EU-2  EU Other

    						replace `cob'_det =4 if  `cob'==10 |  `cob'==11 | `cob'==147 | `cob'==318 | `cob'==332 | `cob'==333 | ///

    						`cob'==173 | `cob'==176 | `cob'==269 | `cob'==259 | `cob'==252 | `cob'==192 | `cob'==223 | `cob'==177

    					// Other Europe (non EU)

    						replace `cob'_det=5 if `cob'==12 |  `cob'==347 |  `cob'==365 |  `cob'==108 | ///

    						`cob'==140 | `cob'==320 | `cob'==327 | `cob'==385 | `cob'==248 | `cob'==263 | `cob'==205 | ///

    						`cob'==278 | `cob'==225 | `cob'==281 | `cob'==131 | `cob'==279 | `cob'==118 | ///

    						`cob'==160 |  `cob'==214 | `cob'==233 | `cob'==238 

    					// MENA and Central Asia

    						replace `cob'_det=6 if `cob'==348 |  `cob'==361 |  `cob'==368 |  `cob'==384  | `cob'==110 |  ///

    						 `cob'==314 | `cob'==325 | `cob'==127 | `cob'==188 | `cob'==301 | ///

    						`cob'==256 | `cob'==230 | `cob'==231 | `cob'==241 | `cob'==243 | `cob'==253 | `cob'==249 | ///

    						`cob'==283 | `cob'==234 | `cob'==101 | `cob'== 350 | `cob'== 184 | `cob'==104 | `cob'==102 

    					// India

    						replace `cob'_det=7 if `cob'==18 | `cob'==242 // kashmir included here

    					// South Asia except India

    						replace `cob'_det=8 if `cob'==19 |  `cob'==20 |  `cob'==21 |  `cob'==288 

    					// East Asia & South East Asia

    						replace `cob'_det=9 if `cob'==17 |  `cob'==349 |  `cob'==352 |  `cob'==377 | `cob'==331 | `cob'==146 |  ///

    						`cob'==309 | `cob'==266 | `cob'==285 | `cob'==228 | `cob'==236 | `cob'==274 | `cob'==247 | `cob'==151 | `cob'==251

    						replace `cob'_det=9 if `cob'==17 // China hong kong

    					// Sub-Saharan Africa

    						  replace `cob'_det=10 if `cob'==22 |  `cob'==351 |  `cob'==355 |  `cob'==386 |  `cob'==387 | ///

    						  `cob'==23 | `cob'==24 | `cob'==25 | `cob'==103 | `cob'==105 | `cob'==106 | ///

    						  `cob'==112 | `cob'==335 | `cob'==343 | `cob'==326 | `cob'==330  |  ///

    						  `cob'==149 | `cob'==152 | `cob'==154 | `cob'==321 | `cob'==168 | `cob'==178 | `cob'==191 |  ///

    						  `cob'==264 | `cob'==265 | `cob'==284 | `cob'==286 | `cob'==193 | `cob'== 202 | `cob'==203 | ///

    						  `cob'==215 | `cob'==216  | `cob'==255 | `cob'==235 | `cob'==273 | `cob'==328 | `cob'==26 | `cob'==148 | ///

    						  `cob'==141 | `cob'==159 | `cob'==323

    					// US, Canada,  Australia, New Zealand

    						replace `cob'_det=11 if `cob'==13 |  `cob'==14 |  `cob'==15 |  `cob'==16 

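    					// Other (note: code 330 is also listed under Sub-Saharan Africa above; because this replace runs later, 330 ends up in category 12)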

    						replace `cob'_det=12 if `cob'==376 |  `cob'==382 |  `cob'==388 |  `cob'==997 |  `cob'==27 |  ///

    						`cob'==113 | `cob'==114 | `cob'==116 | `cob'==126 | `cob'==129 | `cob'==137 | `cob'==359 | ///

    						`cob'==142 | `cob'==165 | `cob'==161 | `cob'==336 | `cob'==341 | `cob'==339 | ///

    						`cob'==338 | `cob'==337 | `cob'==135 | `cob'==145 | `cob'==183 | `cob'==182  | ///

    						`cob'==187 | `cob'==189 | `cob'==330  | `cob'==275 | `cob'==303 | `cob'==174 |  `cob'==307 | ///

    						`cob'==197 | `cob'==211 | `cob'==213 | `cob'==217 | `cob'==282 | `cob'==305 | `cob'==334 | `cob'== 364 | ///

    						`cob'==133 | `cob'==344 | `cob'==291

    							}

    					rename plbornc_all_det cob_det

    					rename macob_all_det macob_det

    					rename pacob_all_det pacob_det

    					label define cob_det   ///

    					1 "UK"    ///

    					2 "Ireland"     ///

    					3 "EU-13"      ///

    					4 "EU-8, EU-2, EU Other"    ///

    					5 "Other Europe"    ///

    					6 "MENA & Central Asia"    ///

    					7 "India"     ///

    					8 "Pakistan & other South Asia"     ///

    					9 "East & Southeast Asia"     ///

    					10 "Sub-Saharan Africa"     ///

    					11 "Australia, NZ, Canada, US"     ///

    					12 "Other countries", replace

    					lab val cob_det macob_det pacob_det  cob_det		

    			// There are still 661 respondents with no information on country of birth (bornuk_dv==-9) 

    			// There are also 318 respondents who we know are not UK born, but whose country of birth is unknown (bornuk_dv==2 & cob_det==.)

    					replace cob_det=.a if bornuk_dv==-9

    					replace cob_det=.b if bornuk_dv==2 & cob_det==.

    					// Label the extended missing values with official Stata (labelmiss is not a built-in command)

    					label define cob_det .a "No info on cob" .b "Foreign born but unknown country", modify

    					replace cob_det=2 if cob_det==. & plbornc==5

    					replace cob_det=3 if cob_det==. & (plbornc==6 | plbornc==7 |  plbornc==8 | plbornc==9)

    					replace cob_det=4 if cob_det==. & (plbornc==10 | plbornc==11)

    					replace cob_det=5 if cob_det==. & plbornc==12

    					replace cob_det=7 if cob_det==. & plbornc==18

    					replace cob_det=8 if cob_det==. & (plbornc==19 | plbornc==20 | plbornc==21)

    					replace cob_det=9 if cob_det==. & plbornc==17

    					replace cob_det=10 if cob_det==. & (plbornc>=22 &  plbornc<=26)

    					replace cob_det=11 if cob_det==. & (plbornc==13 | plbornc==14 |  plbornc==15 | plbornc==16)

    					replace cob_det=12 if cob_det==. & plbornc==27
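
    					// Diagnostic sketch, not part of the original do-file. It assumes the plbornc replaces above are meant

    					// to recover a country of birth where plbornc_all has none. Because cob_det was already set to .a/.b

    					// further up, cob_det==. no longer matches those respondents; this counts the .b cases that plbornc could place:

    					count if cob_det==.b & inrange(plbornc, 5, 27)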

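    					// macob_det and pacob_det only hold 1-12 or missing, so the negative codes are read from the source variables

    					replace macob_det =.a if macob_all== -9

    					replace macob_det =.c if macob_all== -20

    					replace pacob_det =.a if pacob_all== -9

    					replace pacob_det =.c if pacob_all== -20

    					// cob_det is the value label shared by cob_det, macob_det and pacob_det, so .a reads "No info" for all three

    					label define cob_det .a "No info" .c "No info from BHPS", modify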

    					recode cob_det (1=1 "UK born") (2 3 4=2 "EU born") (5/12=3 "Non-EU born"), gen(cobeu)
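
    					// Sketch, not part of the original do-file: count the group described in the NOTE at the top

    					// (UK born in 1983 or later, both parents born outside the UK).

    					// Assumption: the year of birth in xwavedat is called birthy; adjust the name if your file differs.

    					count if cob_det==1 & birthy>=1983 & birthy<. & inrange(macob_det, 2, 12) & inrange(pacob_det, 2, 12)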

    **************************************************************************************

    * 3rd step: Generating a summary variable for respondents' citizenship(s) in each wave

    **************************************************************************************

    		foreach x in a b c d e f g h {

    		gen `x'_citizenship=.

    		replace `x'_citizenship = 1 if  cob_det==1 & `x'_pno!=.  // Born in the UK and thus not asked about citizenship

    		replace `x'_citizenship = 2 if  `x'_citzn1==-7 // Proxy respondents 

    		replace `x'_citizenship = 3 if  `x'_citzn1==1 & `x'_citzn2==0 & `x'_citzn3==0 // Foreign born with UK citizenship ONLY

    		replace `x'_citizenship = 4 if  `x'_citzn1==1 & (`x'_citzn2==1 | `x'_citzn3==1)  // Foreign born with UK and Other citizenship

    		replace `x'_citizenship = 5 if  `x'_citzn1==0 // Foreign born without UK citizenship

    		replace `x'_citizenship = 6 if  `x'_citizenship==. &  `x'_pno!=. & `x'_citzn1==-8 // Inapplicable

    		}

    		lab define citizenship    ///

    		1 "Born in the UK"   ///

    		2 "Proxy respondents"   ///

    		3 "Foreign born UK citizens ONLY"   ///

    		4 "Foreign born UK + Other citizenship"   ///

    		5 "Foreign born Other citizenship ONLY"   ///

    		6 "Inapplicable", replace

    		lab val *_citizenship citizenship

    		** Here I am just identifying respondents mentioning British citizenship (x_citizenshipUK), EU citizenship (x_citizenshipEU) and Non-EU citizenship (x_citizenshipNONEU):

    			foreach x in a b c d e f g h {

    			gen `x'_citizenshipUK =. 

    			replace `x'_citizenshipUK=1 if  `x'_citzn1==1

    			replace `x'_citizenshipUK=1 if  `x'_citzn_code==6 | `x'_citzn2_code==6 |  `x'_citzn_code==44 | `x'_citzn2_code==44

    			gen `x'_citizenshipEU = .  

    				replace `x'_citizenshipEU=1 if (`x'_citzn2==1 & cobeu==2)

    				replace `x'_citizenshipEU=1 if (`x'_citzn3==1) & ((`x'_citzn_code>=1 & `x'_citzn_code<=5) | (`x'_citzn_code>=7 & `x'_citzn_code<=12) | `x'_citzn_code==30 | `x'_citzn_code==32 | `x'_citzn_code==38 | `x'_citzn_code==46 | (`x'_citzn_code>=53 & `x'_citzn_code<=68) | `x'_citzn_code==900 | `x'_citzn_code==901)

    				replace `x'_citizenshipEU=1 if (`x'_citzn3==1) & ((`x'_citzn2_code>=1 & `x'_citzn2_code<=5) | (`x'_citzn2_code>=7 & `x'_citzn2_code<=12) | `x'_citzn2_code==30 | `x'_citzn2_code==32 | `x'_citzn2_code==38 | `x'_citzn2_code==46 | (`x'_citzn2_code>=53 & `x'_citzn2_code<=68) | `x'_citzn2_code==900 | `x'_citzn2_code==901)

    			gen `x'_citizenshipNONEU = .  	

    				replace `x'_citizenshipNONEU=1 if (`x'_citzn2==1 & cobeu==3)

    				replace `x'_citizenshipNONEU=1 if (`x'_citzn3==1) & ((`x'_citzn_code>=14 & `x'_citzn_code<=28)  | `x'_citzn_code==36 | `x'_citzn_code==37 | `x'_citzn_code==43 | `x'_citzn_code==45 | `x'_citzn_code==52 | (`x'_citzn_code>=70 & `x'_citzn_code<=890) | (`x'_citzn_code>=902 & `x'_citzn_code<=997 ))

    				replace `x'_citizenshipNONEU=1 if (`x'_citzn3==1) &  ((`x'_citzn2_code>=14 & `x'_citzn2_code<=28)  | `x'_citzn2_code==36 | `x'_citzn2_code==37 | `x'_citzn2_code==43 | `x'_citzn2_code==45 | `x'_citzn2_code==52 | (`x'_citzn2_code>=70 & `x'_citzn2_code<=890) | (`x'_citzn2_code>=902 & `x'_citzn2_code<=997 ))

    	** This is a summary variable that takes into account whether respondents mentioned a UK, EU or non-EU citizenship:

    			gen `x'_citizenship2=.

    				replace `x'_citizenship2=1 if `x'_citizenshipUK==1 &  `x'_citizenshipEU ==. & `x'_citizenshipNONEU==. // UK only

    				replace `x'_citizenship2=2 if `x'_citizenshipUK==1 &  `x'_citizenshipEU ==1 & `x'_citizenshipNONEU==. // UK and EU

    				replace `x'_citizenship2=3 if `x'_citizenshipUK==1 &  `x'_citizenshipEU ==. & `x'_citizenshipNONEU==1  // UK and non-EU

    				replace `x'_citizenship2=4 if `x'_citizenshipUK==1 &  `x'_citizenshipEU ==1 & `x'_citizenshipNONEU==1  // UK , EU and Non-EU

    				replace `x'_citizenship2=5 if `x'_citizenshipUK==. &  `x'_citizenshipEU ==1 & `x'_citizenshipNONEU==.  // EU only

    				replace `x'_citizenship2=6 if `x'_citizenshipUK==. &  `x'_citizenshipEU ==. & `x'_citizenshipNONEU==1  // Non-EU only

    				replace `x'_citizenship2=7 if `x'_citizenshipUK==. &  `x'_citizenshipEU ==1 & `x'_citizenshipNONEU==1   // EU and non-EU

    					}

    				lab define citizenship2  ///

    				1 "UK only"  ///

    				2 "UK & EU"  ///

    				3 "UK & Non-EU"  ///

    				4 "UK, EU & Non-EU"  ///

    				5 "EU only"  ///

    				6 "Non-EU only"  ///

    				7 "EU & non-EU", replace

    				lab val  *_citizenship2 citizenship2

    ***********************************************************

    * 4th step: Citizenship variable for wave h_ respondents:

    ***********************************************************

    // I take h_citizenship as the reference because I am interested in wave h_ respondents. 

    // This new citizenship variable uses the wave h_ values, and falls back on the wave a_ values only for respondents who were not asked the question in wave h_.

    		gen w_citizenship=h_citizenship  

    		label values w_citizenship citizenship

    		replace w_citizenship=a_citizenship if h_citizenship==6  

    		gen w_citizenship2= h_citizenship2  // I want to identify the nationalities, so I take into account the x_citizenship2 variable generated above

    		replace w_citizenship2=a_citizenship2 if w_citizenship!=. &  w_citizenship2==.

    		replace w_citizenship2=8 if w_citizenship==1

    		replace w_citizenship2=9 if w_citizenship==2

    		lab define w_citizenship2 ///

    		1 "UK only"  ///

    		2 "UK & EU"  ///

    		3 "UK & Non-EU"  ///

    		4 "UK, EU & Non-EU"  ///

    		5 "EU only"  ///

    		6 "Non-EU only"  ///

    		7 "EU & non-EU" ///

    		8 "Born in the UK"   ///

    		9 "Proxy respondents"   , replace

    		lab values w_citizenship2 w_citizenship2
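
    		// Check, not part of the original do-file: how many respondents take their value from wave a_

    		// (coded "Inapplicable" in wave h_ and with a non-missing wave a_ value)

    		count if h_citizenship==6 & a_citizenship<.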

    ******************

    ** TABULATION ****

    ******************

    // WEIGHTS  --> I need to use longitudinal weights because I am using information from wave 1 (a_), not only from wave 8 (h_)

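    // NOTE: only the most recent svyset is in effect, so run just one of the two lines below (as written, the second one applies)

    svyset  h_psu  [pweight=h_indpxui_lw], strata(h_strata)  singleunit(scaled)   // when including proxy respondents

    svyset  h_psu  [pweight=h_indinui_lw], strata(h_strata)  singleunit(scaled)   // when NOT including proxy respondents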

    svy: tab w_citizenship2 cobeu if w_citizenship2!=9, col // This shows an unusually high share of EU born holding UK nationality only.
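
    // Alternative sketch, not part of the original do-file: with svy, restricting the sample through subpop()

    // rather than if keeps the full survey design in the variance estimation. The column percentages are the same; only the standard errors can differ.

    svy, subpop(if w_citizenship2!=9): tab w_citizenship2 cobeu, col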

    // I examine this wave by wave to understand what is going on:

    	foreach x in a b c d e f g h {

    	tab `x'_citizenship2 cobeu if `x'_citizenship2!=9, col 

    	}
