Support #1196 » stata-matching-individuals-within-household - do file from UKHLS website [adapted for T2].do

attachment 3 - my edits using 1 and 2 - fabiana macor, 06/06/2019 12:32 PM

 
/*****************************************************************************************
* MATCHING INDIVIDUALS WITHIN A HOUSEHOLD *
* In this example we will match the information of respondents living with *
* partners/spouses onto that of their partners/spouses.
*
* RESULTING FILE NAME(s): "`w'_spinfo" saved in each wave folder *
*****************************************************************************************/

// set the working directory (temporary files are saved here)
cd "C:\Users\fabia\Documents\7 Masters\UPF\Thesis\Data"

// assign global macro to refer to Understanding Society data
global ukhls "C:\Users\fabia\Documents\7 Masters\UPF\Thesis\Data\UKHLS Datafiles"


// PART 1: LINKING PARTNER DATA (using syntax from website)
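
// Illustration only (a sketch, not part of the website syntax): how the loop
// below steps through the waves. foreach runs the block in braces once per
// listed letter; inside the block `w' expands to the current letter, and
// strpos() returns that letter's position in the alphabet string, which is
// the wave number. The three letters here are an arbitrary subset chosen for
// the demonstration.
foreach w in a b c {
local waveno=strpos("abcdefghijklmnopqrstuvwxyz","`w'")
display "prefix `w'_ is wave `waveno'"
}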

foreach w in a b c d e f g h { /* Q: is everything inside the braces run once for each of a, b, c, d, etc.? */
// find the wave number. Q: the wave number is taken from the position of the wave letter in the alphabet string (a is position 1, so wave 1) and stored in the local macro waveno, but what does `w' denote?
local waveno=strpos("abcdefghijklmnopqrstuvwxyz","`w'")
// open the individual level file. Can I add more variables here other than jbhas?
use pidp `w'_* using "$ukhls/ukhls_w`waveno'/`w'_indresp", clear

// Restrict to individuals who have a spouse/partner in the household
// If an individual does not have a partner then `w'_ppno will be 0,
// if they do have a partner then `w'_ppno is the pno of their partner
keep if `w'_ppno>0

// rename all individual characteristics to something that would indicate
// the characteristics refer to the spouse/partner. Here we add the prefix sp_
// before the variable stem name and preserve the wave prefix
rename `w'_* `w'_sp_*

// rename the spouse/partner pno variable to respondent pno for matching to their partner
rename `w'_sp_ppno `w'_pno

// rename the hidp back to `w'_hidp
rename `w'_sp_hidp `w'_hidp

// drop the variable `w'_sp_pno as it is no longer needed
drop `w'_sp_pno

// save the file temporarily. At this point the file is a copy of the individual
// data with sp_ inserted after the wave prefix of every `w'_ variable except
// `w'_hidp and `w'_pno
save `w'_tmp_spinfo, replace



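// reopen the data file and keep the same set of variables, including pidp.
// pidp is also in the temporary file, where it identifies the partner; for
// matched records merge keeps the value from the file in memory, so pidp
// continues to identify the respondent
use pidp `w'_* using "$ukhls/ukhls_w`waveno'/`w'_indresp", clear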

// restrict the sample to individuals who have a spouse/partner in the household
keep if `w'_ppno>0

// merge the data with the data relating to the spouse/partner, using
// `w'_hidp and `w'_pno as linking variables. Note that there SHOULD NOT BE
// any non-matching records, that is, _merge should equal 3 for every record.
// The using file is the temporary file saved above, `w'_tmp_spinfo
merge 1:1 `w'_hidp `w'_pno using `w'_tmp_spinfo
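
// Optional check, added as a sketch and not part of the website syntax: tabulate
// _merge before it is dropped. Stata codes 1 = record only in the file in memory,
// 2 = record only in the using file, 3 = matched, so any value other than 3
// marks a respondent/partner record that did not link
tab _merge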

// drop the merge variable otherwise future merges will not work
drop _merge

// save the data file
save "$ukhls/ukhls_w`waveno'/`w'_spinfo", replace

// clean up unwanted files
erase `w'_tmp_spinfo.dta
}

//


// PART 2: MERGING TO LONG (using syntax from website)

foreach w in a b c d e f g h {
// find the wave number from the position of the wave letter in the alphabet
local waveno=strpos("abcdefghijklmnopqrstuvwxyz","`w'")
// open the spouse/partner-linked file saved in Part 1
use pidp `w'_* using "$ukhls/ukhls_w`waveno'/`w'_spinfo", clear
// drop the wave prefix from all variables
rename `w'_* *
// create a wave variable
gen wave=`waveno'
// save one file for each wave
save temp`w', replace
}

// open the file for the first wave (wave a)
use tempa, clear

// loop through the remaining waves
foreach w in b c d e f g h {

// append the files for the second wave onwards
append using temp`w'
}

// check how many observations are available from each wave
tab wave
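
// Optional check, added as a sketch and not part of the website syntax: report
// whether any pidp appears more than once within a wave. This assumes the long
// file is meant to hold at most one row per person per wave
duplicates report pidp wave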

// save the long file
save longfile, replace

// erase temporary files
foreach w in a b c d e f g h {
erase temp`w'.dta
}
//

