Support #692
Merging _indresp and _child across all waves
Status: Closed (100% done)
Description
Dear Ladies and Gentlemen,
I hope this question is USoc-specific enough to post it here. With Stata, I would like to merge the files _indresp and _child across all six waves. In the end, I would like to have the age and sex of the children, the biological father, his wage etc. in one panel data set (long format). So far, I have created two separate panel data files, the master file with numerous selected variables from _indresp and the using file from _child with a number of selected variables. So I have these two panel sets and I want to merge them. I went ahead and ran
merge 1:1 pidp wave using ........, gen(....)
In the master set, I have roughly 290 000 observations and in the using set 9 000. The number of matches was zero, though. My guess is that I cannot use pidp as the common variable for both sets, but I am not sure which one to use. If you could give me a hint as to how to set this merging procedure up, I would highly appreciate it.
Updated by Alita Nandi almost 8 years ago
- Status changed from New to In Progress
- Assignee changed from Alita Nandi to Nico Ochmann
- % Done changed from 0 to 90
Dear Nico,
The pidp in the w_child file is the pidp of the child (under 16). The pidp in w_indresp is the pidp of the adult respondents (16 and over). So if you merge these two files on pidp, they will never match. What you need to do is identify the PIDP of the parents of the children in w_child: w_fnspid and w_mnspid in w_indall. Then rename these variables to PIDP and merge with w_indresp using PIDP. Note that you will have to do this separately for the father and the mother.
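As a rough illustration of the steps just described, here is a minimal single-wave sketch for fathers only. It rests on several assumptions that are not confirmed in this reply: that a_child.dta and a_indresp.dta sit in the working directory, that a_fnspid can be read directly from the child file, and that negative values of a_fnspid mean no father was identified. Stata names are case-sensitive, so the sketch renames to lowercase pidp, the name used in the data files. The name child_pidp is made up for the example.

```stata
* Sketch for wave a, fathers only (assumed file names and locations)
use pidp a_fnspid a_sex a_dvage using "a_child", clear

* Keep the child's own identifier under a new, hypothetical name
rename pidp child_pidp

* The father's identifier becomes the merge key
rename a_fnspid pidp

* Assumption: negative codes mean no father was identified
drop if pidp < 0 | missing(pidp)

* Several children can share one father, hence m:1
merge m:1 pidp using "a_indresp", keepusing(a_fimnlabgrs_dv) ///
    keep(master match) generate(_m_father)
```

Repeating the same steps with a_mnspid in place of a_fnspid would give the mother-side file.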
Best wishes,
Alita
Updated by Nico Ochmann almost 8 years ago
Thank you very much once again!
Just as a note, it seems to me that I need to merge these two files using PIDP and wave.
Cheers!
Nico
Updated by Alita Nandi almost 8 years ago
You can either
(i) do the merge exercise separately for each wave, save the final file as a wave specific file and then append them, or
(ii) produce a long file for each type of file and then merge them together - using pidp and wave
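Option (i) could look roughly like the sketch below, again for fathers only. The file names, the use of tempfiles, the choice of variables and the treatment of negative identifiers as missing are all assumptions made for illustration. The `rename` group syntax needs Stata 12 or later; on older versions renpfix does the same job.

```stata
* Option (i): merge within each wave, then append the wave files
local n = 0
foreach w in a b c d e f {
    local ++n
    use pidp `w'_fnspid `w'_dvage `w'_sex using "`w'_child", clear
    rename pidp child_pidp              // hypothetical name for the child's id
    rename `w'_fnspid pidp              // father's id becomes the merge key
    drop if pidp < 0 | missing(pidp)    // assumption: negative = no father
    merge m:1 pidp using "`w'_indresp", keepusing(`w'_fimnlabgrs_dv) ///
        keep(master match) nogenerate
    rename `w'_* *                      // strip the wave prefix
    generate wave = `n'
    tempfile wave`n'
    save "`wave`n''"
}

* Stack the six wave-specific files into one long file
use "`wave1'", clear
forvalues i = 2/6 {
    append using "`wave`i''"
}
```

Option (ii) reverses the order: build one long file per file type first, then merge once on the parent identifier and wave.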
Updated by Nico Ochmann almost 8 years ago
Dear Dr Nandi,
I do have a short follow-up question before you close this one. In the child file I have 90,000 obs. I end up with only 60,000 matches when I merge the files.
I cannot see why I would have such a large number of mismatches. If you have any ideas, I would appreciate a short comment.
Thanks a lot!
Nico
    Result                           # of obs.
    -----------------------------------------
    not matched                       289,433
        from master                   259,370  (_wemerge==1)
        from using                     30,063  (_wemerge==2)

    matched                            60,664  (_wemerge==3)
    -----------------------------------------
* Paths to the raw data and the project folders
global Stata11_se "F:\UnderstandingSocietyData\Data\stata\stata11_se"
global data "P:\Mergeddata_master2016_religion"
global dofiles "P:\Dofiles_master2016_religion"
global logfiles "P:\Logfiles_master2016_religion"

* Master file: selected _indresp variables, one temporary file per wave
foreach w in a b c d e f {
    use `w'_ukborn pidp fpid mpid `w'_hidp `w'_pno `w'_istrtdaty `w'_plbornc `w'_jbhrs `w'_qfhigh `w'_qualoc `w'_sex `w'_dvage `w'_marstat `w'_fimngrs_dv `w'_fimnlabgrs_dv `w'_jbstat `w'_plbornc_all `w'_pacob_all `w'_macob_all ///
        `w'_oprlg1 `w'_oprlg0 ///
        `w'_paygu_dv `w'_oprlg `w'_yr2uk4 `w'_birthy `w'_feend `w'_scend `w'_jbterm1 `w'_jbsize `w'_jbsect `w'_jbsectpub `w'_racel `w'_jbbgy `w'_sf1 `w'_jbmngr `w'_lnprnt `w'_jbsemp ///
        `w'_gor_dv `w'_urban_dv `w'_jshrs ///
        using "$Stata11_se/`w'_indresp_protect", clear
    gen wave = strpos("abcdef","`w'")
    renpfix `w'_
    save $data/`w'wave, replace
}

* Wave f is still in memory, so append waves a to e to it
foreach w in a b c d e {
    append using $data/`w'wave.dta
}
sort wave pidp
rename pidp PIDP
rename dvage age
save $data/abcdef_long, replace

* Remove the temporary wave files
foreach w in a b c d e f {
    erase $data/`w'wave.dta
}

* Using file: selected _child variables, one temporary file per wave
foreach w in a b c d e f {
    use pidp `w'_hidp `w'_sex `w'_dvage `w'_hgbiom `w'_hgbiof `w'_adresp15 `w'_birthy `w'_birthm fpid mpid `w'_fnpno `w'_fnspid `w'_mnpno `w'_mnspid ///
        using "$Stata11_se/`w'_child_protect", clear
    gen wave = strpos("abcdef","`w'")
    renpfix `w'_
    save $data/`w'wave, replace
}
foreach w in a b c d e {
    append using $data/`w'wave.dta
}
sort wave pidp

* The father's identifier (fnspid) becomes the merge key
rename fnspid PIDP
rename dvage childage
rename sex childgender
save $data/abcdef_long_2, replace
foreach w in a b c d e f {
    erase $data/`w'wave.dta
}

* Merge adults (one row per PIDP and wave) with children (possibly several per father)
use $data/abcdef_long, clear
merge 1:m PIDP wave using $data/abcdef_long_2, generate(_wemerge)
Updated by Alita Nandi almost 8 years ago
A quick look at your syntax suggests that you have only matched with fathers and so are missing all children living in households without fathers.
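One way to check this against the do-file above is to count the children with no father identifier and then repeat the merge on the mother's identifier. The sketch below assumes that negative values of fnspid and mnspid mean no parent was identified, which should be confirmed against the data documentation. The name _mmerge and the choice of keepusing() variables are illustrative only. It runs from the child side, so the adult file is the m:1 lookup.

```stata
* abcdef_long_2 already has fnspid renamed to PIDP by the do-file above
use "$data/abcdef_long_2", clear

* Children with no father identified (assumption: negative = not identified)
count if PIDP < 0 | missing(PIDP)

* Children with no mother identified, for comparison
count if mnspid < 0 | missing(mnspid)

* Mother-side merge: restore the father's id, then key on the mother's id
rename PIDP fnspid
rename mnspid PIDP
drop if PIDP < 0 | missing(PIDP)
merge m:1 PIDP wave using "$data/abcdef_long", ///
    keepusing(sex age fimnlabgrs_dv) keep(master match) generate(_mmerge)
```

If the first count is close to the 30,063 unmatched observations from the using file, that would be consistent with the explanation given here. The 259,370 unmatched master observations are adults with no child record pointing at them as father, which is to be expected.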
Best wishes,
Alita
Updated by Victoria Nolan almost 8 years ago
- Status changed from In Progress to Closed
- % Done changed from 90 to 100