Understanding Society User Support Team wrote in #note-1:
Hello Kuoshi
Are you using Stata to process your data? If so, could you share your code so I can follow along and review the numbers you’re getting?
Regarding your question about the variable "bpx_N - total number of biological parents": according to the Family Matrix (xhhrel file) User Guide (https://www.understandingsociety.ac.uk/wp-content/uploads/documentation/user-guides/6614_main_survey_user_guide_family_matrix_xhhrel.pdf), section 4.1, "Identifying individuals who have specific relatives in the study", individuals with bpx_N > 0 are those whose biological parents are not part of the Study. Since the xhhrel file is an individual-level cross-wave dataset containing familial relationship identifiers for all sample members (i.e., those ever enumerated in the Study), this indicates that for participants with bpx_N > 0, their biological parents were never included in the Study.
Additionally, you can further identify participants with no biological parents in the Study (bpx_N > 0) who may have adoptive parents (apx_N > 0) or stepparents (spx_N > 0).
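As a minimal sketch of this kind of check in Stata, the snippet below builds a small invented dataset standing in for the xhhrel file and cross-tabulates the three parent counts. Only the variable names bpx_N, apx_N and spx_N come from the guide; the toy values and the has_* indicator names are assumptions for illustration, and the direction of each condition should be checked against section 4.1 of the guide before use.

```stata
* Toy data standing in for the xhhrel file (all values are invented)
clear
input pidp bpx_N apx_N spx_N
1 2 0 0
2 0 1 0
3 0 0 1
4 0 0 0
5 1 0 1
end

* One indicator per parent type: 1 if the count is positive, 0 otherwise
gen byte has_bpx = bpx_N > 0 if !missing(bpx_N)
gen byte has_apx = apx_N > 0 if !missing(apx_N)
gen byte has_spx = spx_N > 0 if !missing(spx_N)

* Cross-tabulate biological parents against adoptive and step-parents
tab has_bpx has_apx, missing
tab has_bpx has_spx, missing
```

With the real file, the `input` block would be replaced by a `use` of the xhhrel data.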
I hope this information is helpful.
Best wishes,
Roberto Cavazos
Understanding Society User Support Team
Dear Roberto,
Thank you so much for your response. As an example, I will share my code for cleaning mafar in wave 3.
First, I clean the mafar data in the wave 3 indresp file.
Code:
* Mother (biological or step-) alive?
gen c_mo=0 if c_lvrel1==0
replace c_mo=0 if c_lvrel9==0
replace c_mo=1 if c_lvrel1==1
replace c_mo=1 if c_lvrel9==1
recode c_mo (.=9) // proxy or other system missing
* Set c_mafar to 99 where no living, non-co-resident mother was mentioned
replace c_mafar=99 if c_mo==0
tab c_mafar
gen wave3=1
This shows that 23,767 respondents did not report having a living mother (biological or step-) who lives apart from them (c_mafar=99).
Second, I tried to use egoalt to find the mothers who were not mentioned because they live with the respondent.
Code:
* Keep the relationship rows coded 4 or 7
keep if c_relationship_dv==4 | c_relationship_dv==7
* Flag female alters as mother or stepmother
gen c_mother=1 if c_relationship_dv==4 & c_asex==2
gen c_mother_s=1 if c_relationship_dv==7 & c_asex==2
gen c_moco=1 if c_mother==1 | c_mother_s==1
* Number the rows within each respondent in a reproducible order (by apidp)
bysort pidp (apidp): gen nr = _n
keep pidp c_mother c_mother_s c_moco nr
reshape wide c_mother c_mother_s c_moco, i(pidp) j(nr)
* Collapse the wide flags into one indicator per respondent
gen c_mother=1 if c_mother1==1 | c_mother2==1 | c_mother3==1 | c_mother_s1==1 | c_mother_s2==1 | c_mother_s3==1
replace c_mother=0 if c_mother1==. & c_mother2==. & c_mother3==. & c_mother_s1==. & c_mother_s2==. & c_mother_s3==.
gen c_moco=1 if c_moco1==1 | c_moco2==1 | c_moco3==1
replace c_moco=0 if c_moco1==. & c_moco2==. & c_moco3==.
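The reshape step above hard-codes up to three rows per respondent. A minimal sketch of one possible alternative, using `collapse (max)` so that the number of rows per pidp does not matter, is shown below on invented toy data. The variable names follow the code above; the values are assumptions for illustration only, and the flags are coded 0/1 rather than missing/1.

```stata
* Toy egoalt-style data (all values are invented)
clear
input pidp apidp c_relationship_dv c_asex
1 11 4 2
1 12 4 1
2 21 7 2
3 31 4 1
4 41 4 2
4 42 7 2
end

* 0/1 flags on each ego-alter row
gen byte c_mother   = c_relationship_dv == 4 & c_asex == 2
gen byte c_mother_s = c_relationship_dv == 7 & c_asex == 2
gen byte c_moco     = c_mother | c_mother_s

* One row per respondent: a flag is 1 if any of that respondent's rows is 1
collapse (max) c_mother c_mother_s c_moco, by(pidp)
list
```

Because the result already has one row per pidp, it can be merged with the indresp data without a reshape.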
Third, I merge these two datasets. After merging:
Code:
keep if wave3==1
tab c_mafar
replace c_mafar=123 if c_moco==1 & c_mafar==99
tab c_mafar
This shows that the mothers of 5,174 respondents live with them, which is why these respondents did not report their mothers' details or answer this question in indresp. However, I still do not know whether the mothers of the remaining 18,593 respondents are alive.
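The merge and recode in this third step can be sketched on toy data as follows. This is only an illustration under assumptions: the values are invented, the tempfile name `cores` is hypothetical, and a 1:1 merge on pidp assumes both files have exactly one row per respondent.

```stata
* Toy co-residence flags, one row per respondent (values are invented)
clear
input pidp c_moco
1 1
2 0
end
tempfile cores
save `cores'

* Toy indresp extract (values are invented)
clear
input pidp c_mafar wave3
1 99 1
2 99 1
3 5 1
4 99 1
end

* Keep every indresp row; respondents absent from the egoalt extract get c_moco = 0
merge 1:1 pidp using `cores', keep(master match) nogenerate
replace c_moco = 0 if missing(c_moco)

* Recode respondents whose mother is co-resident
replace c_mafar = 123 if c_moco == 1 & c_mafar == 99
tab c_mafar
```

Respondents still coded 99 after this step are those with neither a reported non-co-resident mother nor a co-resident mother found in egoalt.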
I know that co-residence may also be recorded in egoalt in wave 2 or wave 4, but adding the co-residence information from waves 2 and 4 makes little difference to the number of respondents who did not report their mothers' information. I am therefore wondering why so many respondents did not mention their mothers in this survey, and whether their mothers are still alive.
Could you please correct me if I have misunderstood the dataset or made a mistake in my code?
Have a nice day!
Best wishes,
Kuoshi