Support #176
closedMissing parental data
100%
Description
Hi,
I'm merging data from the USoc Wave 1 youth questionnaire (a_youth) and the Wave 1 adult individual questionnaire (a_indresp) so that I can consider the role of parents' characteristics in my analysis of young people's occupational aspirations. I'm using the variables a_mnspid (to identify mothers) and a_fnspid (to identify fathers) in a_youth and the variable pidp in a_indresp to match young people with their parents. I'm using the /TABLE command in SPSS to carry out a one-to-many match so that siblings in a_youth (who have the same value for a_mnspid or a_fnspid) all receive parental data from a_indresp.
After the matching there is quite a lot of missing data. 360 cases have missing data on mothers: 190 of these are coded -8 for a_mnspid ("natural/adoptive/step mother not in household"), and the remaining 170 have a value for a_mnspid which does not match any of the cross-wave person identifiers (pidp) in a_indresp.
Likewise, 1840 cases have missing data on fathers. 1349 of these are coded -8 for a_fnspid ("natural/adoptive/step father not in household"), and the remaining 491 have a value for a_fnspid which does not match any of the cross-wave person identifiers (pidp) in a_indresp.
My three questions are:
1) How should I interpret a value of -8 for a_mnspid or a_fnspid?
2) Why are there parent identifiers in the youth dataset which do not exist in the main adult dataset?
3) Is there a better way to go about matching data on young people with data on their parents?
Thanks for your help,
Sam