Father's social class
Hi, I'm interested in including father's social class in my analysis, but I've noticed that a large share of the data for this variable is missing. I looked at a previous response on this forum about this, which advised using the cross-wave data (xwavedat). However, a lot of data is missing there too: for example, if I merge the cross-wave data into Wave R, 40% of respondents show as "not applicable". I've also tried adding in the data from all of the previous waves, but this left a similar number of 'NAs'.
During this process, I noticed that the questions about father's social class were not asked in waves B-G. I thought that the reason for the large number of missing values might be that people who joined the sample between waves B-G were never asked the question in subsequent waves. Is this the case?
Updated by Redmine Admin over 7 years ago
A bit of history: PAJU and PASOC were asked at Wave 1, again at Wave 8 of those who had entered the survey since Wave 1, and at Wave 9 of the new Scottish and Welsh regional boost sample members.
PASOC was not asked of those who were fatherless or whose father did not work, etc., when they were aged 14 (cf. PAJU). It was not asked in Northern Ireland, nor of those who left the survey between Waves 1 and 8, nor of rising 16-year-olds. For the last group, however, it should be possible to recover the information from previous waves in cases where the father was living in the same household.
Updated by Robert de Vries over 7 years ago
Thanks Jakob. I'm still a bit confused though. In BHPS Wave R, 230 people (1.6% of the sample) were rising 16-year-olds and 2,206 (15.3% of the sample) were Northern Ireland respondents.
Added to this, merging xwavedat into BHPS Wave R tells me that 1,476 people (about 10% of the Wave R sample) either had a father who was absent or not working when they were 14, or did not know whether their father worked at that time. Taken together, this makes 3,912 people (about 27% of the sample) who should be 'NA' for the PASOC question. In the actual data, 5,763 people (40% of the sample) are recorded as 'NA'. I'm probably missing something, but what explains the discrepancy?
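For anyone who wants to check the figures, here is a rough sketch of the arithmetic in Python. It uses only the counts quoted above and assumes the three groups do not overlap (a Northern Ireland respondent could also be a rising 16-year-old, for instance, so the explained total may be slightly overstated). The Wave R sample size is not stated in this thread, so it is backed out from the Northern Ireland percentage; treat that number as an approximation.

```python
# Counts quoted in the post above (BHPS Wave R).
rising_16 = 230                      # rising 16-year-olds
northern_ireland = 2206              # Northern Ireland respondents
father_absent_or_not_working = 1476  # from xwavedat, incl. "don't know"
observed_na = 5763                   # recorded as 'NA' on PASOC

# Cases we would expect to be 'NA', assuming the groups do not overlap.
expected_na = rising_16 + northern_ireland + father_absent_or_not_working

# Cases still unexplained after accounting for those three groups.
unexplained = observed_na - expected_na

# Assumption: sample size inferred from "2,206 = 15.3% of the sample".
approx_sample = round(northern_ireland / 0.153)

print("expected NA:", expected_na)
print("unexplained NA:", unexplained)
print("unexplained share (%):", round(100 * unexplained / approx_sample, 1))
```

On these figures the three groups account for 3,912 of the 'NA' cases, which leaves 1,851 respondents (roughly 13% of the sample) still to be explained.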