Support #585

Identifying individuals with whom an interview could not be completed

Added by Sanne Velthuis over 6 years ago. Updated over 6 years ago.

Data analysis

I have a question about respondents living in households that were contacted by an interviewer but with whom no interview was ultimately completed, either because the household as a whole could not be contacted or refused, or because the individual respondent refused to take part. I can see that the _hhsamp files include, for each household identifier (_hidp), the _outcome variable, which indicates the outcome of the interviewer's attempt to conduct interviews with the household members. I have merged this information (using _hidp and _month) into the _hhresp datafile, which in turn I have merged with the _indresp datafile (using _hidp), and appended all five waves to create a panel. In theory, the resulting datafile should now include all households (and all individuals within these households) that were contacted by an interviewer at each wave, including households (and the individuals within them) with whom no interviews could be completed. I expected that the resulting datafile, when sorted on _hidp and wave, would look something like this:

_hidp  wave  _outcome    pidp
1      1     complete    1
1      2     complete    1
1      3     no contact  .

So this shows that household 1 was contacted in wave 1, that the interviewer was successful at completing the interviews, and that the household contained only one individual with the personal identifier (pidp) 1. It then shows that the same household was contacted at wave 2, again successfully completed an interview, and that the household still only contained the individual with pidp 1. Then it shows that in wave 3 household 1 was contacted again, but that no contact could be made, and as a result the interviewer was unable to identify whether pidp 1 still lived in this household and was unable to conduct an interview.
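
To make the merge described above concrete, here is a minimal sketch in Python with pandas, using invented toy data rather than the real files. Several things are assumptions for illustration only: the wave prefix (the leading "_" in the variable names) is dropped, the outcome labels and the hhsize column are made up, and the joins are written as left joins starting from the sample file so that households without an interview are retained. It is not a description of how the study's files must be combined.

```python
import pandas as pd

# Toy stand-ins for one wave of the three files; all values are invented.
# Wave prefixes (the leading "_" in the text) are dropped for readability.
hhsamp = pd.DataFrame({
    "hidp": [101, 102, 103],
    "month": [1, 1, 2],
    "outcome": ["complete", "no contact", "refusal"],  # illustrative labels
})
hhresp = pd.DataFrame({"hidp": [101], "month": [1], "hhsize": [1]})
indresp = pd.DataFrame({"hidp": [101], "pidp": [1]})


def build_wave(hhsamp, hhresp, indresp, wave):
    """Join from the sample file outwards so unproductive households are kept."""
    hh = hhsamp.merge(hhresp, on=["hidp", "month"], how="left")
    out = hh.merge(indresp, on="hidp", how="left")
    out["wave"] = wave
    return out


wave1 = build_wave(hhsamp, hhresp, indresp, wave=1)

# Further waves would be built the same way and added to this list.
panel = pd.concat([wave1], ignore_index=True)
print(panel[["hidp", "wave", "outcome", "pidp"]])
```

With left joins, households 102 and 103 stay in the panel with a missing pidp, which is the row pattern the table above anticipates for a non-contacted household.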

However, the data does not appear to be structured like this. Where the outcome of an attempt to contact a household was an inability to contact the household, a refusal of the household to take part, or any other unsuccessful outcome, the household identifier appears to be a unique identifier that does not occur in any other wave. As a result, it is impossible to tell whether a particular household with whom no interviews could be completed is the same as a household that was successfully interviewed the year before.
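
One way to check this pattern is to count, for each household identifier, the number of distinct waves in which it occurs, and then look at the identifiers that have an unsuccessful outcome. The sketch below does this in Python with pandas on an invented toy panel; the column names without wave prefix, the identifier values, and the "complete" label are assumptions for illustration, not the actual coding of _outcome.

```python
import pandas as pd

# Invented toy panel: household 101 is interviewed in waves 1 and 2;
# 901 and 902 have unsuccessful outcomes and each appears in one wave only.
panel = pd.DataFrame({
    "hidp": [101, 101, 901, 902],
    "wave": [1, 2, 2, 3],
    "outcome": ["complete", "complete", "no contact", "refusal"],
})

# Number of distinct waves in which each household identifier occurs.
waves_per_hidp = panel.groupby("hidp")["wave"].nunique()

# Identifiers that have at least one unsuccessful outcome.
unproductive = panel.loc[panel["outcome"] != "complete", "hidp"].unique()

# True for each such identifier that is confined to a single wave.
single_wave = waves_per_hidp.loc[unproductive].eq(1)
print(single_wave)
```

If every entry of single_wave is true on the real data, the unsuccessful households cannot be linked across waves through the household identifier alone.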

So my question is this: how does the allocation of household identifiers in the _hhsamp file work? I would have expected that all households selected for participation in wave 1 would have been given a household identifier (and, upon completion of the first interview, all individuals within these households a personal identifier), and that for wave 2 these households, listed under the same household identifiers, would have been allocated to an interviewer who would then try to contact them for their second interview. I would thus expect all households that could not be contacted in wave 2 to show up in the _hhsamp file with the same identifier as they had at wave 1, with the specific outcome of the contact attempt alongside it. But this does not appear to be the case. So how are household identifiers allocated? Is there a way to link a particular household that was interviewed in wave 1 to an unsuccessful attempt to interview this same household in wave 2?
