Unmatched households when merging hhresp with lsoa data
Hi, I am trying to add LSOA identifiers to household-level data using wave 3. I have an issue when I merge the c_hhresp file with the c_lsoa11_protect file using the Stata code below:
use c_hhresp.dta, clear
merge m:1 c_hidp using c_lsoa11_protect.dta
When performing this command I find 7 households from the hhresp file which are not found in the LSOA file, and 6,507 households from the LSOA file not found in the hhresp file.
I haven't been able to identify from the data why these households are found in only one file rather than both. Could you confirm whether the above merge is correct and, if so, why these households cannot be matched?
Updated by Gundi Knies 7 months ago
- Category set to Special license
- Assignee set to Natalie Bennett
- Target version set to X M
- % Done changed from 0 to 80
- Private changed from Yes to No
The universe of cases in the geographical identifier datasets is sampled households with a valid postcode on the ONS Postcode Directory; see the user guidance accompanying the file. Not all sampled households have a valid postcode, and not all sampled households participate in the household interview. The hhresp data file only has households with a household interview record.
Hope this explains the non-matches you find.
Updated by Alita Nandi 7 months ago
- Status changed from New to Feedback
To follow up on what Gundi has said,
All households in c_hhsamp that have a valid postcode for their address are available in the c_lsoa11_protect file. c_hhresp comprises only those sampled households that have completed a household questionnaire, so the 6,507 cases found only in c_lsoa11_protect are households that are present in c_hhsamp but not in c_hhresp. The 7 cases in c_hhresp but not in c_lsoa11_protect are households without a valid postcode.
You can check this by merging c_hhsamp with c_lsoa11_protect using c_hidp. You will find that there are no cases in c_lsoa11_protect that are not in c_hhsamp, but there are 2,898 cases in c_hhsamp that are not in c_lsoa11_protect.
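A minimal sketch of that check, assuming both files sit in the current working directory under the names used in this thread (your release may name or store them differently):

```stata
* Compare the sampled-household file with the LSOA lookup file
use c_hhsamp.dta, clear
merge 1:1 c_hidp using c_lsoa11_protect.dta

* _merge == 1: in c_hhsamp only (no valid postcode)
* _merge == 2: in c_lsoa11_protect only
* _merge == 3: in both files
tabulate _merge

* Number of LSOA records with no sampled household
count if _merge == 2
```

If the description above holds for your copy of the data, the final count should be zero and the c_hhsamp-only category should account for the households without a valid postcode.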
Note that c_hhsamp, c_hhresp and c_lsoa11_protect are all at c_hidp level (one record per household), so m:1 merging is not needed. You can use 1:1 merging.
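As a rough illustration, the original merge could be rewritten along these lines. This is only a sketch: the file locations are assumed, and the choice to keep interviewed households that have no LSOA record is one option rather than a recommendation.

```stata
* Attach LSOA identifiers to interviewed households with a 1:1 merge
use c_hhresp.dta, clear

* Confirm c_hidp uniquely identifies records in the master file;
* isid stops with an error if it does not
isid c_hidp

* keep(master match) drops households that appear only in the LSOA file
* (sampled households without a household interview)
merge 1:1 c_hidp using c_lsoa11_protect.dta, keep(master match)

* _merge == 1 flags interviewed households without a valid postcode
tabulate _merge
```

A 1:1 merge also stops with an error if c_hidp is not unique in the using file, so it doubles as a check that both files really are at household level.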
On behalf of Understanding Society User Support Team