Unmatched households when merging hhresp with lsoa data
Hi I am trying to distribute LSOA identifiers to household level data using wave 3. I have an issue when I merge the c_hhresp file with the c_lsoa11_protect file using the below stata code:
use c_hhresp.dta, clear
merge m:1 c_hidp using c_lsoa11_protect.dta
When performing this command I find 7 households from the hhresp file which are not found in the LSOA file, and 6,507 households from the LSOA file not found in the hhresp file.
I haven't been able to identify a reason from looking at the data as to why these data are only found in one file, rather than both. I was wondering if you could confirm if the above merge is correct and if so, what the reason is for these data which cannot be matched?
Updated by Gundi Knies 11 months ago
- Category set to Special license
- Assignee set to Natalie Bennett
- Target version set to X M
- % Done changed from 0 to 80
- Private changed from Yes to No
the universe of cases in the geographical identifier datasets is sampled households with a valid postcode on the ONS Postcode Directory, see the user guidance accompanying the file. Not all sampled households have a valid postcode, and not all sampled households participate in the household interview. The hhresp data file only has households with a household interview record.
Hope this explains the non-matches you find.
Updated by Alita Nandi 11 months ago
- Status changed from New to Feedback
To follow up on what Gundi has said,
All households in c_hhsamp who have a valid postcode for their addresses are available in c_lsoa11_protect file. c_hhresp comprises of only those sampled households who have completed a household questionnaire, so the 6507 cases in c_lsoa11_protect are those households who are present in c_hhsamp but not in c_hhresp. The 7 cases in c_hhresp but no in c_lsoa11_protect are households without a valid postcode.
You can check this by merging c_hhsamp with c_lsoa11_protect using c_hidp. You will find that there are no cases in the c_lsoa11_protect which are not in c_hhsamp, but there are 2898 cases in c_hhsamp which are not in c_lsoa11_protect.
Note, c_hhsamp c_hhresp c_lsoa11_protect are all at c_hidp level, so m:1 merging is not needed. You can use 1:1 merging.
On behalf of Understanding Society User Support Team