Support #1368

Unmatched households when merging hhresp with lsoa data

Added by Natalie Bennett almost 4 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Special license
Start date:
06/23/2020
% Done:
100%

Description

Hi, I am trying to attach LSOA identifiers to household-level data using wave 3. I run into a problem when I merge the c_hhresp file with the c_lsoa11_protect file using the Stata code below:

use c_hhresp.dta, clear
merge m:1 c_hidp using c_lsoa11_protect.dta

When performing this command I find 7 households from the hhresp file which are not found in the LSOA file, and 6,507 households from the LSOA file not found in the hhresp file.

I haven't been able to identify a reason from looking at the data as to why these data are only found in one file, rather than both. I was wondering if you could confirm if the above merge is correct and if so, what the reason is for these data which cannot be matched?

Many thanks

Natalie

#1

Updated by Gundi Knies almost 4 years ago

  • Category set to Special license
  • Assignee set to Natalie Bennett
  • Target version set to X M
  • % Done changed from 0 to 80
  • Private changed from Yes to No

Hi Natalie,
The universe of cases in the geographical identifier datasets is sampled households with a valid postcode on the ONS Postcode Directory; see the user guidance accompanying the file. Not all sampled households have a valid postcode, and not all sampled households participate in the household interview. The hhresp data file only contains households with a household interview record.

Hope this explains the non-matches you find.
Best wishes,
Gundi

#2

Updated by Alita Nandi almost 4 years ago

  • Status changed from New to Feedback

To follow up on what Gundi has said,

All households in c_hhsamp that have a valid postcode for their address are available in the c_lsoa11_protect file. c_hhresp comprises only those sampled households that have completed a household questionnaire, so the 6,507 cases in c_lsoa11_protect are households that are present in c_hhsamp but not in c_hhresp. The 7 cases in c_hhresp but not in c_lsoa11_protect are households without a valid postcode.

You can check this by merging c_hhsamp with c_lsoa11_protect using c_hidp. You will find that there are no cases in c_lsoa11_protect that are not in c_hhsamp, but there are 2,898 cases in c_hhsamp that are not in c_lsoa11_protect.

Note that c_hhsamp, c_hhresp and c_lsoa11_protect are all at c_hidp level (one row per household), so m:1 merging is not needed. You can use 1:1 merging.
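The check and the 1:1 merge described above could be sketched in Stata as follows. This is only a minimal sketch: it assumes all three files sit in the current working directory, and it relies on Stata's default _merge variable (1 = master only, 2 = using only, 3 = matched). The final keep and drop steps are an illustrative choice, not a requirement.

```stata
* Check: every household in the LSOA file should also appear in c_hhsamp
use c_hhsamp.dta, clear
merge 1:1 c_hidp using c_lsoa11_protect.dta
* Per this thread: 2,898 cases with _merge==1 and none with _merge==2
tabulate _merge

* The original merge, rewritten as 1:1 since both files are at c_hidp level
use c_hhresp.dta, clear
merge 1:1 c_hidp using c_lsoa11_protect.dta
* Per this thread: 7 cases with _merge==1 and 6,507 with _merge==2
tabulate _merge

* Optional: keep only interviewed households that have an LSOA identifier
keep if _merge == 3
drop _merge
```

Using 1:1 instead of m:1 also acts as a safeguard: Stata stops with an error if c_hidp does not uniquely identify rows in either file.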

Best wishes,
Alita
On behalf of Understanding Society User Support Team

#3

Updated by Natalie Bennett almost 4 years ago

Hi Alita and Gundi

Thanks both for your help, I understand the source of the non-matches now.

Thanks so much.

Best wishes

Natalie

#4

Updated by Natalie Bennett almost 4 years ago

Sorry, just to follow up so I can justify this later: could you confirm why those 2,898 cases are not in the LSOA file?

Many thanks

Natalie

#5

Updated by Alita Nandi almost 4 years ago

Hi Natalie,

These are the households without valid postcodes.

Best wishes,
Alita

#6

Updated by Understanding Society User Support Team over 2 years ago

  • Status changed from Feedback to Resolved
  • Assignee deleted (Natalie Bennett)
  • % Done changed from 80 to 100
