Support #2310
open
Why can't 2000-3000 obs in the indall datasets be merged with the hhresp datasets?
50%
Description
Hi team,
Thanks very much for your help!
I have cleaned the hhresp and indall datasets and tried to merge them on the hidp variable. My code is below:
global wave a b c d e f g h i j k l m n o
foreach var of global wave {
    // Individual-level file cleaned from indall: I kept the key variables I need and did not drop or add any obs.
    use "$input_data/`var'_treatment.dta", clear
    // Household-level file cleaned from hhresp: I did not delete or add any obs.
    merge m:1 hidp using "$input_data/`var'_income and poverty.dta"
    drop _merge
}
Stata then reported that around 2000-3000 individual obs (in the individual-level datasets) could not be matched with any household obs in the household-level datasets. This happened in every wave except wave 1.
Did I merge them incorrectly, or could you explain why this happens?
Appreciate all your wonderful help!
Best,
Bing
Updated by Understanding Society User Support Team 4 days ago
- Category changed from Data inconsistency to Data documentation
- Status changed from New to Feedback
- % Done changed from 0 to 50
- Private changed from Yes to No
Hello Bing,
Individuals who do not match a household are those who did not respond to a household questionnaire and/or were only enumerated. This can be identified by merging with the hhsamp dataset and inspecting the variable ivfho.
For example, in wave 6 (f), there are 2,214 individuals with no corresponding record in hhresp. After merging with hhsamp, 63% (1,393) are classified as f_ivfho = 14 (household grid + individual interview only; no household questionnaire), while the remaining 37% (821) are classified as f_ivfho = 16 (household grid only).
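The check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the three small datasets are toy stand-ins typed in with input (all values are made up), the variable names f_hidp and f_ivfho follow the wave-prefix convention used in this reply, and the tempfile names are hypothetical. With the real files you would replace the toy data with your own indall, hhresp and hhsamp files for the wave.

```stata
* Toy stand-in for f_hhsamp: one row per household with its outcome code.
* Codes 14 and 16 are the ones mentioned in this reply; 10 is only a
* placeholder for a household that does have a household questionnaire.
clear
input long f_hidp byte f_ivfho
101 10
102 14
103 16
end
tempfile hhsamp
save `hhsamp'

* Toy stand-in for f_hhresp: only household 101 answered the questionnaire.
clear
input long f_hidp
101
end
tempfile hhresp
save `hhresp'

* Toy stand-in for f_indall: four individuals in three households.
clear
input long pidp long f_hidp
1 101
2 101
3 102
4 103
end

* Individual-to-household merge; inspect _merge before dropping it.
merge m:1 f_hidp using `hhresp'
tabulate _merge

* Keep only individuals with no hhresp record, then attach the
* household outcome code from hhsamp and tabulate it.
keep if _merge == 1
drop _merge
merge m:1 f_hidp using `hhsamp', keep(master match) keepusing(f_ivfho) nogenerate
tabulate f_ivfho
```

Tabulating _merge before dropping it shows how many individuals are unmatched in each wave, and the final tabulation shows which household outcome codes they fall under.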
I hope this information is helpful.
Best wishes,
Roberto Cavazos
Understanding Society User Support Team