Support #2310
open
Why can't 2000-3000 obs in indall datasets be merged with hhresp datasets?
50%
Description
Hi team,
Thanks very much for your help!
I have cleaned the hhresp and indall datasets and tried to merge them on the hidp variable. My code is below:
global wave a b c d e f g h i j k l m n o
foreach var of global wave {
    // Individual-level file cleaned from indall: I kept only the key variables I need and did not drop or add any obs.
    use "$input_data/`var'_treatment.dta", clear
    // Household-level file cleaned from hhresp: I did not delete or add any obs.
    merge m:1 hidp using "$input_data/`var'_income and poverty.dta"
    drop _merge
}
Stata then reported that around 2000-3000 individual obs (in the individual-level datasets) could not be matched with any household obs in the household-level datasets. This happened in every wave except wave 1.
I am wondering whether I merged them incorrectly. If not, could you explain why this happens?
Appreciate all your wonderful help!
Best,
Bing
Updated by Understanding Society User Support Team 24 days ago
- Category changed from Data inconsistency to Data documentation
- Status changed from New to Feedback
- % Done changed from 0 to 50
- Private changed from Yes to No
Hello Bing,
Individuals who do not match a household are those who did not respond to a household questionnaire and/or were only enumerated. You can identify these cases by merging with the hhsamp dataset and inspecting the variable ivfho.
For example, in wave 6 (f), there are 2,214 individuals with no corresponding record in hhresp. After merging with hhsamp, 63% (1,393) are classified as f_ivfho = 14 (household grid + individual interview only; no household questionnaire), while the remaining 37% (821) are classified as f_ivfho = 16 (household grid only).
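The wave 6 check described above could be sketched in Stata roughly as follows. This is a minimal sketch, not a definitive procedure: it assumes that $input_data contains the released files under the names f_indall.dta, f_hhresp.dta and f_hhsamp.dta, and that the identifiers carry the wave prefix (f_hidp, f_ivfho). If your cleaned files use an unprefixed hidp, as in your loop, adjust the names accordingly.

```stata
* Assumption: $input_data points to the folder holding the wave f files.
* Load only the identifiers from the individual-level file.
use pidp f_hidp using "$input_data/f_indall.dta", clear

* Keep only individuals with no matching household record in hhresp.
merge m:1 f_hidp using "$input_data/f_hhresp.dta", keep(master) nogenerate

* Drop the (all-missing) hhresp variables so they cannot clash with hhsamp.
keep pidp f_hidp

* Attach the household interview outcome from hhsamp.
merge m:1 f_hidp using "$input_data/f_hhsamp.dta", keepusing(f_ivfho) keep(master match) nogenerate

* Distribution of outcomes among the unmatched individuals.
tabulate f_ivfho, missing
```

To run the same check for other waves, the f prefix can be replaced with the loop macro from your own code.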
I hope this information is helpful.
Best wishes,
Roberto Cavazos
Understanding Society User Support Team