Support #1904

Using weights when variables have some or many "missing value" codes or NAs due to missing household-level data

Added by Richard Belcher about 1 year ago. Updated 7 months ago.

Status: Resolved
Priority: Normal
Category: Weights
Start date: 05/16/2023
% Done: 100%


Description

Dear Olena,

I am running a pooled cross-sectional, individual-level analysis (waves 1-9) using cross-sectional weights, but I am worried that, by removing cases where SF-12 responses are -9, my sample will no longer be nationally representative.

I have selected weights that are appropriate for how the questions behind my variables were administered. For example, I am using the self-completion questionnaire cross-sectional weights (waves 2+), taking the appropriate weight for each wave, because SF-12 is the dependent variable in my later models. After aggregating the cross-wave data, a number of cases with non-zero weights have missing-value codes (or are NA because I merged in household-level data, which is occasionally not collected). It is understandable that errors and non-response happen during the survey process. Am I safe to assume that some of this missingness is random, e.g. that the absence of a household interview is random, so that removing responses without that information would not affect the weighting? I am worried, however, that some of it may not be random: there may be a demographic or regional bias in the -9 codes for the SF-12 variable, which would prevent my sample from being nationally representative when weighted. After removing the affected cases, 96% of the non-zero-weighted sample remains; most of the reduction (3%) comes from "sf12mcs_dv" responses with the code -9.
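To make the concern concrete, here is a minimal sketch of the check I have in mind: compare the weighted share of each group before and after dropping the -9 cases. The toy records and the column names "region" and "weight" are hypothetical placeholders, not real Understanding Society variable names; only "sf12mcs_dv" and the -9 code come from my data.

```python
from collections import defaultdict

MISSING_CODE = -9  # missing-value code on sf12mcs_dv mentioned above


def weighted_shares(rows, group_key, weight_key):
    """Return each group's share of the total weight."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[weight_key]
    grand_total = sum(totals.values())
    return {group: total / grand_total for group, total in totals.items()}


# Hypothetical toy records; "region" and "weight" are illustrative names only.
rows = [
    {"region": "North", "weight": 1.0, "sf12mcs_dv": 48.2},
    {"region": "North", "weight": 1.0, "sf12mcs_dv": -9},
    {"region": "South", "weight": 2.0, "sf12mcs_dv": 51.0},
    {"region": "South", "weight": 1.0, "sf12mcs_dv": 39.5},
]

# Keep only cases with a non-zero weight, then drop the missing SF-12 cases.
weighted = [r for r in rows if r["weight"] > 0]
complete = [r for r in weighted if r["sf12mcs_dv"] != MISSING_CODE]

before = weighted_shares(weighted, "region", "weight")
after = weighted_shares(complete, "region", "weight")

# Change in weighted share per group once the incomplete cases are removed.
shift = {group: after.get(group, 0.0) - before[group] for group in before}
print(shift)
```

A shift close to zero for every group would suggest that dropping the cases does little to the weighted composition on that characteristic; a large shift would suggest the missingness is not random with respect to it. This only checks the characteristics I choose to compare, so it cannot show that the missingness is random in general.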

Thanks for your help,

All the best,

Richard
