Support #1080

Weights for pooled cross-sectional analysis - accounting for clustering

Added by Lewis Anderson over 5 years ago. Updated 8 months ago.

Status: Resolved
Priority: Normal
Category: Weights
Start date: 10/21/2018
% Done: 100%


Description

Dear Support Team,

This can be seen as a follow-up to #758, which presents a similar problem.

I am trying to explore the cross-sectional association between two time-varying variables, where the value of interest for one of the variables is relatively rare. To get enough cases, I would like to pool data from the various waves of Understanding Society.

In comment #7 on #758 Nico Ochmann writes: "I run logrealhourlywage on x1 x2 [pw=newwgt], cluster(pidp) / Is this reasonable or am I still completely off?", to which Peter Lynn replies "Looks fine!".

However, I would also like to account for the survey design by using svyset in Stata, and svyset does not allow the cluster option. Is there a straightforward way around this? Or is it not possible to cluster on pidp because I am effectively already clustering on psu by specifying:

svyset psu [pweight=weight_indsc_xw], strata(strata) singleunit(scaled)

Here weight_indsc_xw is a_indscus_xw from wave 1, b_indscub_xw from wave 2, etc. Is it in fact satisfactory to cluster at the higher level (PSU) and ignore clustering within individuals at the lower level?
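To make the setup concrete, here is a minimal sketch of how the pooled weight and the survey declaration might look. It assumes a long-format pooled file with one row per person-wave, a hypothetical wave indicator called wave, and hypothetical analysis variables y, x1 and x2; none of these names come from the data documentation, and the wave-specific weights are assumed to be already present in the pooled file.

```stata
* Sketch only, under the assumptions stated above.
* "wave", "y", "x1" and "x2" are hypothetical names used for illustration.

* Build a single pooled weight from the wave-specific cross-sectional weights
generate double weight_indsc_xw = .
replace weight_indsc_xw = a_indscus_xw if wave == 1
replace weight_indsc_xw = b_indscub_xw if wave == 2
* ... later waves would follow the same pattern

* Declare the design: psu as the sampling unit, stratified by strata
svyset psu [pweight=weight_indsc_xw], strata(strata) singleunit(scaled)

* Run the pooled model with design-based standard errors
svy: regress y x1 x2
```

Whether the PSU-level variance estimation in this sketch adequately covers the repeated observations on each pidp is exactly the point I am unsure about.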

Alternatively, would it be better to run this as a multilevel model, with observations clustered in individuals, and individuals (in households, and households) in PSUs? According to the Stata help file for mixed, and the parts of the Stata Reference Manual to which it refers, this raises a few difficulties with regard to sampling weights:

"...it is not sufficient to use the single sampling weight w_{ij}, because weights enter into the log likelihood at both the group level and the individual level. Instead, what is required for a two-level model under this sampling design is w_j, the inverse of the probability that group j is selected in the first stage, and w_{i|j}, the inverse of the probability that individual i from group j is selected at the second stage conditional on group j already being selected."
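As I understand the passage, the single weight is the product of the two stage-level weights, so it cannot be supplied at both levels without being split. In the sketch below, the selection probabilities \pi_j and \pi_{i|j} are my own notation, not the manual's, and w_{i|j} denotes the conditional second-stage weight described in the quote:

```latex
\[
  w_j = \frac{1}{\pi_j}, \qquad
  w_{i|j} = \frac{1}{\pi_{i|j}}, \qquad
  w_{ij} = \frac{1}{\pi_j \, \pi_{i|j}} = w_j \, w_{i|j}
\]
```

The difficulty is that the released data give me only a weight of the w_{ij} kind, not the two separate components.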

Any help much appreciated.
Regards,
Lewis
