Support #1080: Weights for pooled cross-sectional analysis - accounting for clustering - Understanding Society User Support

Added by Lewis Anderson over 6 years ago. Updated almost 2 years ago.

Status: Resolved

Priority: Normal

Assignee: Lewis Anderson

Category: Weights

Start date: 10/21/2018

% Done: 100%

Description

Dear Support Team,

This can be seen as a follow-up to #758, which presents a similar problem.

I am trying to explore the cross-sectional association between two time-varying variables (a value of interest of one of the variables is relatively rare). To do this I would like to pool data from the various waves of Understanding Society.

In comment #7 on #758 Nico Ochmann writes: "I run logrealhourlywage on x1 x2 [pw=newwgt], cluster(pidp) / Is this reasonable or am I still completely off?", to which Peter Lynn replies "Looks fine!".

However, I would also like to account for the survey design by using svyset in Stata, and svyset does not allow the cluster option. Is there a straightforward way around this? Or is it not possible to cluster on pidp because I am effectively already clustering on psu by specifying: svyset psu [pweight=weight_indsc_xw], strata(strata) singleunit(scaled), where weight_indsc_xw is a_indscus_xw from wave 1, b_indscub_xw from wave 2, etc.? Is it in fact satisfactory to cluster on the higher level (PSU) and ignore clustering within individuals at the lower level?
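For concreteness, the pooled setup described above might look like the following sketch. It assumes a long-format pooled file in which a hypothetical wave indicator `wave` exists and the wave-specific weights a_indscus_xw and b_indscub_xw are both present; `y`, `x1` and `x2` are placeholder variable names, not Understanding Society variables.

```stata
* Sketch only: "wave", "y", "x1" and "x2" are hypothetical names.
* Build one pooled cross-sectional weight from the wave-specific weights.
generate double weight_indsc_xw = .
replace weight_indsc_xw = a_indscus_xw if wave == 1
replace weight_indsc_xw = b_indscub_xw if wave == 2
* ... repeat for later waves ...

* Declare the design: PSU as the cluster, with strata and the pooled weight.
svyset psu [pweight = weight_indsc_xw], strata(strata) singleunit(scaled)

* Design-based estimation on the pooled person-wave observations.
svy: regress y x1 x2
```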

Or would it be better to run this as a multilevel model, with observations clustered in individuals, and individuals (in households, and households) in PSUs? According to the Stata help file for mixed, and the parts of the Stata Reference Manual to which it refers, this raises a few difficulties with regard to sampling weights:

"...it is not sufficient to use the single sampling weight w_ij, because weights enter into the log likelihood at both the group level and the individual level. Instead, what is required for a two-level model under this sampling design is w_j, the inverse of the probability that group j is selected in the first stage, and w_i|j, the inverse of the probability that individual i from group j is selected at the second stage conditional on group j already being selected."
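In symbols, the passage asks for the overall weight to be split into two stage-specific factors. A minimal sketch, assuming a simple two-stage design; the selection probabilities \pi_j and \pi_{i|j} are notation introduced here for illustration and do not appear in the quoted text:

```latex
\[
  w_j = \frac{1}{\pi_j}, \qquad
  w_{i|j} = \frac{1}{\pi_{i|j}}, \qquad
  w_{ij} = \frac{1}{\pi_j \, \pi_{i|j}} = w_j \, w_{i|j}
\]
```

The difficulty is that a single released weight corresponds to the product w_{ij} and does not by itself identify the two factors.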

Any help much appreciated.
Regards,
Lewis

Updated by Stephanie Auty over 6 years ago

Updated by Peter Lynn over 6 years ago

Updated by Lewis Anderson over 6 years ago

Updated by Stephanie Auty over 6 years ago

Updated by Understanding Society User Support Team almost 2 years ago