Support #307: Survey design xwaves - Understanding Society User Support

Added by Daisy McGregor over 11 years ago. Updated 16 days ago.

Status: Resolved
Priority: High
Assignee: Understanding Society User Support Team
Category: Data analysis
Start date: 09/30/2014
% Done: 100%

Description

Hi, I have constructed a confidence interval for a linear combination of point estimates across BHPS waves 1, 2 and 3. Specifically, I have calculated the share of mortgagors in arrears in each year, and then taken a simple average over these years. I would like to check my methodology for constructing the confidence interval.

First, I have specified only the cluster variable psu in my survey design, and not the strata, to allow for correlation between clusters across years. I have also used xhwght for each year, but I am worried that this is incorrect.

Then I have used the following command in Stata:
svy, subpop(subpop): mean arrears, over(year)

where subpop is those with tenure==2, arrears is a 1/0 variable, and year takes the values 1991, 1992 or 1993.

I then use the lincom command to take a simple average of the point estimates for the three years, which gives me a confidence interval.
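
For readers following along, the steps described above could be written out roughly as follows. This is only a sketch: it assumes a pooled (long-format) file covering waves 1 to 3 that already holds the variables named in this post (psu, xhwght, subpop, arrears, year). The coefficient names in the lincom line follow the form used by recent Stata releases and are an assumption; older releases label them differently, so substitute whatever the coefficient legend reports.

```stata
* Sketch only: clusters declared, strata deliberately omitted, as in the post
svyset psu [pweight = xhwght]

* Share of mortgagors in arrears in each year
svy, subpop(subpop): mean arrears, over(year)

* Replay the results as a legend to see how the coefficients are named
svy, coeflegend

* Simple average of the three yearly estimates, with its confidence interval
* (coefficient names are an assumption; use those shown by the legend)
lincom (_b[c.arrears@1991.year] + _b[c.arrears@1992.year] + _b[c.arrears@1993.year]) / 3
```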

Is this methodology correct? Should I be using different weights?

Thanks
Daisy

Updated by Peter Lynn over 11 years ago

Daisy,

This all sounds reasonable to me, with one exception. The observations within clusters are not independent. In fact, they will often be the same households in each year. This method of estimating standard errors ignores that, so you will under-estimate the true standard errors. I would conceptualise your target parameter as something like the mean of mean arrears over the 3-year period. So first calculate the mean for each household in the sample (this could be the mean of 3 observations, the mean of 2, or just a single observation), then use svy: mean on that derived variable. Just a suggestion.
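
One way the suggestion above might be sketched in Stata is shown below. It is an illustration under stated assumptions, not a tested recipe: hh_id is a hypothetical identifier that links the same household across waves, and wt is a placeholder for whichever weight turns out to be appropriate. Neither is named in this thread. The sketch also assumes a pooled file restricted to waves 1 to 3, and that psu and wt can be taken from each household's earliest row.

```stata
* Sketch only. hh_id and wt are hypothetical names, not variables from the thread.

* Flag the observations in which the household is a mortgagor
generate byte inpop = (tenure == 2)

* Household-level mean of the 0/1 arrears indicator over those observations
* (a mean of 3, 2 or 1 observations, depending on the household)
bysort hh_id: egen mean_arrears = mean(cond(inpop == 1, arrears, .))

* Households that are mortgagors in at least one of the years
bysort hh_id: egen byte ever_inpop = max(inpop)

* Keep one row per household (the earliest year observed)
bysort hh_id (year): keep if _n == 1

* Estimate the mean of the household means, using a subpopulation
* indicator rather than dropping the other households
svyset psu [pweight = wt]
svy, subpop(ever_inpop): mean mean_arrears
```

Collapsing to one row per household means each household contributes a single observation, so the repeated measurements no longer enter the variance calculation as if they were independent.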

Best wishes,

Peter
