Support #1484

Covid-19 Survey Wave e weights

Added by JINGYA ZENG about 1 month ago. Updated about 1 month ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Start date:
01/19/2021
% Done:

100%


Description

Hi,

I would like to explore the relationship between internet use (ce_netpusenew) and mental health status (ce_scghq1_dv) using the wave e data of the COVID-19 survey. The regression is carried out in Stata version 16 using the svy commands and survey weights. I use the following commands:

  • replace ce_scghq1_dv=. if ce_scghq1_dv<0
  • replace ce_netpusenew=. if ce_netpusenew<0
  • svyset psu [pweight=ce_betaindin_xw], strata(strata) singleunit(centered)
  • svy: reg ce_scghq1_dv i.ce_netpusenew

The problem is that the estimated population size (10,198) is smaller than the number of observations (12,391).

I am wondering if I did something wrong. I assumed that the population size should be bigger than the number of observations.

Thanks for your help and looking forward to your reply.

Best wishes,
Jingya


Files

Results.png (92.8 KB), JINGYA ZENG, 01/20/2021 03:58 PM
#1

Updated by Alita Nandi about 1 month ago

  • % Done changed from 0 to 50
  • Assignee set to Alita Nandi
  • Status changed from New to Feedback

Hello,

I repeated your code and then estimated your model without weights (Option 1), with weights and survey design (Option 2), and with just weights (Option 3), but I did not find the same number of observations as you.

use "$m/covid19/ce_indresp_w", clear
replace ce_scghq1_dv=. if ce_scghq1_dv<0
replace ce_netpusenew=. if ce_netpusenew<0

// Option 1
reg ce_scghq1_dv i.ce_netpusenew
// Option 2
svyset psu [pweight=ce_betaindin_xw], strata(strata) singleunit(centered)
svy: reg ce_scghq1_dv i.ce_netpusenew
// Option 3
reg ce_scghq1_dv i.ce_netpusenew [pw=ce_betaindin_xw]

Option 1: No. of obs = 12403
Option 2: No. of obs = 12391
Option 3: No. of obs = 10256

These numbers are as we would expect. In Option 3, Stata ignores cases with zero weights, while in Option 2 it does not. Because some cases have zero weights, the number of observations differs between Options 2 and 3.

https://www.stata.com/support/faqs/statistics/svy-and-zero-weights/
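
One way to check this on the data is to count the zero-weight cases inside the Option 2 estimation sample. This is only a minimal sketch: it assumes the same file path and recodes as above, and it relies on e(sample), which Stata sets after svy: reg to flag the observations used. If the explanation holds, the zero-weight count should roughly account for the gap between Options 2 and 3 (12,391 - 10,256 = 2,135).

```stata
* Load and recode as in the commands above
use "$m/covid19/ce_indresp_w", clear
replace ce_scghq1_dv=. if ce_scghq1_dv<0
replace ce_netpusenew=. if ce_netpusenew<0

* Option 2: weights plus survey design
svyset psu [pweight=ce_betaindin_xw], strata(strata) singleunit(centered)
svy: reg ce_scghq1_dv i.ce_netpusenew

* Cases kept by svy: reg that carry a zero weight
count if e(sample) & ce_betaindin_xw == 0

* Cases kept by svy: reg that carry a positive weight
count if e(sample) & ce_betaindin_xw > 0 & !missing(ce_betaindin_xw)
```

The two counts together should equal the number of observations reported for Option 2.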

#2

Updated by JINGYA ZENG about 1 month ago

Hi Alita,

Many thanks for your reply. Please find my result table in the attachment. The commands that I mentioned give this result, where the No. of obs is exactly the same as in Option 2 (12,391). Here are the first two lines of the result.

Number of strata = 1,587        Number of obs   = 12,391
Number of PSUs   = 3,458        Population size = 10,198.388

The problem is that the estimate of population size (10,198) is smaller than the No. of obs (12,391). It is even smaller than the No. of obs in Option 3 (10,256).

As mentioned in the Understanding Society COVID-19 User Guide: using “cW_betaindin_xw” will provide estimates that are representative of the population of all adults (16+) who were resident in private households in the UK at the time of wave 9, and who did not die or emigrate before the relevant web survey. So I assumed that the estimated population size would be larger, reflecting the population that the weights represent. Have I misinterpreted something?

Looking forward to your reply.

Best wishes,
Jingya

#3

Updated by Alita Nandi about 1 month ago

Sorry, I see what you mean.

The weights are scaled so that they add up to the sample size, not to the size of the UK population. Stata reports the sum of the weights as "Population size", and because some cases have zero weights, this sum is less than the number of observations.
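
This can be checked directly by summing the weights over the estimation sample and comparing the total with what svy: reg reports. The sketch below is illustrative rather than definitive: it assumes the data are already loaded and recoded as in the original post, and it uses the stored results r(sum) from summarize and e(N_pop) and e(N) from the svy estimation.

```stata
* Assumes ce_indresp_w is in memory and negative values are already set to missing
svyset psu [pweight=ce_betaindin_xw], strata(strata) singleunit(centered)
svy: reg ce_scghq1_dv i.ce_netpusenew

* Sum the weights over the observations used in the model
quietly summarize ce_betaindin_xw if e(sample)
display "Sum of weights in estimation sample: " r(sum)

* Values reported in the svy: reg header
display "Population size reported by svy: " e(N_pop)
display "Number of observations: " e(N)
```

If the weights are scaled as described, the first two numbers should agree, and both should be smaller than the number of observations whenever zero-weight cases are present.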

#4

Updated by JINGYA ZENG about 1 month ago

Many thanks for your explanation! It is very helpful.

#5

Updated by Alita Nandi about 1 month ago

  • Private changed from Yes to No
  • % Done changed from 50 to 100
  • Assignee deleted (Alita Nandi)
  • Status changed from Feedback to Resolved
