Support #1486

Constructing a reference group using weights

Added by Leilah Plant-Tchenguiz about 1 month ago. Updated 24 days ago.

Status: Resolved
Priority: Normal
Assignee: -
Start date: 01/20/2021
% Done: 100%


Description

Hi, I am trying to construct an income reference group for each individual in each wave but am having issues incorporating the population weights into the Stata code. I would like each individual's reference income to be the mean income for people of the same sex, region and education level.

The code I have written is:
svyset, clear
svyset psu [pweight=g_indscus_lw], strata(strata)
svy: bysort gor_dv sex hiqual_dv: egen refgrs1 = mean(fimngrs_dv) if fimngrs_tc==0

However, Stata cannot run the last line because of the 'svy' prefix. How can I resolve this issue?

#1

Updated by Gundi Knies about 1 month ago

  • Status changed from New to Resolved

Hi Leilah,
to generate the mean in a group you do not need to use the svy suite of commands.
Simply generate a new variable that contains the possible combinations of gor_dv sex hiqual_dv using commands like egen group(), and then summarize the values by this new variable applying the weights:

gen x_refgr=.
forvalues v=1(1)N {
sum x [aw=weight] if refgroup==`v'
replace x_refgr=r(mean) if refgroup==`v'
}

#2

Updated by Leilah Plant-Tchenguiz about 1 month ago

Great, thank you for your help!

#3

Updated by Gundi Knies about 1 month ago

  • Assignee set to Leilah Plant-Tchenguiz

Hi Leilah,
to generate the mean in a group you do not need to use the svy suite of commands.
Simply generate a new variable that contains the possible combinations of gor_dv sex hiqual_dv using commands like egen group(), and then summarize the values by this new variable applying the weights:

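* x = income variable, weight = weight variable (placeholders)
egen refgroup = group(gor_dv sex hiqual_dv)
quietly summarize refgroup
local N = r(max)   // number of reference groups
gen x_refgr=.
lab var x_refgr "Mean value for reference group"
forvalues v=1/`N' {
    quietly summarize x [aw=weight] if refgroup==`v'
    replace x_refgr=r(mean) if refgroup==`v'
}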

Each person in refgroup v (with values 1 through N) should then have the mean of their refgroup assigned to them in variable x_refgr.
Hope this helps,

Gundi

PS. Please ignore my previous reply. I hit the wrong key and it was posted prematurely.
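
The same weighted group mean can also be computed without a loop. The following is only a sketch on a made-up toy dataset: the variable names sex, region, income and wt and all the values are hypothetical, not Understanding Society variables. It builds the weighted mean by hand as sum(weight*income)/sum(weight) within each reference group.

```stata
clear
* Toy data (hypothetical values, for illustration only)
input sex region income wt
1 1 100 1
1 1 200 3
2 1 300 2
2 1 500 2
1 2 400 1
end

* One identifier per combination of the grouping variables
egen refgroup = group(sex region)

* Weighted mean = sum(w*x) / sum(w) within each reference group
gen double wx = wt * income
bysort refgroup: egen double sum_wx = total(wx)
bysort refgroup: egen double sum_w  = total(wt)
gen double income_ref = sum_wx / sum_w

list sex region income wt income_ref, sepby(refgroup)
```

In the toy data, the group with sex 1 in region 1 gets (100*1 + 200*3)/(1 + 3) = 175. With real data you would probably want to restrict both sums to observations where income and the weight are non-missing (and, as in the original question, where the income is not top-coded).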

#4

Updated by Leilah Plant-Tchenguiz about 1 month ago

Thanks

#5

Updated by Alita Nandi about 1 month ago

  • % Done changed from 0 to 100

There are multiple methods. For example, you can also use collapse:

* Open your data first
save temp, replace // collapse replaces the data in memory, so save the existing data first
* Store the group mean under a new name so that it does not clash with fimngrs_dv when merging
collapse (mean) refgrs1=fimngrs_dv [pweight=g_indscus_lw] if fimngrs_tc==0, by(gor_dv sex hiqual_dv)
merge 1:m gor_dv sex hiqual_dv using temp

#6

Updated by Alita Nandi about 1 month ago

  • Private changed from Yes to No
#7

Updated by Alita Nandi about 1 month ago

  • Assignee deleted (Leilah Plant-Tchenguiz)
#8

Updated by Leilah Plant-Tchenguiz 26 days ago

Hi Gundi, why do you use an aweight? I am trying to construct the mean income of a reference group weighted by population.

#9

Updated by Leilah Plant-Tchenguiz 26 days ago

In addition, how would I know which population weight to choose for the mean income calculation? When calculating mean income for each reference group, I would like reference groups to consist of people of the same sex and age in the same region and year. The dataset I am using to calculate this consists of only waves 1-7. Can I use a cross-sectional weight for a pooled sample from waves 1-7, or is this not valid because I am splitting the data by year?

#10

Updated by Gundi Knies 24 days ago

  • Assignee set to Leilah Plant-Tchenguiz
  • Category set to Data analysis

Hi Leilah,
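as you are looking for the population mean, you can use summarize, which allows aweight but not pweight. Alternatively, use mean with pweight. For a weighted mean the two give the same point estimate; aweights and pweights differ only in the standard errors. The Stata user fora have quite a few queries and responses on this.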
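
As a quick check of the point about summarize with aweight versus mean with pweight, here is a minimal sketch using the auto dataset that ships with Stata. Treating the car's weight variable as if it were a survey weight is purely an assumption for illustration; it has no substantive meaning.

```stata
* Illustration only: "weight" (car weight) stands in for a survey weight
sysuse auto, clear

* Weighted mean via summarize with an analytic weight
quietly summarize price if foreign==1 [aweight=weight]
scalar m_aw = r(mean)

* Weighted mean via mean with a probability weight
quietly mean price if foreign==1 [pweight=weight]
scalar m_pw = _b[price]

* The two point estimates should agree up to rounding error
display "aweight: " m_aw "   pweight: " m_pw
assert reldif(m_aw, m_pw) < 1e-8
```

Only the standard errors reported by mean depend on the weight type, and those are not used when the mean is simply assigned to each person as a reference income.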

Regarding the reference income and which weights to use, most applications I have seen have used present (i.e., current year or quarter) or past (i.e., typically, last year or quarter) income as reference income. Assuming that you'd be looking to follow in the footsteps of that type of work, to predict incomes in a particular calendar year using Understanding Society, you'd be pooling data from two consecutive waves (e.g., W1Y2 and W2Y1 to get estimates for 2010). You'd use re-scaled cross-sectional weights for this. Please see https://iserredex.essex.ac.uk/support/issues/494 for further guidance on pooled cross-sectional analyses.

Best wishes,
Understanding Society User Support

#11

Updated by Leilah Plant-Tchenguiz 24 days ago

Thank you.
