## Support #1486

### Constructing a reference group using weights


**Description**

Hi, I am trying to construct an income reference group for each individual in each wave but am having issues incorporating the population weights into the Stata code. I would like each individual's reference income to be the mean income for people of the same sex, region and education level.

The code I have written is:

svyset, clear

svyset psu [pweight=g_indscus_lw], strata(strata)

svy: bysort gor_dv sex hiqual_dv: egen refgrs1 = mean(fimngrs_dv) if fimngrs_tc==0

However, Stata cannot run the last line because of the 'svy' prefix. How can I resolve this issue?

#### Updated by Gundi Knies 8 months ago

**Status** changed from *New* to *Resolved*

Hi Leilah,

to generate the mean in a group you do not need to use the svy suite of commands.

Simply generate a new variable that contains the possible combinations of gor_dv sex hiqual_dv using commands like egen group(), and then summarize the values by this new variable applying the weights:

gen x_refgr=.

forvalues v=1(1)N {

sum x [aw=weight] if refgroup==`v'

replace x_refgr=r(mean) if refgroup==`v'

}

#### Updated by Gundi Knies 8 months ago

**Assignee** set to *Leilah Plant-Tchenguiz*

Hi Leilah,

to generate the mean in a group you do not need to use the svy suite of commands.

Simply generate a new variable that contains the possible combinations of gor_dv sex hiqual_dv using commands like egen group(), and then summarize the values by this new variable applying the weights:

gen x_refgr=.

lab var x_refgr "Mean value for reference group"

forvalues v=1(1)N {

sum x [aw=weight] if refgroup==`v'

replace x_refgr=`r(mean)' if refgroup==`v'

}

Each person in refgroup v (with values 1 through N) should then have the mean of their refgroup assigned to them in variable x_refgr.
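To make the loop concrete, here is a minimal sketch using the variable names from the original question. The data are made-up toy values, and deriving N from the largest value of refgroup is an assumption about how the groups were built (with egen group(), which numbers them 1 through N).

```stata
* Toy data standing in for one wave of the survey (all values are made up)
clear
input gor_dv sex hiqual_dv fimngrs_dv g_indscus_lw fimngrs_tc
1 1 1 1000 1 0
1 1 1 2000 3 0
1 2 1 1500 2 0
1 2 1 2500 2 0
1 2 1 9000 1 1
end

* One integer id per combination of region, sex and qualification
egen refgroup = group(gor_dv sex hiqual_dv)

* egen group() numbers the groups 1..N, so N is the largest id
summarize refgroup, meanonly
local N = r(max)

gen refgrs1 = .
label variable refgrs1 "Weighted mean income of reference group"

forvalues v = 1/`N' {
    * Weighted mean over non-top-coded incomes in group v
    quietly summarize fimngrs_dv [aw=g_indscus_lw] if refgroup == `v' & fimngrs_tc == 0
    * Assign it to the same observations that entered the mean
    replace refgrs1 = r(mean) if refgroup == `v' & fimngrs_tc == 0
}
```

With these toy values the two groups get 1750 and 2000, and the top-coded row keeps a missing refgrs1, which mirrors the `if fimngrs_tc==0` condition in the original attempt.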

Hope this helps,

Gundi

PS. Please ignore my previous reply. I hit the wrong key and it was posted prematurely.

#### Updated by Alita Nandi 8 months ago

**% Done** changed from *0* to *100*

There are multiple methods. For example, you can also use collapse:

Open your data

save temp, replace // collapse changes the dataset, so you will have to save the existing data first

use temp, clear

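collapse (mean) refgrs1=fimngrs_dv [pweight=g_indscus_lw] if fimngrs_tc==0, by(gor_dv sex hiqual_dv) // store the group mean under a new name so that each person's own fimngrs_dv is kept after the merge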

merge 1:m gor_dv sex hiqual_dv using temp
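The collapse-and-merge route can be sketched end to end as follows. This is an illustration on made-up toy data, not a definitive recipe. The tempfile name `individuals` is hypothetical, and the group mean is stored under the new name refgrs1 (taken from the original question) because merge keeps the master's values when a variable exists in both files.

```stata
* Toy data standing in for one wave of the survey (all values are made up)
clear
input gor_dv sex hiqual_dv fimngrs_dv g_indscus_lw fimngrs_tc
1 1 1 1000 1 0
1 1 1 2000 3 0
1 2 1 1500 2 0
1 2 1 2500 2 0
end

* Keep a copy of the individual-level data; collapse replaces the data in memory
tempfile individuals
save `individuals'

* One row per reference group, holding the weighted mean income
collapse (mean) refgrs1 = fimngrs_dv [pweight=g_indscus_lw] if fimngrs_tc == 0, ///
    by(gor_dv sex hiqual_dv)

* Attach the group mean back to every individual in that group
merge 1:m gor_dv sex hiqual_dv using `individuals'
```

After the merge, _merge==2 would flag individuals whose group had no non-top-coded income to average. In this toy example every row matches.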

#### Updated by Leilah Plant-Tchenguiz 8 months ago

Hi Gundi, why do you use an aweight? I am trying to construct the mean income of a reference group weighted by population.

#### Updated by Leilah Plant-Tchenguiz 8 months ago

In addition, how would I know which population weight to choose for the mean income calculation? When calculating mean income for each reference group, I would like reference groups to consist of people of the same sex and age in the same region and year. The dataset I am using to calculate this consists of only waves 1-7. Can I use a cross-sectional weight for a pooled sample from waves 1-7, or is this not valid because I am splitting the data by year?

#### Updated by Gundi Knies 8 months ago

**Category** set to *Data analysis*

**Assignee** set to *Leilah Plant-Tchenguiz*

Hi Leilah,

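as you are looking for the population mean, you can use summarize, which allows aweight but not pweight. For a mean, aweights and pweights give the same point estimate; they differ only in the standard errors, which are not needed here. Alternatively, use mean with pweight. The Stata user fora have quite a few queries and responses on this.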
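A quick check on made-up data shows the two routes side by side: summarize with aweight and mean with pweight. The variable names follow the thread, while the values and the scalar names m_aw and m_pw are purely illustrative.

```stata
* Toy incomes and weights (all values are made up)
clear
input fimngrs_dv g_indscus_lw
1000 1
2000 3
1500 2
end

* Route 1: summarize accepts aweights
quietly summarize fimngrs_dv [aw=g_indscus_lw]
scalar m_aw = r(mean)

* Route 2: mean accepts pweights; the point estimate is stored in e(b)
quietly mean fimngrs_dv [pw=g_indscus_lw]
matrix b = e(b)
scalar m_pw = b[1,1]

display "aweight mean: " m_aw "   pweight mean: " m_pw
```

Both scalars hold the same weighted mean, so either command can supply the reference income. The choice of weight type only matters if you also need standard errors.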

Regarding the reference income and which weights to use, most applications I have seen have used present (i.e., current year or quarter) or past (i.e., typically, last year or quarter) income as reference income. Assuming that you'd be looking to follow in the footsteps of that type of work, to predict incomes in a particular calendar year using Understanding Society, you'd be pooling data from two consecutive waves (e.g., W1Y2 and W2Y1 to get estimates for 2010). You'd use re-scaled cross-sectional weights for this. Please see https://iserredex.essex.ac.uk/support/issues/494 for further guidance on pooled cross-sectional analyses.

Best wishes,

Understanding Society User Support