Support #2267

open

Question on Merging and Weighting with R - Understanding Society Calendar Year 2022

Added by Balsam Gharib 5 days ago. Updated about 4 hours ago.

Status: Feedback
Priority: High
Category: Weights
Start date: 07/30/2025
% Done: 80%

Description

Hello,

I am conducting a comparative study on household conditions in London and the South West, using the calendar year 2022 dataset available via the UK Data Service (Open Access version). I would like to double-check that I have correctly implemented the merging and survey weighting procedures to ensure a representative sample.

My unit of analysis is the individual, but I also need to incorporate household income, so I attempted to merge the indresp and hhresp files as follows:

merged_data <- merge(individual_data, household_data, by = "lmn_hidp")
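
To check that this merge neither drops nor duplicates individuals, here is a minimal sketch on toy data. Only lmn_hidp comes from my actual code; the values, the pidp person identifier column, and the hh_income variable are assumptions made up for illustration.

```r
# Toy data: values are invented; pidp and hh_income are hypothetical columns.
individual_data <- data.frame(
  pidp     = c(1, 2, 3, 4),
  lmn_hidp = c(10, 10, 20, 30)
)
household_data <- data.frame(
  lmn_hidp  = c(10, 20, 30),
  hh_income = c(2500, 1800, 3200)
)

# A many-to-one merge needs exactly one row per household in the household file.
stopifnot(!anyDuplicated(household_data$lmn_hidp))

# all.x = TRUE keeps every individual, even if no household record matches.
merged_data <- merge(individual_data, household_data,
                     by = "lmn_hidp", all.x = TRUE)

# The merged file should have the same number of rows as the individual file.
stopifnot(nrow(merged_data) == nrow(individual_data))
```

With all.x = TRUE, any individual without a matching household record would appear with NA household values rather than being silently dropped, which makes unmatched cases easier to spot.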

I then constructed the survey design object in R using the survey package as follows:

design <- svydesign(
  id = ~lmn_psu,              # accounts for clustering
  strata = ~lmn_strata,       # stratification
  weights = ~lmn_inding2_xw,  # the only cross-sectional weight I found for the main individual interview
  data = mydata,
  nest = TRUE
)

I would be grateful if you could confirm:

Is this the correct approach for merging and weighting when conducting individual-level analysis that includes household-level variables?

Is the use of lmn_inding2_xw appropriate for generating representative estimates for calendar year 2022?

Can I assume that the results produced using svytable() or svymean() with this design object are representative of the UK population for 2022?

As an example, I am using the following line to get the weighted sample distribution across regions (with regional_breakdown being a recode of lmn_gor_dv):

svytable(~regional_breakdown, design)
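
As a sanity check on the point estimates, I understand the weighted counts should agree with summing the weights within each category in base R. Here is a minimal sketch on toy data, with invented weights and region labels. It gives point estimates only and ignores the clustering and stratification that the design object uses for standard errors.

```r
# Toy data: values are invented; only the column names follow my code above.
mydata <- data.frame(
  regional_breakdown = c("London", "London", "South West", "Other"),
  lmn_inding2_xw     = c(1.5, 0.5, 1.0, 2.0)
)

# Weighted counts per region: the sum of the weights within each category.
weighted_counts <- xtabs(lmn_inding2_xw ~ regional_breakdown, data = mydata)

# Weighted shares of the sample across regions (these sum to 1).
weighted_shares <- prop.table(weighted_counts)
print(round(weighted_shares, 3))
```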

I appreciate any feedback you can provide. Thank you in advance!
