Support #2215
PSU, Strata
10%
Description
I read through the documentation on the Understanding Society web page about the data design and data collection; however, I need some clarification, if possible. :)
The variables _psu and _strata introduce some kind of nested hierarchy across individuals; however, I'm not fully sure of the order of the hierarchy. Will an individual in _psu equal to i be in _strata equal to j, or the opposite: will an individual in _strata equal to j be in _psu equal to i? In plain words, do we have a school, class, student structure with respect to _psu, _strata and pidp?
Also, I've read that _psu corresponds to the postal code sector; however, it takes values from 1 up to 50000+. How can those values be interpreted as postal codes?
Best,
Ioannis
Updated by Understanding Society User Support Team about 1 month ago
- Category set to Survey design
- Status changed from New to In Progress
- Assignee changed from Understanding Society User Support Team to Olena Kaminska
- % Done changed from 0 to 10
- Private changed from Yes to No
Updated by Olena Kaminska about 1 month ago
Ioannis,
Thank you for your question. The data structure within one wave is PSU, household, person, and in a longitudinal context it is PSU, person, time-point observation.
We use implicit stratification. Depending on the statistical model, it may not be possible to account for stratification. Stratification gains statistical efficiency, hence making your CIs narrower. If the model can't take stratification into account, your results will still be valid, but conservative.
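As a rough illustration of the efficiency point, here is a minimal sketch comparing the variance of a sample mean with and without stratification. Everything in it is an assumption made for illustration: the toy population, the stratum names, explicit strata with proportional allocation, sampling with replacement, and no clustering. It is not the Understanding Society design, which uses implicit stratification.

```python
from statistics import pvariance

# Hypothetical toy population: outcome values grouped by stratum.
population = {
    "stratum_a": [10.0, 12.0, 11.0, 13.0],
    "stratum_b": [20.0, 22.0, 21.0, 23.0],
}


def variance_ignoring_strata(strata, n):
    """Variance of the sample mean when strata are ignored.

    Assumes simple random sampling with replacement, so the
    variance is the overall population variance divided by n.
    """
    values = [y for ys in strata.values() for y in ys]
    return pvariance(values) / n


def variance_with_strata(strata, n):
    """Variance of the sample mean under proportional allocation.

    Only the within-stratum variance contributes, weighted by
    each stratum's share of the population.
    """
    total = sum(len(ys) for ys in strata.values())
    within = sum(len(ys) / total * pvariance(ys) for ys in strata.values())
    return within / n


n = 4  # hypothetical sample size
v_ignored = variance_ignoring_strata(population, n)
v_stratified = variance_with_strata(population, n)
print(v_ignored, v_stratified)
```

In this toy case the variance that ignores the strata is larger, because it also includes the between-stratum variation. That is the sense in which an analysis without stratification is valid but conservative.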
Hope this helps,
Olena
Updated by Ioannis Rotous about 1 month ago
Hi again,
I use a Bayesian model, so I can include PSU as one random effect and STRATA as another random effect. But is STRATA a finer or a coarser representation of the data compared to PSU? From my understanding it is coarser.
Best,
Ioannis
Updated by Olena Kaminska about 1 month ago
Strata are not nested within PSUs, and they are not used as clusters in the design. Of course, you could define your clusters based on strata information if your research design aligns with the definition of our stratification, but more likely you would use other variables for this. In such a situation you may need cross-classified modelling.
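Whether two grouping variables are nested or crossed can be checked directly in the data before choosing a model. The sketch below is one way to do that, under stated assumptions: the helper names `is_nested` and `is_crossed` are hypothetical, the rows are made-up toy records rather than survey values, and `psu` and `strata` stand in for the wave-prefixed variables.

```python
from collections import defaultdict


def is_nested(rows, child, parent):
    """True if every value of `child` occurs with exactly one value of `parent`."""
    parents_seen = defaultdict(set)
    for row in rows:
        parents_seen[row[child]].add(row[parent])
    return all(len(parents) == 1 for parents in parents_seen.values())


def is_crossed(rows, a, b):
    """True if neither grouping variable is nested within the other."""
    return not is_nested(rows, a, b) and not is_nested(rows, b, a)


# Hypothetical toy records in which each psu falls in a single stratum.
nested_rows = [
    {"pidp": 1, "psu": 101, "strata": 1},
    {"pidp": 2, "psu": 101, "strata": 1},
    {"pidp": 3, "psu": 102, "strata": 1},
    {"pidp": 4, "psu": 203, "strata": 2},
]

# Hypothetical toy records in which psu 101 appears in two strata.
crossed_rows = nested_rows + [{"pidp": 5, "psu": 101, "strata": 2}]

print(is_nested(nested_rows, "strata", "psu"))  # strata within psu?
print(is_nested(nested_rows, "psu", "strata"))  # psu within strata?
print(is_crossed(crossed_rows, "psu", "strata"))
```

If `is_crossed` returns True for the two grouping variables you plan to use, two separate random effects in a cross-classified specification are the natural choice rather than a strictly nested hierarchy.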
Hope this helps,
Olena