Support #1680
open
Merging: Many to one & sorting
Added by Irina Kolegova over 2 years ago.
Updated over 2 years ago.
Category:
Data linkage and consents
Description
Hello,
I wanted to merge youth datasets for Wave 10 (UKHLS), Wave 4 (COVID) and Wave 8 (COVID).
1) Should I use "many to one" merging, "one to one", or "one to many"?
2) Do I need to sort datasets by pidp (and pidp_c) before merging them?
3) How can I combine the variables that repeat in every wave (sex, ethnicity, etc.)?
Thank you
- Private changed from Yes to No
Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.
We aim to respond to simple queries within 48 hours and more complex issues within 7 working days.
Best wishes,
Understanding Society User Support Team
- Status changed from New to In Progress
- Status changed from In Progress to Feedback
- % Done changed from 0 to 50
Dear Irina,
1) All these files are individual-level files in which pidp is a unique identifier, so this is a 1:1 merge.
2) If you are using Stata version 12 or older, you need to sort first. With newer versions of Stata you do not need to sort.
3) In wide format each variable is, by definition, repeated with a wave prefix (a_ for wave 1, b_ for wave 2, and so on) to accommodate the additional time points. For time-invariant variables (e.g. country of birth, sex, ethnicity), however, the values will be the same in every wave. One workaround is to omit these variables when merging and add them after the merge from the xwavedat file (for more information about xwavedat see https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/list-of-data-files-and-their-descriptions). Additionally, for sex specifically, each youth data file contains W_ypsex, which holds the answers to the YPSEX question from the youth questionnaire. Although this would be rare, W_ypsex may vary between waves. We leave the choice of which to use, d_sex or d_ypsex, to users.
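For readers who want to see the logic of the three points above outside Stata, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the rows, the variable names j_ypsex, covid4_ypsex and sex, and the helper functions are all hypothetical and do not come from the actual data files. It checks that pidp is unique in each file (the 1:1 condition), does not depend on the input order, and adds the time-invariant variable once from a stand-in for xwavedat.

```python
def index_by_key(rows, key="pidp"):
    """Index rows by key; raise if the key repeats, since 1:1 needs a unique key."""
    indexed = {}
    for row in rows:
        if row[key] in indexed:
            raise ValueError(f"{key}={row[key]} is duplicated, so a 1:1 merge is not valid")
        indexed[row[key]] = row
    return indexed


def merge_one_to_one(left, right, key="pidp"):
    """Full 1:1 merge of two lists of dict rows; the inputs need not be sorted."""
    left_idx = index_by_key(left, key)
    right_idx = index_by_key(right, key)
    merged = []
    for k in sorted(left_idx.keys() | right_idx.keys()):
        row = {key: k}
        row.update(left_idx.get(k, {}))   # variables from the first file, if present
        row.update(right_idx.get(k, {}))  # variables from the second file, if present
        merged.append(row)
    return merged


# Hypothetical toy rows: only wave-specific variables, time-invariant ones omitted.
wave10_youth = [{"pidp": 2, "j_ypsex": 1}, {"pidp": 1, "j_ypsex": 2}]
covid4_youth = [{"pidp": 1, "covid4_ypsex": 2}, {"pidp": 3, "covid4_ypsex": 1}]
# Hypothetical stand-in for xwavedat: time-invariant variables stored once per person.
xwavedat = [{"pidp": 1, "sex": 2}, {"pidp": 2, "sex": 1}, {"pidp": 3, "sex": 1}]

wide = merge_one_to_one(merge_one_to_one(wave10_youth, covid4_youth), xwavedat)
for row in wide:
    print(row)
```

A person who appears in only one of the files keeps a row with the other file's variables absent, which corresponds to keeping unmatched cases after the merge. Whether to keep or drop those cases is a separate analysis decision.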
Best wishes,
Understanding Society User Support Team
- Status changed from Feedback to Resolved
- % Done changed from 50 to 100