Support #1680: Merging: Many to one & sorting - Understanding Society User Support

Support #1680

Merging: Many to one & sorting

Added by Irina Kolegova almost 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Urgent
Assignee: Irina Kolegova
Category: Data linkage and consents
Start date: 04/08/2022
% Done: 100%

Description

Hello,

I would like to merge the youth datasets for Wave 10 (UKHLS), Wave 4 (COVID) and Wave 8 (COVID).
1) Should I use a "many to one", "one to one", or "one to many" merge?
2) Do I need to sort the datasets by pidp (and pidp_c) before merging them?
3) How can I combine the variables that are repeated in every wave (sex, ethnicity, etc.)?

Thank you

Updated by Understanding Society User Support Team almost 4 years ago

Private changed from Yes to No

Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.

We aim to respond to simple queries within 48 hours and more complex issues within 7 working days.

Best wishes,
Understanding Society User Support Team

Updated by Understanding Society User Support Team almost 4 years ago

Status changed from New to In Progress

Updated by Understanding Society User Support Team almost 4 years ago

Status changed from In Progress to Feedback
% Done changed from 0 to 50

Dear Irina,

1) All of these files are individual-level files in which pidp is a unique identifier, so this is a 1:1 merge.
2) If you are using Stata version 12 or older, you need to sort the files first; with newer versions of Stata you do not need to sort.
3) In wide format you by definition get extra variables preceded by a wave prefix (a_ for wave 1, b_ for wave 2 and so on) to accommodate the additional time points. However, for time-invariant variables (e.g. country of birth, sex, ethnicity) the values will be the same in every wave. One workaround is to omit these variables when merging and add them after the merge from the xwavedat file (for more information about xwavedat see https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/list-of-data-files-and-their-descriptions). Additionally, for sex specifically, each youth data file contains W_ypsex, which holds the answers to the YPSEX question from the youth questionnaire. Although this would be rare, W_ypsex may vary between waves. We leave the choice of which to use, d_sex or d_ypsex, to users.
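
As a rough illustration of the three points above, the logic can be sketched in plain Python. This is not Stata syntax and not an official example. The records, the j_ and cov4_ prefixes, the xwave lookup table (standing in for xwavedat), and the helper names merge_one_to_one and drop_variables are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: all data, prefixes and helper names are hypothetical.

def merge_one_to_one(left, right, key="pidp", keep_right_only=True):
    """Join two lists of row dicts on `key`, requiring the key to be unique in both."""
    indexed = []
    for rows in (left, right):
        by_key = {row[key]: row for row in rows}
        if len(by_key) != len(rows):
            # A repeated identifier means the merge is not 1:1.
            raise ValueError(f"{key} is not unique, so a 1:1 merge is not valid")
        indexed.append(by_key)
    left_by_key, right_by_key = indexed

    # Rows are matched by key lookup, so the input order does not matter
    # and no prior sorting is needed.
    keys = set(left_by_key)
    if keep_right_only:
        keys |= set(right_by_key)
    return [
        {**left_by_key.get(k, {}), **right_by_key.get(k, {})}
        for k in sorted(keys)
    ]


def drop_variables(rows, names):
    """Return copies of the rows without the listed variables."""
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


# Hypothetical youth records, deliberately unsorted.
wave10 = [
    {"pidp": 3, "j_sex": 2, "j_ypsex": 2},
    {"pidp": 1, "j_sex": 1, "j_ypsex": 1},
]
covid4 = [
    {"pidp": 1, "cov4_sex": 1, "cov4_ypsex": 1},
    {"pidp": 3, "cov4_sex": 2, "cov4_ypsex": 2},
]
# Hypothetical cross-wave lookup holding one copy of each time-invariant variable.
xwave = [
    {"pidp": 1, "sex": 1},
    {"pidp": 3, "sex": 2},
    {"pidp": 7, "sex": 1},  # not in the youth files
]

# Omit the per-wave copies of the time-invariant variable before merging ...
wide = merge_one_to_one(
    drop_variables(wave10, {"j_sex"}),
    drop_variables(covid4, {"cov4_sex"}),
)
# ... then add a single copy afterwards, keeping only respondents already in `wide`.
wide = merge_one_to_one(wide, xwave, keep_right_only=False)
print(wide)
```

The result holds one row per pidp, with the wave-specific j_ypsex and cov4_ypsex kept side by side and a single sex variable taken from the lookup table. Passing keep_right_only=False in the last step is a design choice for this sketch: it stops respondents who appear only in the cross-wave table from being added to the youth data.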

Best wishes,
Understanding Society User Support Team

Updated by Understanding Society User Support Team almost 4 years ago

Status changed from Feedback to Resolved
% Done changed from 50 to 100
