Support #1081: Youth and individual respondents datasets - merging info - Understanding Society User Support

Support #1081

Youth and individual respondents datasets - merging info

Added by Theodora Kokosi over 7 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Data management
Start date: 10/26/2018
% Done: 100%

Description

Dear all,

I would like to merge data from the "indresp" file into the youth file. Which would be the best way to do that?

I am assuming that using the pidp as a key variable is not ideal since they are different respondents and their cases wouldn't match. Is the household identifier a better solution?

To be more specific, I would like to use the variable for the maternal highest qualification as a covariate in models using data from the youth questionnaire.

Thank you in advance.

Kind regards,
Dora

History

Updated by Stephanie Auty over 7 years ago

Status changed from New to In Progress
Assignee set to Stephanie Auty
% Done changed from 0 to 10
Private changed from Yes to No

Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.

Best wishes,
Stephanie Auty - Understanding Society User Support Officer

Updated by Stephanie Auty over 7 years ago

Status changed from In Progress to Feedback
Assignee changed from Stephanie Auty to Theodora Kokosi
% Done changed from 10 to 80

Dear Dora,

The w_youth files contain the mother's ID in the variables w_mnpid (natural mothers only) and w_mnspid (natural, step and adoptive mothers). These are the variables that match pidp in w_indresp.

The simplest way to match the mother's highest qualification into the youth file is to take the highest qualification and pidp from w_indresp, rename pidp to w_mnpid or w_mnspid depending on which you want to use, then merge with w_youth using w_mn(s)pid as the merge variable.

Best wishes,
Stephanie
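
The steps above can be sketched in Python with pandas. This is only an illustration on invented toy data, not the team's own code: the wave prefix `h_`, the variable name `h_hiqual_dv` for highest qualification, and the output name `mother_hiqual` are assumptions, so check the names in your own files.

```python
import pandas as pd

# Toy stand-ins for the indresp and youth files; all values are invented.
indresp = pd.DataFrame({
    "pidp": [101, 102, 103],
    "h_hiqual_dv": [1, 3, 4],  # assumed name of the highest-qualification variable
})
youth = pd.DataFrame({
    "pidp": [901, 902],
    "h_mnspid": [101, 103],  # mother's pidp as recorded in the youth file
})

# Keep only the key and the covariate, then rename pidp so that it matches
# the mother identifier in the youth file. The covariate is renamed too,
# so it is clearly labelled as the mother's value.
mothers = indresp[["pidp", "h_hiqual_dv"]].rename(
    columns={"pidp": "h_mnspid", "h_hiqual_dv": "mother_hiqual"}
)

# Left join: every youth row is kept and the mother's value is attached.
youth_with_mother = youth.merge(mothers, on="h_mnspid", how="left")
print(youth_with_mother)
```

Renaming the covariate before merging avoids confusing the mother's qualification with any variable of the same name that belongs to the young person.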

Updated by Theodora Kokosi over 7 years ago

Dear Stephanie,

This is really helpful. Thanks a lot!

Best wishes,
Dora

Updated by Marina Fernandez Reino over 6 years ago

Hi,

I have a question regarding Support #1081.
I am following Stephanie's advice because I also want to merge mothers' information from indresp into the youth data file. However, the youth data file contains duplicate mother IDs, because sometimes more than one child is interviewed in a household. When I try to merge, I get an error saying that h_hidp h_mnspid do not uniquely identify observations in the master data (i.e. the youth data file). What should I do?
Thanks

Updated by Gundi Knies over 6 years ago

Assignee deleted (Theodora Kokosi)

Hi Marina,
I think you might want to look up the merge command in Stata. You can do an m:1 or a 1:m merge on mnspid. In this case, many youths in the youth data file can have the same mother in the indresp data file, so with the youth file as the master data it is an m:1 merge.
Hope this helps.
Gundi
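
For readers working outside Stata, the same many-to-one idea can be sketched with pandas, whose `validate="m:1"` option plays a similar role to `merge m:1`. The data below are invented, the mothers table is assumed to have been prepared from indresp already (pidp renamed to `h_mnspid`), and `mother_hiqual` is a hypothetical column name.

```python
import pandas as pd

# Toy youth file: children 901 and 902 are siblings with the same mother.
youth = pd.DataFrame({
    "pidp": [901, 902, 903],
    "h_mnspid": [101, 101, 103],
})
# Toy mothers table taken from indresp: one row per mother.
mothers = pd.DataFrame({
    "h_mnspid": [101, 102, 103],
    "mother_hiqual": [1, 3, 4],  # hypothetical column name
})

# The key repeats on the youth side, which is why a one-to-one merge fails.
has_siblings = youth["h_mnspid"].duplicated().any()
print(has_siblings)

# validate="m:1" allows repeated keys in the youth file but raises
# pandas.errors.MergeError if a mother appears more than once in `mothers`.
merged = youth.merge(mothers, on="h_mnspid", how="left", validate="m:1")
print(merged)
```

Both siblings receive the same mother's value, and the number of youth rows is unchanged by the merge.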

Updated by Marina Fernandez Reino over 6 years ago

Thanks, Gundi. I don't know how I didn't realise it could be done that way.

Updated by Marina Fernandez Reino over 6 years ago

Hi Gundi,
Just to make sure I am doing things right: there are 743 children whose mother pidp identifier cannot be matched with the mother's data from indresp, because no such identifiers exist there. I assume these are non-respondent mothers, aren't they?
Thanks
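
One way to count and list the youth rows that find no mother in indresp is to keep a merge indicator, much like inspecting `_merge` after a Stata merge. The pandas sketch below uses invented data and hypothetical column names. It only shows which rows are unmatched; it does not say why those mothers are absent from indresp.

```python
import pandas as pd

youth = pd.DataFrame({
    "pidp": [901, 902, 903, 904],
    "h_mnspid": [101, 101, 103, 104],  # 104 has no row in the mothers table
})
mothers = pd.DataFrame({
    "h_mnspid": [101, 102, 103],
    "mother_hiqual": [1, 3, 4],  # hypothetical column name
})

# indicator=True adds a `_merge` column with the values
# "both", "left_only" or "right_only" for each row.
merged = youth.merge(
    mothers, on="h_mnspid", how="left", validate="m:1", indicator=True
)

# "left_only" marks youths whose mother ID was not found among the mothers.
unmatched = merged[merged["_merge"] == "left_only"]
print(len(unmatched))
print(unmatched[["pidp", "h_mnspid"]])
```

Listing the unmatched identifiers makes it possible to look them up in other files before deciding how to treat them in a model.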

Updated by Understanding Society User Support Team over 2 years ago

Category set to Data management
Status changed from Feedback to Resolved
% Done changed from 80 to 100
