Support #1081

Youth and individual respondents datasets - merging info

Added by Theodora Kokosi about 6 years ago. Updated about 1 year ago.

Status: Resolved
Priority: High
Assignee: -
Category: Data management
Start date: 10/26/2018
% Done: 100%


Description

Dear all,

I would like to merge data from the "indresp" file into the youth file. Which would be the best way to do that?

I am assuming that using the pidp as a key variable is not ideal since they are different respondents and their cases wouldn't match. Is the household identifier a better solution?

To be more specific, I would like to use the variable for the maternal highest qualification as a covariate in models using data from the youth questionnaire.

Thank you in advance.

Kind regards,
Dora

Actions #1

Updated by Stephanie Auty about 6 years ago

  • Status changed from New to In Progress
  • Assignee set to Stephanie Auty
  • % Done changed from 0 to 10
  • Private changed from Yes to No

Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.

Best wishes,
Stephanie Auty - Understanding Society User Support Officer

Actions #2

Updated by Stephanie Auty about 6 years ago

  • Status changed from In Progress to Feedback
  • Assignee changed from Stephanie Auty to Theodora Kokosi
  • % Done changed from 10 to 80

Dear Dora,

The w_youth files contain the mother's ID in the variables w_mnpid (natural mothers only) and w_mnspid (natural, step and adoptive mothers). These are the variables that match the pidp in w_indresp.

The simplest way to match the mother's highest qualification into the youth file would be to take the highest qualification and pidp from w_indresp, rename pidp to w_mnpid or w_mnspid depending on which you want to use, then merge with w_youth using w_mn(s)pid as the merge variable.
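
As a rough illustration of these steps, here is a minimal sketch in plain Python rather than in any particular statistics package. The records are invented, "w_" stands in for the wave prefix, and "w_hiqual_dv" is only an assumed name for the highest-qualification variable, so please check the actual variable name in your own files.

```python
# Hypothetical miniature extracts; the real files hold many more columns.
# "w_hiqual_dv" is an ASSUMED name for the highest-qualification variable.
indresp = [
    {"pidp": 101, "w_hiqual_dv": 1},
    {"pidp": 102, "w_hiqual_dv": 3},
]
youth = [
    {"pidp": 201, "w_mnspid": 101},
    {"pidp": 202, "w_mnspid": 102},
]

# Step 1: keep only pidp and the qualification from indresp, keyed by pidp.
# This plays the role of "rename pidp to w_mnspid".
mother_qual = {row["pidp"]: row["w_hiqual_dv"] for row in indresp}

# Step 2: attach the mother's qualification to each youth record,
# using the youth file's w_mnspid as the merge key.
merged = [
    {**row, "mother_hiqual": mother_qual.get(row["w_mnspid"])}
    for row in youth
]

for row in merged:
    print(row)
```

The youth's own pidp is left untouched; only the mother's pidp is re-used as the key, which is why the two files can be linked even though they hold different respondents.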

Best wishes,
Stephanie

Actions #3

Updated by Theodora Kokosi about 6 years ago

Dear Stephanie,

This is really helpful. Thanks a lot!

Best wishes,
Dora

Actions #4

Updated by Marina Fernandez Reino over 5 years ago

Hi,

I have a question regarding Support #1081.
I am following Stephanie's advice because I also want to merge the mother's information from indresp into the youth data file. However, the youth data file contains duplicate mother IDs, because sometimes more than one child is interviewed in a household. When I try to merge, I get an error saying that h_hidp h_mnspid do not uniquely identify observations in the master data (i.e. the youth data file). What should I do?
Thanks

Actions #5

Updated by Gundi Knies over 5 years ago

  • Assignee deleted (Theodora Kokosi)

Hi Marina,
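I think you might want to look up the merge command in Stata. You can do an m:1 merge (with the youth file as the master data) or a 1:m merge (with the indresp file as the master data) on mnspid. In this case, several youths in the youth data file can have the same mother in the indresp data file, so the key only needs to be unique on the indresp side.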
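
To show the idea outside Stata, here is a minimal sketch in plain Python with invented records. The names "w_mnspid" and "w_hiqual_dv" are placeholders ("w_" for the wave prefix, and the qualification variable name is an assumption). The point is only that the mother's ID may repeat in the youth data as long as it is unique in the mothers' data.

```python
from collections import Counter

# Hypothetical records: two siblings in the youth file share one mother.
mothers = [{"pidp": 101, "w_hiqual_dv": 2}]
youth = [
    {"pidp": 201, "w_mnspid": 101},
    {"pidp": 202, "w_mnspid": 101},  # sibling of 201, same mother
]

# A many-to-one merge needs the key to be unique on the "one" side only.
key_counts = Counter(m["pidp"] for m in mothers)
assert all(n == 1 for n in key_counts.values()), "mother IDs must be unique"

# Each youth looks up their mother; the mother's row is simply re-used.
lookup = {m["pidp"]: m for m in mothers}
merged = [
    {**y, "mother_hiqual": lookup[y["w_mnspid"]]["w_hiqual_dv"]}
    for y in youth
]

for row in merged:
    print(row)
```

Both youth records come out with the same mother's qualification, and the number of youth records is unchanged.
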
Hope this helps.
Gundi

Actions #6

Updated by Marina Fernandez Reino over 5 years ago

Thanks, Gundi. I don't know how I didn't realise it could be done that way.

Actions #7

Updated by Marina Fernandez Reino over 5 years ago

Hi Gundi,
Just to make sure I am doing things right: there are 743 children who have a mother pidp identifier that cannot be matched with the mother's data from indresp, because no such identifiers exist there. I assume these are non-respondent mothers, aren't they?
Thanks
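
For anyone wanting to inspect such cases, here is a minimal sketch in plain Python with invented records. It assumes that negative values of the mother identifier mean "no valid ID" and that "w_mnspid" is the key ("w_" standing for the wave prefix); both are assumptions to check against the data. It only lists the unmatched youths and does not by itself show why the mother has no indresp record.

```python
# Hypothetical set of pidp values found in the indresp file.
indresp_ids = {101, 102}

youth = [
    {"pidp": 201, "w_mnspid": 101},  # mother found in indresp
    {"pidp": 202, "w_mnspid": 103},  # valid mother ID, but no indresp record
    {"pidp": 203, "w_mnspid": -8},   # ASSUMED: negative code = no valid ID
]

# Youths with a valid mother ID that has no counterpart in indresp.
unmatched = [
    y for y in youth
    if y["w_mnspid"] > 0 and y["w_mnspid"] not in indresp_ids
]

print(len(unmatched), [y["pidp"] for y in unmatched])
```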

Actions #8

Updated by Understanding Society User Support Team about 1 year ago

  • Category set to Data management
  • Status changed from Feedback to Resolved
  • % Done changed from 80 to 100