Support #1011

Merging the main dataset (6933) with local authority districts (6666), using all waves a-g of both data sets

Added by Nico Ochmann over 5 years ago. Updated 8 months ago.

Data management



Hello Alita,

I hope you are well. I have a question about merging the above files. As I understand it, each wave file in the districts (6666) data set contains the household number and the district variable (in addition, I generate a wave variable for a through g). I am not sure how to proceed. Do I first append all seven waves of the district data set and then merge the result with the main data set, or do I merge each wave's district file individually with my main data set? If I appended all seven waves of the district data set first, I would suggest doing the following:

merge m:1 hidp wave using districts_appended_a-g

What is not clear to me is this: if two people with different pidps live in the same household (and hence share the same hidp) in a given wave, how can they be uniquely identified in the using data set?
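On the uniqueness question: in a many-to-one merge, the key only has to be unique in the using (district) file, not in the master (person) file. Two people who share a household simply receive the same district value, and the result still has one row per person. The logic can be sketched in plain Python; the ids and the district codes "D01" and "D02" below are made-up toy values, not real data.

```python
# Hypothetical toy data: one row per person-wave in the master file,
# one row per household-wave in the district (using) file.
people = [
    {"pidp": 1, "hidp": 100, "wave": "a"},
    {"pidp": 2, "hidp": 100, "wave": "a"},  # same household as pidp 1
    {"pidp": 3, "hidp": 200, "wave": "a"},
]
districts = [
    {"hidp": 100, "wave": "a", "district": "D01"},
    {"hidp": 200, "wave": "a", "district": "D02"},
]

# Build the lookup and check that (hidp, wave) is unique in the using file.
# This is the only uniqueness a many-to-one merge requires.
lookup = {}
for row in districts:
    key = (row["hidp"], row["wave"])
    if key in lookup:
        raise ValueError(f"duplicate key in using file: {key}")
    lookup[key] = row["district"]

# Each person row picks up the district of its household.
# The output keeps one row per person; unmatched rows would get None.
merged = [dict(p, district=lookup.get((p["hidp"], p["wave"]))) for p in people]
```

Here pidp 1 and pidp 2 both end up with the district of household 100, and they remain distinguishable through pidp, which comes from the master file.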

Once again, I would highly appreciate your suggestions.

Best wishes.



Updated by Stephanie Auty over 5 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 10
  • Private changed from Yes to No

Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.

Best wishes,
Stephanie Auty - Understanding Society User Support Officer


Updated by Alita Nandi over 5 years ago

  • Assignee changed from Alita Nandi to Nico Ochmann

Hi Nico,

I am very sorry for the delay in responding. So, your objective is to have an individual-level dataset in long format with district-level information attached. In that case:

(1) The wave-specific district files you download will contain: w_hidp district_id
(2) I am guessing you have another district-level data source, which you will merge with the above file using district_id. Make sure to drop district_id from this file afterwards. This file will then contain w_hidp and some district-level variables.
(3) The wave-specific individual-level files should include pidp and w_hidp. You can then merge each of these files (one wave at a time) with the file produced in step (2) using w_hidp, and finally append all 7 wave-specific files.
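The merge-then-append order in step (3) can be sketched as follows in plain Python. This is only an illustration under assumptions: the toy rows, the helper names strip_prefix and merge_wave, and the numeric wave variable are hypothetical. The sketch also removes the wave prefix (a_hidp becomes hidp) before stacking, so that the appended file has one common set of variable names.

```python
WAVES = "abcdefg"

def strip_prefix(row, wave):
    """Rename wave-prefixed keys (e.g. 'a_hidp') to generic ones ('hidp')."""
    prefix = wave + "_"
    return {(k[len(prefix):] if k.startswith(prefix) else k): v
            for k, v in row.items()}

def merge_wave(individuals, districts, wave):
    """Many-to-one merge of one wave's person rows with its household rows."""
    hid = f"{wave}_hidp"
    lookup = {d[hid]: d for d in districts}  # one row per household
    out = []
    for person in individuals:
        match = lookup.get(person[hid], {})  # empty if household not found
        row = strip_prefix({**match, **person}, wave)
        row["wave"] = WAVES.index(wave) + 1  # a -> 1, b -> 2, ...
        out.append(row)
    return out

# Hypothetical toy inputs for two waves (made-up ids and district codes).
ind = {
    "a": [{"pidp": 1, "a_hidp": 100}, {"pidp": 2, "a_hidp": 100}],
    "b": [{"pidp": 1, "b_hidp": 510}],
}
dist = {
    "a": [{"a_hidp": 100, "district_id": "D01"}],
    "b": [{"b_hidp": 510, "district_id": "D02"}],
}

# Merge each wave separately, then append the results into one long file.
long_data = []
for w in ind:
    long_data.extend(merge_wave(ind[w], dist[w], w))
```

Merging wave by wave keeps each household identifier matched only against the district file of its own wave; the wave variable then records which wave each stacked row came from.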

Best wishes,


Updated by Nico Ochmann over 5 years ago

Dear Alita,

no worries about the delay, I had a number of other things to do anyway. At any rate, I do not need step (2) because I do not have another district-level data source.

But I will go ahead and do what you suggested in step (3).

Thanks a lot.

Have a great week.



Updated by Alita Nandi over 5 years ago

Why do you need the district codes if you are not going to match district-level information? Are you planning to produce some district-level averages?


Updated by Nico Ochmann over 5 years ago

I appreciate your interest.

I want to generate several hundred local-district fixed effects to control for local labor market demand-side and supply-side shocks.
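To make the idea concrete: district fixed effects amount to one 0/1 indicator per district, with one district left out as the reference category. A minimal sketch in plain Python, using made-up rows and hypothetical column names (fe_...), could look like this. In practice most regression software can generate or absorb such indicators directly, so the columns rarely need to be built by hand.

```python
# Hypothetical person-wave rows that already carry a district code.
rows = [
    {"pidp": 1, "district_id": "D01"},
    {"pidp": 2, "district_id": "D02"},
    {"pidp": 3, "district_id": "D03"},
    {"pidp": 4, "district_id": "D02"},
]

# Distinct districts in a fixed order; the first one is the reference
# category and gets no indicator, which avoids perfect collinearity
# with the regression constant.
levels = sorted({r["district_id"] for r in rows})
reference, others = levels[0], levels[1:]

# Add one 0/1 indicator per non-reference district.
for r in rows:
    for level in others:
        r[f"fe_{level}"] = int(r["district_id"] == level)
```

With several hundred districts this produces several hundred indicator columns, which is why absorbing the effects inside the estimation command is usually the more practical route.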




Updated by Alita Nandi over 5 years ago

I see. Yes, in that case ignore step (2).


Updated by Stephanie Auty over 5 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 10 to 80

Updated by Stephanie Auty over 5 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 80 to 100

Updated by Understanding Society User Support Team 8 months ago

  • Category changed from Data analysis to Data management
