Support #1733

Covid Survey - Measuring Internal Migration

Added by Morgan Ward almost 2 years ago. Updated over 1 year ago.

I am attempting to measure 3-monthly internal migration moves between UK regions during the Covid pandemic, using your Covid survey data.

To do this, I have created a variable to identify where moves have taken place between waves, e.g., for moves between May and July 2020:
gen regi_moves = 1 if cd_gor_dv != cb_gor_dv

However, when I longitudinally weight these regional moves using betaindin_lw, the counts of moves fall sharply compared with the unweighted counts. E.g., the unweighted count of 147 moves into the South East between May and July 2020 is reduced to a weighted count of 3.324 moves.

Is there a way of producing valid weights for the analysis of moves between waves that do not reduce the counts in this way, please? I was hoping to compare sequential waves, excluding Waves 1 and 3, so comparing Waves 2 - 4, 4 - 5, 5 - 6 etc. The reduction in counts prevents me from generalizing rates of internal migration moves with any accuracy.

Any help will be greatly appreciated.

Thank you


regi_moves.jpg (25.2 KB) Understanding Society User Support Team, 07/25/2022 03:44 PM

Updated by Understanding Society User Support Team almost 2 years ago

  • File regi_moves.jpg added
  • Category set to Data management
  • Status changed from New to Feedback
  • % Done changed from 0 to 80
  • Private changed from Yes to No

Dear Morgan,

I suspect that the number of moves has been calculated incorrectly. I also got 147 moves to the South East between May and July when I ran your syntax for regi_moves on a data file that includes all respondents after merging cb_indresp and cd_indresp. That file includes respondents who participated in wave 2 only (and so not in wave 4) and respondents who participated in wave 4 only (and so not in wave 2). For such cases regi_moves equals 1, but that is a Stata artefact (see the screenshot), not a correctly identified move. The correct total number of moves between these time points is 16, calculated on a sample in which all respondents participated in both waves.

Longitudinal weights are calculated only for people who participated continuously in the study, that is, in all the waves you are interested in (in this case waves 2, 3 and 4). So when you weighted using cd_betaindin_lw, Stata automatically excluded all these artificial moves from the calculation, because cd_betaindin_lw for these cases is either missing or equal to 0.
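For illustration, here is a minimal sketch of one way to flag moves only for people observed in both waves. The artefact arises because Stata treats a missing value as different from any observed value, so a comparison such as cd_gor_dv != cb_gor_dv is true whenever one of the two regions is missing after the merge. The sketch makes several assumptions that are not confirmed in this thread: that the two files are saved as cb_indresp and cd_indresp in the working directory, that pidp is the person identifier to merge on, and that negative region codes denote missing information. Please adapt it to your own files rather than treating it as the definitive syntax.

```stata
* Sketch only: file names, the merge key (pidp) and the negative
* missing-value codes are assumptions; adjust them to your own data.
use pidp cb_gor_dv using cb_indresp, clear
merge 1:1 pidp using cd_indresp, keepusing(cd_gor_dv cd_betaindin_lw)

* Keep only respondents who appear in both the wave 2 and wave 4 files
keep if _merge == 3
drop _merge

* 1 = region differs, 0 = same region, missing if either region is unknown.
* The if condition prevents missing regions from being counted as moves.
gen byte regi_moves = (cd_gor_dv != cb_gor_dv) ///
    if cb_gor_dv > 0 & cd_gor_dv > 0 & !missing(cb_gor_dv, cd_gor_dv)

* Unweighted check of moves by destination region
tabulate cd_gor_dv if regi_moves == 1
```

With regi_moves defined this way, the unweighted count is based on the same kind of sample as the figure of 16 quoted above, so weighted and unweighted results can be compared on a like-for-like basis.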

I hope this helps.

Best wishes,
UKHLS User Support Team


Updated by Morgan Ward almost 2 years ago

Dear Piotr,

This was very helpful, thank you.

Best wishes,



Updated by Understanding Society User Support Team over 1 year ago

  • Status changed from Feedback to Resolved
  • % Done changed from 80 to 100
