Support #1612

Understanding length of time unemployed

Added by Chris Percy over 2 years ago. Updated 8 months ago.

Questionnaire content



Dear team, I hope you can help. I'm working with US W10 in Stata (file: j_indresp.dta). I'm using "j_jbstat == 3" to identify those who are unemployed (plus "j_jubgn == 1" to match the ILO definition of unemployment).

I would like to know how long respondents have been unemployed for at the time of survey, at least approximately.

I have tried to work this backwards from the various employment spell data, e.g. "j_nnmpsp_dv == 0 & j_nmpsp_dv == 1 & j_nunmpsp_dv == 1" to identify someone who has had exactly one employed and one unemployed spell since the last survey, and then looked at the individual dates of employment spells starting, but such data (e.g. j_statendm1 j_statendy41) mostly come up inapplicable. Can you advise?

thank you, Chris

[PS. The survey weight set up I'm using to analyse population segment size is: svyset j_psu [pweight=j_indpxui_xw], strata(j_strata) singleunit(scaled) ]


Updated by Understanding Society User Support Team over 2 years ago

  • Status changed from New to In Progress
  • Assignee changed from Alita Nandi to Understanding Society User Support Team
  • % Done changed from 0 to 10
  • Private changed from Yes to No

Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can. We aim to respond to simple queries within 48 hours and more complex issues within 7 working days.
Best wishes,
Understanding Society User Support Team


Updated by Understanding Society User Support Team over 2 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 10 to 80

The end of the first employment status since the last interview is recorded in j_empstend(y/m/d). If the next employment status has also ended, that end date is recorded in the spell variables you mention above (e.g. j_statendm1, j_statendy41).

Suppose someone was employed at the last interview and said that the status changed (empchk=2), and it changed on empstend(y/m/d) to unemployment (nxtst=2, nxtstelse=1) but that spell has not ended, i.e., they are currently unemployed. Then they would have one employment spell which ended on empstend(y/m/d) followed by an unemployment spell that is still continuing.
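As a rough illustration of the logic above, the sketch below approximates months unemployed as the gap between the end of the last employment spell and the interview month. It runs on a small invented dataset rather than j_indresp.dta. The names j_empstendy and j_empstendm follow the naming used in this thread; the interview-date variables j_istrtdaty and j_istrtdatm are assumptions and should be checked against the variable documentation before use.

```stata
* Toy data standing in for j_indresp.dta (all values invented for illustration)
clear
input j_jbstat j_empstendy j_empstendm j_istrtdaty j_istrtdatm
3 2018 11 2019 3
3 2018 -8 2019 3
1 -8 -8 2019 5
end

* Treat negative codes (inapplicable, refused, don't know, etc.) as missing
mvdecode j_empstendy j_empstendm, mv(-9/-1)

* Monthly dates: end of the last employment spell and the interview month
* (j_istrtdaty / j_istrtdatm are assumed names for the interview date)
generate spell_end = ym(j_empstendy, j_empstendm)
generate int_month = ym(j_istrtdaty, j_istrtdatm)

* Approximate months unemployed, defined only for the currently unemployed;
* stays missing when the spell end month or year is not recorded
generate unemp_months = int_month - spell_end if j_jbstat == 3

list j_jbstat spell_end int_month unemp_months
```

This only covers people whose status changed since the last interview. For those continuously unemployed since the last wave, the start of the spell would have to be traced back through earlier waves.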

Does this answer your question?


Updated by Chris Percy over 2 years ago

Thank you for the quick response. When I follow the approach you describe I get some values that I struggle to interpret, which I'd like to double-check before using.

For instance, if I focus just on the currently unemployed (gvnt definition; code below) and exclude those with zero weight for the target analysis, then 28% are inapplicable according to whether they were non-employed at last wave (j_notempchk; with survey weighting, excl. proxy/refusal/DK answers). 54% were still in non-employment and 18% were not (code copied at end of note). The 18% who have changed status mostly do have data in j_empstendm etc. that explains when that status changed (thank you for this pointer).

How should I interpret the results? Perhaps "inapplicable" in this case means they were not interviewed at the last wave and should be excluded from the analysis, although maybe you can point towards a more precise indicator for this, since "inapplicable" might have broader meanings elsewhere. Does this also mean that for those interviewed for the first time, we do not know when they last changed to their current economic activity status?

With around a year between interviews, the results also seem to suggest that a very large proportion of the unemployed have been unemployed for at least a year, contrary to what I thought was the case around then from other statistics (e.g. Figure 6 from the Sept 2019 unemployment data). Am I misinterpreting the numbers?

I've also struggled in a more general attempt to line up the total unemployment rate with gvnt data (3.8-4.2% for the 2018 and 2019 years that cover most of the interview dates in this dataset). I can only line them up if I take the upper bound of the 95% confidence interval for the number of unemployed (who are actively seeking work & ready to work) and divide by the lower bound of the 95% interval for the economically active denominator. However, the point estimate comes out at 2.8%, which makes me suspect I'm doing something wrong definitionally or methodologically. Do you have any advice here? I've looked at the webinars/youtube videos and couldn't see anything on this, but please point me to any reference material that might address this topic.

Thank you for your support

Stata code for context:

// Applying survey structure/weighting
svyset j_psu [pweight=j_indpxui_xw], strata(j_strata) singleunit(scaled)

// Identifying whether the currently unemployed changed non-employment status since last interview
svy: tab j_notempchk if (j_jbstat == 3 & j_julk4wk == 1 & j_jubgn == 1) & j_indpxui_xw != 0 & (j_notempchk != -7 & j_notempchk != -2 & j_notempchk != -1), obs

// Among those who changed non-employment status (i.e. were presumably employed before), what month did their employment spell end
svy: tab j_empstendm if (j_jbstat == 3 & j_julk4wk == 1 & j_jubgn == 1) & j_indpxui_xw != 0 & (j_notempchk == 2), obs

// For lining up to gvnt data on overall unemployment ratio:

// Numerator:
svy: total j_indpxui_xw if (j_jbstat == 3 & j_jubgn == 1 & j_julk4wk == 1) & j_dvage >= 16 & j_indpxui_xw != 0

// Denominator (plus the numerator value from above)
svy: total j_indpxui_xw if (j_jbstat == 1 | j_jbstat == 2 | j_jbstat == 9 | j_jbstat == 10 | j_jbstat == 11) & j_dvage >= 16 & j_indpxui_xw != 0


Updated by Understanding Society User Support Team over 2 years ago

(1) About your question regarding inapplicables: anyone who is not asked the question is coded as inapplicable. You can find out who is asked a question by looking at the Universe below the question. As you can see, the question "notempchk" is asked only of these people: "j_ff_jbstat>3 & j_ff_jbstat<. & (j_ff_ivlolw == 1 | j_ff_everint == 1)"
If we check that, we will see that most of these people are asked the question, and that is why for this group j_notempchk is not equal to -8. However, there are 5 persons who should have been asked this question but were not, and that is why j_notempchk=-8 for them. See here:
fre j_notempchk if j_ff_jbstat>3 & j_ff_jbstat<. & (j_ff_ivlolw == 1 | j_ff_everint == 1)

Then if you look at those who were unemployed last wave (ff_jbstat == 3) and were asked j_notempchk, most of those who say they have been continuously unemployed report their current status as unemployed, which is correct. But a few do not give their current status as unemployed although they say they are continuously unemployed.
fre j_jbstat if j_ff_jbstat==3 & j_notempchk==1
It could be that someone is unemployed but also taking care of family and so they may have chosen the latter rather than the former.

Similarly you can do the checks the other way round,
fre j_jbstat if j_ff_jbstat==3 & j_notempchk==2

(2) If you want to estimate the proportion of UK adults who are unemployed, then use j_jbstat and the cross-sectional weights:
tab j_jbstat if j_jbstat>0 [aw=j_indinui_xw]
This shows that 3.9% of UK adults are unemployed.


Updated by Chris Percy over 2 years ago

Thank you for this reply. All clear on inapplicables and universes. In the downloaded file, "...\UKDA-6614-stata\stata\stata13_se\ukhls_w10\j_indresp.dta" I don't have j_ff_ivlolw or j_ff_everint, but otherwise I can follow your points.

For the unemployment rate, the problem arises when adopting the full definition of unemployment to line up with the gvnt data. The link provided before, which suggests a 3.9%-4.2% unemployment rate, says: "Unemployment measures people without a job who have been actively seeking work within the last four weeks and are available to start work within the next two weeks", expressed as a proportion of the economically active (whereas the 3.9% you calculate above is a proportion of the whole 16+ population). To calculate this, we need "j_jubgn == 1 & j_julk4wk == 1" as well as "j_jbstat == 3". Not everyone with j_jbstat == 3 answers the questions on availability and looking for work, but of those who answered both with either a No or a Yes, 49.7% answered Yes to both. If I apply this share to everyone reporting as unemployed, treating the remainder as economically inactive under the formal definition, and then divide by the total of employed + unemployed, I get an unemployment rate of 3.4% (3.5% if I exclude those aged 65+). This is lower than I would expect, but is now a lot closer to the actual UK rate at the time. Can you see anything I might be doing wrong here, in these attempts to triangulate the 3Y-APS data against the formal unemployment data derived from the LFS?

svyset j_psu [pweight=j_indinui_xw], strata(j_strata) singleunit(scaled)
svy:tab j_jubgn j_julk4wk if j_jbstat == 3
svy:tab j_jbstat if j_jbstat > 0


Updated by Understanding Society User Support Team almost 2 years ago

We are very sorry this slipped through.

In the syntax you have specified, you need to
(1) make sure that the negative values are recoded to system missing
(2) calculate the % of unemployed by only considering those who are active in the labour market

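// Recode negative values (inapplicable, refusal, etc.) to system missing
mvdecode _all, mv(-9/-1)

svyset j_psu [pweight=j_indinui_xw], strata(j_strata) singleunit(scaled)

// x1: self-reported unemployed (j_jbstat == 3) among those with j_jbstat 1 to 3
generate x1=0 if j_jbstat>=1 & j_jbstat<=3
replace x1=1 if j_jbstat==3

// x2: same base, but flagged 1 for those with j_jubgn == 1 and j_julk4wk == 1
generate x2=0 if j_jbstat>=1 & j_jbstat<=3
replace x2=1 if j_jubgn==1 & j_julk4wk==1

svy:tab x1
svy:tab x2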
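One possible variant, sketched here on invented toy data rather than j_indresp.dta, restricts the denominator to the economically active: the employed plus those who report being unemployed and also answer yes to both j_jubgn and j_julk4wk. The set of employed codes (1, 2, 9, 10, 11) is taken from the denominator used earlier in this thread. The weight variable wt and the indicator names active and ilo_unemp are hypothetical, and whether this matches the official definition closely enough is an assumption to verify.

```stata
* Toy data standing in for j_indresp.dta (all values invented for illustration)
clear
input j_jbstat j_jubgn j_julk4wk wt
1 -8 -8 1
2 -8 -8 1
3  1  1 1
3  2  1 1
4 -8 -8 1
end

* Recode negative values to system missing, as in the syntax above
mvdecode _all, mv(-9/-1)

* Unemployed on the stricter definition: reports unemployment and says yes to both
generate byte seeking = (j_jbstat == 3 & j_jubgn == 1 & j_julk4wk == 1)

* Economically active: employed (codes from earlier in the thread) or seeking
generate byte active = inlist(j_jbstat, 1, 2, 9, 10, 11) | seeking

* Indicator defined only for the economically active; missing otherwise
generate byte ilo_unemp = seeking if active

* Weighted unemployment rate among the economically active
mean ilo_unemp [pweight = wt]
```

With the real file, the same indicator could be passed to svy: mean after the svyset line above, which would also give a standard error for the rate.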


Updated by Chris Percy almost 2 years ago

All clear, thank you for the reply and the support. Chris


Updated by Understanding Society User Support Team almost 2 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 80 to 100

Updated by Understanding Society User Support Team 8 months ago

  • Category changed from Data analysis to Questionnaire content
