Support #1692

Data Query

Added by Stewart Dunlop 9 months ago. Updated 8 months ago.

Data analysis

I'm a second-year PhD student at Glasgow University, researching happiness. I've been trying to replicate and then update a 2003 study by Andrew Clark:

"Unemployment as a Social Norm: Psychological Evidence from Panel Data"
Author(s): Andrew E. Clark
Source: Journal of Labor Economics, Vol. 21, No. 2 (April 2003), pp. 323-351.

This paper uses the first 7 waves of the BHPS to develop a model explaining whether happiness among the unemployed is affected by living in an area of high unemployment.

My issue concerns the sample size. Clark (p. 327) clearly states that his sample from the first 7 waves contains 39,477 observations. However, when I download the 1991-97 data, I get 42,244. Both figures refer to individuals aged 16-64 who are active in the labour force.

I have tried to reconcile these two totals, so far unsuccessfully. The difference is not due to proxies (which Clark MAY have avoided using, although he doesn't state this). Neither is it due to different definitions of the labour force: I've used the BHPS variable jbstat, and Clark appears to have done the same.

My supervisor has suggested that there may have been an exercise in which early BHPS waves were recalibrated, perhaps when paper copies were digitised. However, I can't find evidence of this in any of the guides.

I would be very grateful if anyone could provide assistance with this question.


Stewart Dunlop.


Updated by Understanding Society User Support Team 9 months ago

  • % Done changed from 0 to 80

Based on the numbers, it seems the author excluded telephone and proxy interviews, as well as the ECHP sample that was temporarily added to the BHPS in Waves 7-11. The data section of the paper, where the sample size of 39,477 is given, does not state these exclusions explicitly.

Appending the data from the first 7 waves of the BHPS gives 69,070 person-year observations. Restricting the sample to those who are active in the labour force and aged 16-65, who are not in the ECHP sample, and whose interview was not by telephone or proxy then gives 39,477:
count if age>=16 & age<=65 & inlist(jbstat,1,2,3) & hhorig==3 & ivfio==1
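
To see which exclusion accounts for how much of the gap between 42,244 and 39,477 (2,767 observations), it may help to apply the conditions one at a time instead of all at once. The following is only a minimal sketch on invented toy records, not the actual BHPS files. It assumes the variables are named age, jbstat, hhorig and ivfio as in the command above; the indicator names (in_age, in_labour, in_bhps, full_iv) are hypothetical helpers introduced for illustration.

```stata
* Toy person-year records: all values are invented for illustration only.
clear
input age jbstat hhorig ivfio
34 1 3 1
52 3 3 1
15 1 3 1
66 2 3 1
40 4 3 1
29 2 7 1
45 1 3 2
end

* One indicator per restriction, so each exclusion can be counted separately.
* The conditions are the same as in the single count command above.
generate byte in_age    = age >= 16 & age <= 65
generate byte in_labour = inlist(jbstat, 1, 2, 3)
generate byte in_bhps   = hhorig == 3
generate byte full_iv   = ivfio == 1

* Apply the restrictions cumulatively and compare the counts at each step.
count if in_age & in_labour
count if in_age & in_labour & in_bhps
count if in_age & in_labour & in_bhps & full_iv
```

On the real pooled data, the drop between successive counts would show how many observations each restriction removes, which should make it easier to match the published total.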

Hope this helps. If you have further questions please let us know.

Best wishes,
Understanding Society User Support Team.


Updated by Understanding Society User Support Team 9 months ago

  • Private changed from Yes to No

Updated by Understanding Society User Support Team 9 months ago

  • Status changed from New to Feedback

Updated by Understanding Society User Support Team 8 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 80 to 100
