Support #2116

How to select variables to create legacy benefit variables

Added by huihui song about 2 months ago. Updated about 2 months ago.




I want to create a variable that indicates whether an individual received legacy benefits. Legacy benefits include six components: Working Tax Credit, Child Tax Credit, Housing Benefit, Income Support, Income-based Jobseeker's Allowance and Income-related Employment and Support Allowance (ESA). However, when I search for variables, I find several different variables for the same question. For example, pbnft4, nfh16, benunemp1 and ff_bentype16 all indicate whether Jobseeker's Allowance has been received. I'm not sure which one I should use to show whether someone receives Jobseeker's Allowance.

Thank you very much for your help!


clipboard-202406071242-kijc7.png (15.2 KB), Understanding Society User Support Team, 06/07/2024 12:42 PM
clipboard-202406071324-zfmy2.png (18.8 KB), Understanding Society User Support Team, 06/07/2024 01:24 PM
clipboard-202406071328-i4lze.png (18.4 KB), Understanding Society User Support Team, 06/07/2024 01:28 PM

Updated by Understanding Society User Support Team about 2 months ago


It is true that UKHLS data is very complex, due to the complexity of many areas of social life and people's circumstances. We aim to represent these as accurately as possible, which often results in a complex data structure. I recommend a methodical approach: search the questionnaire, cross-check the results with our variable search webpage, and then examine the data by producing frequencies of the variables of interest and cross-tabulating them when needed.

For instance, you could search for the phrase 'Job Seeker' in the wave 13 questionnaire PDF file. The first hit will be the 'benbase' question, 'Income: Receives core benefits.' As no universe is provided, the question is probably asked of everyone; verify this by running frequencies for the variable. In the dataset, you will see that the question is available as a series of indicator variables, with Job Seeker's Allowance being 'benbase2' (the number corresponds to the label in the questionnaire).

There are no -8 'inapplicable' values, so indeed the question is asked of everyone, except proxies (-7).
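A minimal sketch of that frequency check in Stata. It assumes the wave 13 individual response file is saved as m_indresp.dta in the working directory and that the variable carries the m_ wave prefix; both are assumptions about your local setup rather than something stated above.

```stata
* Assumption: m_indresp.dta is in the current working directory
use pidp m_benbase2 using m_indresp, clear

* Frequencies, including negative missing-value codes such as -7 (proxy)
tabulate m_benbase2, missing

* Count 'inapplicable' cases; zero suggests the question has no routing
count if m_benbase2 == -8
```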

Next, continue your search in the PDF. The next question you will find is 'missource.' From its description in the Scripting Notes, you will learn that 'MisSource is an indicator designed to identify benefit and unearned income sources that were not recorded at the current wave but were recorded at the previous wave. A flag is created for each unearned income source if it has been coded as received in the current interview. This flag is then compared with the ff variable for that income source.'

From the variable search, you know there is also the 'ff_bentype16' variable, and you can find the 'ff_bentype' name in the scripting notes for missource. The majority of cases for whom 'm_missource16==1' indeed reported Job Seeker's Allowance in the previous wave but not in the current one.
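One way to look at this yourself is a cross-tabulation restricted to the flagged cases. This is only a sketch: it assumes that m_missource16, m_ff_bentype16 and m_benbase2 are all held in m_indresp.dta, which you should confirm with the variable search.

```stata
* Assumption: all three variables are in the wave 13 indresp file
use pidp m_missource16 m_ff_bentype16 m_benbase2 using m_indresp, clear

* Among cases flagged by missource16, compare the fed-forward
* (previous wave) indicator with the current-wave report
tabulate m_ff_bentype16 m_benbase2 if m_missource16 == 1, missing
```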

Next, you know from the variable search that nfh16 is another variable related to this benefit and you will find it just under the missource in the questionnaire. You will learn that nfh16 is asked when missource16=1, which you can check yourself:
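A sketch of such a check in Stata, again assuming the m_ wave prefix and that the variables are in m_indresp.dta:

```stata
* Assumption: variables are in the wave 13 indresp file
use pidp m_missource16 m_nfh16 m_benbase2 using m_indresp, clear

* nfh16 should be inapplicable unless missource16 equals 1
tabulate m_missource16 m_nfh16, missing

* Among those routed to nfh16, compare answers with benbase2
tabulate m_nfh16 m_benbase2 if m_missource16 == 1, missing
```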

So, there are 8 extra respondents claiming Job Seeker's allowance not captured by benbase2.

Next, you will find ficode, but I will discuss it at the end. Then you will find pbnft4. When you check its universe, you will learn that this question is asked of proxies (that is, when benbase2==-7): 'If (GRIDVARIABLES.Modetype = 1) // Mode is face-to-face And If (IProxy = 1) // Able to do a proxy interview for respondent.' Note that benunemp1, which you mentioned, is not relevant for wave 13 because it was asked only in waves 1-5. That also tells you that you will have to analyse the questionnaires for those waves separately, because the way we asked about benefits differed from later waves.
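The proxy routing can be checked in the same way. The sketch below assumes that m_pbnft4 sits alongside m_benbase2 in m_indresp.dta; if the variable search shows it in a different file, adjust the file name accordingly.

```stata
* Assumption: the proxy variable is stored in the wave 13 indresp file
use pidp m_benbase2 m_pbnft4 using m_indresp, clear

* pbnft4 should hold valid answers only where benbase2 is -7 (proxy)
tabulate m_pbnft4 m_benbase2, missing
tabulate m_pbnft4 if m_benbase2 == -7, missing
```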

To summarise, using benbase2 and nfh16 you will identify 151 respondents.
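A hedged sketch of how the two variables could be combined into one indicator. The variable name jsa is made up for illustration, and the code assumes that a value of 1 means the benefit is received in both m_benbase2 and m_nfh16; check the value labels before relying on it.

```stata
* Assumption: both variables are in the wave 13 indresp file
use pidp m_benbase2 m_nfh16 using m_indresp, clear

* Check the coding of both variables first
tabulate m_benbase2, missing
tabulate m_nfh16, missing

* jsa is a hypothetical name; 1 if either variable reports receipt
generate byte jsa = (m_benbase2 == 1) | (m_nfh16 == 1)
label variable jsa "Job Seeker's Allowance (benbase2 or nfh16)"

count if jsa == 1
```

In this sketch proxy respondents are coded 0, so you may prefer to set jsa to missing where m_benbase2 is -7.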

Finally, you will find ficode. The values of ficode flag different benefits in the income file, where you can then check the details of the benefits claimed by each respondent, such as the last amount received (you can read more about ficode and the income file here: ). The following code lists the people who claim Job Seeker's Allowance and for whom we have further information about that benefit (to understand why, again, see ):

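* Load the wave 13 income file
use m_income, clear
* Confirm that pidp, m_fiseq and m_ficode uniquely identify each row
isid pidp m_fiseq m_ficode

* Keep one row per person for each income code
bysort pidp m_ficode: keep if _n==1
* Keep only Job Seeker's Allowance (ficode 16)
keep if m_ficode==16
gen jobseekerinc=1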

After you run the above snippet, you will see that there are 151 people left in the dataset, and when you merge them with the indresp file you will see that they are the same 151 respondents we identified using benbase2 and nfh16.
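The merge could look roughly like this. It is a sketch only: the file names m_income.dta and m_indresp.dta, the temporary file name jsa_income and the assumption that 1 means receipt in m_benbase2 and m_nfh16 are not confirmed above.

```stata
* Build a one-row-per-person list of Job Seeker's Allowance recipients
use m_income, clear
keep if m_ficode == 16
bysort pidp: keep if _n == 1
keep pidp
generate byte jobseekerinc = 1
tempfile jsa_income
save `jsa_income'

* Merge the list onto the individual response file
use pidp m_benbase2 m_nfh16 using m_indresp, clear
merge 1:1 pidp using `jsa_income'

* Respondents with no matching income record are coded 0
replace jobseekerinc = 0 if _merge == 1

* Compare the income-file flag with the questionnaire variables
tabulate jobseekerinc if m_benbase2 == 1 | m_nfh16 == 1, missing
tabulate _merge
```

Any rows with _merge equal to 2 would be people in the income file who have no record in indresp, which is worth inspecting before dropping them.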

As you can see, finding the variables of interest can be complex and time-consuming. For this reason, I won't be able to locate all other benefits you are interested in. However, all the necessary documentation is available in the questionnaires, on our website, and in the datasets. Hopefully, the search strategy I presented above will help you find the variables for other benefits you need.

Best wishes,
Piotr Marzec,
UKHLS User Support
