Support #449

Sexuality Variable

Added by Alex Best about 8 years ago. Updated almost 8 years ago.

Data documentation




I have downloaded the Understanding Society dataset and have realised that the variable c_sexuor, which should be in the dataset "c_indresp", is not actually there. This document suggests it should be:

Please could you either send me an updated version of the dataset or let me know why the sexual orientation variable is not in the dataset.


Updated by Redmine Admin about 8 years ago

  • Category set to Data documentation
  • Target version set to M3
  • % Done changed from 0 to 50

When I downloaded the data from UKDS a minute ago it was there(?).
On behalf of the team, Jakob


Updated by Alex Best about 8 years ago

I have just redownloaded it and it works, thank you! When I download the Stata version of the dataset I get the different waves, and within each wave there are lots of different files, e.g. c_income, c_indresp etc. What do the different files represent?


Updated by Redmine Admin about 8 years ago

  • Status changed from New to In Progress
  • % Done changed from 50 to 80

The various files are described here:
The main files are indresp (adult questionnaire), hhresp (household questionnaire), and youth (youth questionnaire).
Examples of data analysis can be found in the free materials offered by the training team (incl. online courses) here:
On behalf of the team, Jakob


Updated by Alex Best about 8 years ago

Thank you. I have just had a look at the descriptions of the waves. However, I cannot find the sexuality variable in waves 4 and 5?? In wave 3 it is c_sexuor, and I know there is no sexuality variable in waves 1 and 2. Can you please tell me if there is a sexuality variable in waves 4 and 5?


Updated by Redmine Admin about 8 years ago

  • Target version changed from M3 to X M
  • % Done changed from 80 to 90

*sexuor*&t=mainstage_variable
The question was on the adult self-completion questionnaire (CASI mode) at Wave 3 and again at Wave 5. At Wave 5 only 16-21 year olds were asked. NB Wave 5 data launched yesterday.
On behalf of the team, Jakob


Updated by Alex Best about 8 years ago

Ahhh ok, thanks. So for indresp (i.e. the adult questionnaire), are the people interviewed in wave 1 different to the people interviewed in waves 2, 3, 4 and 5?


Updated by Redmine Admin about 8 years ago

In short, we follow individuals and their households over time.


Updated by Alex Best about 8 years ago

Ok, but are all the adult self-completion questionnaires done with interviewers face-to-face or over the telephone or do they get sent the forms and they fill it out then send it to Understanding Society?


Updated by Alita Nandi about 8 years ago

Adults (16+) are interviewed mostly F2F (with some exceptions). Those who complete F2F interviews are also asked to fill in a self-completion questionnaire (paper, then CASI from W3).

Please see the User guide (esp. Section 2.3)

On behalf of the team.


Updated by Alex Best about 8 years ago


I have had a look at the income data within indresp in Stata, and the majority of respondents are recorded as inapplicable for things such as hourly pay, annual pay etc. Why is this? Do you have an income variable within indresp that accurately portrays the interviewee's income without the majority being inapplicable or don't know?


Updated by Alita Nandi about 8 years ago

Any job related question will be inapplicable for those who do not have a job. Please refer to the specific questions in the questionnaire and look at the "Universe".

You can find questionnaires here:

Also refer to the User Guide section on imputation of income and other variables.

On behalf of the team.
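
To illustrate the point about the "Universe", here is a minimal Python sketch of separating inapplicable values from valid ones before summarising a pay variable. The records are made up, and the code -8 for "inapplicable" is an assumption that should be checked against the value labels in the actual data file.

```python
from collections import Counter

# Assumed missing-value code for "inapplicable"; verify against the value labels.
INAPPLICABLE = -8

# Hypothetical hourly pay values: people without a job carry the inapplicable code.
hourly_pay = [12.5, INAPPLICABLE, INAPPLICABLE, 9.0, INAPPLICABLE]


def classify(value):
    """Label a raw value as valid, inapplicable, or some other missing code."""
    if value >= 0:
        return "valid"
    if value == INAPPLICABLE:
        return "inapplicable"
    return "other missing"


def summarise(values):
    """Count values by type and average only the valid (non-negative) ones."""
    counts = Counter(classify(v) for v in values)
    valid = [v for v in values if v >= 0]
    mean_valid = sum(valid) / len(valid) if valid else None
    return counts, mean_valid


counts, mean_valid = summarise(hourly_pay)
print(dict(counts))
print(mean_valid)
```

In this toy example most values are inapplicable simply because most of the made-up respondents are outside the question's universe, so the average is taken over the two valid values only.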


Updated by Alex Best about 8 years ago

Thank you, I have read the imputation of income and other variables section. However, it doesn't say which questionnaire the wage question is asked in. I know total personal income is asked in the individual adult self-completion questionnaire, but I was looking for the wage variable. Which file is this in? Also, the link you provided lists 5 questionnaires in the study, but when I download the dataset there are approximately 14 different datasets within each wave, each of which I assumed was a questionnaire. Out of these datasets, where is the wage variable found? Or the hourly wage variable?



Updated by Gundi Knies about 8 years ago

Hi Alex,
the online data set documentation includes a brief description of the content of the data files which you are unsure about. You can see there that information from interviews with adults (incl. the adult self-completion) is stored in the INDRESP data files. Hence, it is always a good idea to look there in the first instance.

As to finding information in the study and information on where the variable is stored, probably the best way to do this is to use the online data documentation and search for mainstage variables. The imputed variable is called _paygu_dv so you may want to use the search term "pay" (rather than "wage"). When you use the search engine you'll get all variables that include pay in the variable name or labels, and the information is linked to the variable level view of the variable, where you see in which data file the information is stored. You will also see a wave-specific frequency table and you can click through to the questionnaire, or to the frequency table of the same information in other waves.

Have you seen our online training course materials? The course contains worksheets on how the online documentation can be used, which survey instruments we use and where the information is stored. Here is a link to the online training courses:

Hope this helps,
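
As a small illustration of the naming pattern seen in this thread (c_sexuor in c_indresp for wave 3, and the root name _paygu_dv), the following Python sketch builds wave-specific variable and file names. It assumes that every wave uses the next letter of the alphabet as its prefix (wave 1 = a, wave 2 = b, and so on); the helper names are hypothetical and not part of any official tool.

```python
import string


def wave_prefix(wave):
    """Return the assumed letter prefix for a wave number (1 -> 'a', 3 -> 'c')."""
    if not 1 <= wave <= 26:
        raise ValueError("wave must be between 1 and 26")
    return string.ascii_lowercase[wave - 1]


def wave_variable(wave, root):
    """Join the wave prefix and a root name such as 'sexuor' or '_paygu_dv'."""
    return f"{wave_prefix(wave)}_{root.lstrip('_')}"


def wave_file(wave, file_root="indresp"):
    """Build a wave-specific data file name such as 'c_indresp'."""
    return f"{wave_prefix(wave)}_{file_root}"


for wave in (3, 5):
    print(wave_file(wave), wave_variable(wave, "_paygu_dv"))
```

The online documentation search remains the authoritative way to confirm which file a given variable is stored in for each wave.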


Updated by Alex Best almost 8 years ago

Thank you!

I was just wondering how I find out what the "proxy" answers refer to? I assume proxy means that other people in the household filled in the sexuality answer for someone else, so it is recorded as a proxy and has not been put into heterosexual, homosexual etc. How do I put them into their respective categories?


Updated by Alita Nandi almost 8 years ago

Hi Alex,

Proxy interview: When a person is unable to give an interview, in some cases their spouse or adult children, who may know the basic details about them, give the interview on their behalf. As it is someone else answering on their behalf, questions are restricted to very basic factual information. None of the self-completion questions (as these are sensitive) or any other attitudinal questions (subjective measures, values, beliefs etc.) are asked, as we don't expect someone else to be able to answer these questions on the respondent's behalf.

When a question is not included in the proxy interview questionnaire, you will see that the value of that variable for the proxy respondents is -7. This means that this cannot be recoded into any of the valid response categories as this is missing data - it was not asked.

These types of data issues are explained in the training material, as Gundi has suggested.

Best wishes,
On behalf of the team.
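
A minimal Python sketch of the handling described above: proxy values (-7) are kept out of the valid categories and treated as missing. The category labels and the sample values are hypothetical, and treating every negative code as missing is an assumption to check against the data documentation.

```python
# Value reported in this thread for questions not asked in the proxy interview.
PROXY = -7

# Hypothetical value labels; the real labels come from the data file.
labels = {1: "category A", 2: "category B"}


def clean_response(value, value_labels):
    """Return the label for a valid code, or None for a negative (missing) code.

    A non-negative code with no entry in value_labels also yields None.
    """
    if value < 0:
        return None
    return value_labels.get(value)


raw = [1, PROXY, 2, PROXY, 1]
cleaned = [clean_response(v, labels) for v in raw]
n_proxy = raw.count(PROXY)

print(cleaned)
print(n_proxy)
```

Counting the proxy cases separately, as n_proxy does here, keeps a record of how many observations were dropped because the question was never asked.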


Updated by Redmine Admin almost 8 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 90 to 100
