Understanding Society User Support: Issues | https://iserredex.essex.ac.uk/support/ | 2017-12-04T17:45:15Z
Understanding Society User Support - Support #886 (Closed): Zero weights and statistical power | https://iserredex.essex.ac.uk/support/issues/886 | 2017-12-04T17:45:15Z | Eric Emerson | eric.emerson@lancaster.ac.uk
<p>Hi</p>
<p>I'm interested in data contained in the harassment modules (in Waves 1, 3, 5 and 7), but am concerned about the significant reduction in statistical power arising from the increasing proportion of respondents who are assigned values of 0 in w_ind5mus_xw. I understand from a previous thread (<a class="issue tracker-3 status-3 priority-5 priority-high2" title="Support: weights for pooled cross-sections over waves (a)-(f) (Resolved)" href="https://iserredex.essex.ac.uk/support/issues/877">#877</a>) that ..... 'The provision of weights requires the ability to estimate probabilities of continuing to respond over multiple waves. This is true of cross-sectional weights as well as longitudinal ones, as they are derived from the longitudinal ones (how this was done is described in section 3.8.3.10 of the User Guide). In consequence, a person in a household where there is no person who has been enumerated at every wave up to wave w will get a weight of zero. Such people should not be given a weight, as the weights for all other sample members are calculated in a way that compensates for these "missing" people.'</p>
<p>However, the 'compensation' also appears to result in a significant loss of statistical power. Taking as the base the unweighted number of respondents who provide a valid answer to the 'attacked' items, the weighted population size has fallen from 92% of actual respondents in W1 (7418/8072) to just 27% in W7 (2711/9973). This reduction in power is a concern and, given the rationale outlined above, it will keep growing over time as the percentage of households in which someone has been enumerated at every wave continues to diminish. It also seems rather wasteful of people's time that the responses of the majority of participants are, through the weighting process, assigned to a statistical waste bin!</p>
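<p>The two ratios quoted above can be reproduced with a short calculation. The sketch below also includes Kish's approximation to the effective sample size, (sum of weights)^2 / (sum of squared weights), purely as an illustration of why zero weights reduce power. The function names and the example weights are hypothetical and are not taken from the survey files, and the approximation ignores clustering and stratification.</p>

```python
def weighted_share(n_weighted, n_unweighted):
    """Weighted total expressed as a fraction of the unweighted respondent count."""
    return n_weighted / n_unweighted


def kish_effective_n(weights):
    """Kish approximation: (sum w)^2 / sum(w^2). Ignores clustering and stratification."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / total_sq if total_sq > 0 else 0.0


# Figures quoted in the post: weighted total / unweighted respondents.
share_w1 = weighted_share(7418, 8072)
share_w7 = weighted_share(2711, 9973)
print(f"Wave 1: {share_w1:.1%}  Wave 7: {share_w7:.1%}")

# Hypothetical weights: cases with weight 0 add nothing to the effective sample size.
example_weights = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(kish_effective_n(example_weights))
```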
<p>I would be very grateful if you could suggest any ways round this problem.</p>
<p>Many thanks</p>
<p>Eric</p>
Understanding Society User Support - Support #848 (Closed): Clinical Depression H_COND variables | https://iserredex.essex.ac.uk/support/issues/848 | 2017-09-04T15:55:51Z | Luca Bernardi | luca.bernardi@uab.cat
<p>Dear Support group,</p>
<p>I am measuring clinical depression and would be grateful for your advice on a couple of questions. I apologise sincerely for putting immediate priority on this, but your answer might also have implications for a paper I am co-authoring within the Understanding Society EU Referendum project, and we have a deadline shortly for submitting the paper.</p>
<p>As I am interested in objective depression, I have been using the questions H_COND17 and H_CONDS17 to create a measure of depression. I assign the value 1 to respondents who replied that they still have depression (H_CONDS17=Yes). As I am interested in the effects of depression, it matters less whether the person was diagnosed with depression at some point in his/her life (i.e. H_COND17=Yes) than whether the person is depressed at the time of the interview. I assign the value 0 if the respondent reported that he/she has never been diagnosed with depression (H_COND17=No).</p>
<p>So far I was using data from waves 1, and 3 to 6 as I noticed that these two variables are available in all waves but wave 2 (<a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/2/datafile/b_indresp">https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/2/datafile/b_indresp</a>), where instead a slightly different question is asked: H_CONDN17. In turn, this question is not available in all waves and sometimes is asked together with the previous two questions (e.g., <a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/4/datafile/d_indresp">https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/4/datafile/d_indresp</a>).</p>
<p>My questions are therefore the following. Do you know the reason for this variation and, more importantly, can I "maximise" my number of depressives by creating a measure of depression that combines both sets of questions (i.e., H_COND17 and H_CONDS17, and H_CONDN17) and makes use of all available waves (i.e. 1 to 6)?</p>
<p>My idea was to do the following:</p>
<p>gen depression = .</p>
<p>replace depression = 1 if hconds17==1 | hcondn17==1</p>
<p>replace depression = 0 if hcond17==0 | hcondn17==0</p>
<p>However, I wonder how problematic it is to mix questions that are not available in all waves, as this is certainly a point that reviewers will raise. I would really appreciate your thoughts on this.</p>
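<p>To make the behaviour of the three Stata lines above explicit, here is a minimal Python sketch that mirrors them step by step. It assumes, as the Stata conditions do, that 1 means yes/mentioned and 0 means no/not mentioned; the actual value labels in the released files should be checked against the documentation. <code>None</code> stands in for a Stata missing value, and the function name is hypothetical.</p>

```python
def depression_indicator(hcond17, hconds17, hcondn17):
    """Mirror the three Stata lines in order; None plays the role of Stata's '.'."""
    depression = None                       # gen depression = .
    if hconds17 == 1 or hcondn17 == 1:      # replace depression = 1 if hconds17==1 | hcondn17==1
        depression = 1
    if hcond17 == 0 or hcondn17 == 0:       # replace depression = 0 if hcond17==0 | hcondn17==0
        depression = 0
    return depression


# Still has depression, asked via the H_COND17 / H_CONDS17 pair.
print(depression_indicator(hcond17=1, hconds17=1, hcondn17=None))
# Only the H_CONDN17 question was asked and depression was mentioned.
print(depression_indicator(hcond17=None, hconds17=None, hcondn17=1))
# Conflicting inputs: the second replace runs last, so it wins.
print(depression_indicator(hcond17=0, hconds17=1, hcondn17=None))
```

<p>Because the second <code>replace</code> is applied last, any case that satisfies both conditions ends up coded 0, so the order of the two statements matters wherever the sources disagree.</p>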
<p>Many thanks and best wishes,<br />Luca</p>
Understanding Society User Support - Support #513 (Closed): e_indscus_lw | https://iserredex.essex.ac.uk/support/issues/513 | 2016-02-26T12:35:08Z | Orla McBride
<p>Hello,</p>
<p>I'm trying to use this longitudinal weight variable for analyzing data across waves 2-5 from the self-completion questionnaire. I'm trying to apply the weight variable in my analysis but given that ~21,000 cases have been assigned a value of 0, this means that the weight is viewed as missing in software such as Mplus.</p>
<p>Any recommendations for how to get around this? I was thinking that it might be possible to make the 0 values non-zero (e.g. 0.0000000000000001) and was wondering what your thoughts would be on this? Any other suggestions would be welcomed.</p>
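<p>As an illustration of one alternative to substituting tiny positive values, the sketch below shows that a weighted mean is unchanged when zero-weight cases are simply dropped, since they contribute nothing to either the numerator or the denominator. The data and function names are hypothetical. This covers point estimates only; variance estimation under the complex survey design is a separate matter that the sketch does not address.</p>

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(v * w) / sum(w)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def drop_zero_weight(values, weights):
    """Keep only the cases whose weight is strictly positive."""
    kept = [(v, w) for v, w in zip(values, weights) if w > 0]
    return [v for v, _ in kept], [w for _, w in kept]


# Hypothetical scores and longitudinal weights; two cases carry a weight of 0.
values = [3.0, 5.0, 8.0, 2.0]
weights = [1.5, 0.0, 0.5, 0.0]

mean_all_cases = weighted_mean(values, weights)
kept_values, kept_weights = drop_zero_weight(values, weights)
mean_positive_only = weighted_mean(kept_values, kept_weights)

print(mean_all_cases, mean_positive_only)
```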
<p>Kind regards<br />Orla</p>
Understanding Society User Support - Support #506 (Closed): weights and design variables query Wa... | https://iserredex.essex.ac.uk/support/issues/506 | 2016-02-19T11:40:25Z | Orla McBride
<p>Hello,</p>
<p>I was wondering if you could answer a question I have about weights and psu/strata variables for analysing Understanding Society data.</p>
<p>If I want to analyse data from the BHPS, GPS, and EMB samples from Waves 2-5, which come from both the self-completion questionnaire and the main survey, should I use:</p>
<p>e_indscus_lw: Longitudinal adult self-completion questionnaire weight<br />e_strata: Sampling strata<br />e_psu: Primary sampling unit</p>
<p>Many thanks for your time.</p>
<p>Kind regards<br />Orla</p>
Understanding Society User Support - Support #505 (Closed): Cross Sectional weights (xrwght/xrwtu... | https://iserredex.essex.ac.uk/support/issues/505 | 2016-02-17T14:02:51Z | Ivan Privalko
<p>Hi all,</p>
<p>I'm using the BHPS dataset (w1-w18) to look at the impact of job changes. Using the job history files, I code a variable for job changes (combining spell change "jspno" with reasons for change "jhstpy"). I'm tabulating this new variable by time to look at changes between waves, but I can't include the cross sectional weights (xrwght/xrwtuk1) in this table. After looking around for a bit I noticed that those with values in "job history files" contain no values for "cross sectional weight" variables, and those with "cross sectional weight values" have no values for "job history files" in any given year. Am I doing something wrong?</p>
Understanding Society User Support - Support #503 (Closed): Inconsistencies in self-completion mo... | https://iserredex.essex.ac.uk/support/issues/503 | 2016-02-15T14:34:25Z | Till Hoffmann
<p>In wave three of the Understanding Society survey, there are six entries for respondents who have refused the self-completion part of the interview but have positive self-completion interview weights. In particular, c_scac is 4 (refused) or 5 (not able to complete) (see <a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/3/datafile/c_indresp/variable/c_scac">https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/3/datafile/c_indresp/variable/c_scac</a>) but c_indscub_xw is nonzero (see <a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/3/datafile/c_indresp/variable/c_indscub_xw">https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/3/datafile/c_indresp/variable/c_indscub_xw</a>).</p>
<p>Similarly, there are three entries indicating that the self-completion part of the interview was refused but the response to the "What is the sex of your first friend?" question contains valid data even though the question is in the self-completion module.</p>
<p>Am I missing something?</p>
Understanding Society User Support - Support #498 (Closed): weight youth self-completion + adult | https://iserredex.essex.ac.uk/support/issues/498 | 2016-02-04T10:02:59Z | Carolina Zuccotti | carolina.zuccotti@eui.eu
<p>Hello,<br />I would like to follow individuals (14-15 yrs) who completed the self-completion youth questionnaire into the adult questionnaire (16+). I am interested in the questions on parental involvement and how this affects their adult outcomes.<br />How should I weight this?<br />Let's say that I consider 14-15 yrs individuals in wave 1 and I follow them in wave 2 (and/or 3).<br />Many thanks,<br />Carolina</p>
Understanding Society User Support - Support #462 (Closed): Longitudinal weights | https://iserredex.essex.ac.uk/support/issues/462 | 2015-12-08T10:14:12Z | Rory Coulter | rcc46@cam.ac.uk
<p>Hi,</p>
<p>I'm conducting a longitudinal analysis of waves 1-5 using the GPS and EMB samples and have a quick question about the correct weight to use. Basically I'm interested in residential mobility between pairs of waves during the W1-5 period but I'm not 100% sure how to specify the correct longitudinal weights.</p>
<p>The User Guide says to take the longitudinal weight (w_indinus_lw in my case I think) from the last wave of data in use and apply this to all waves provided by respondents. However this assumes a balanced panel as only those people interviewed at every wave will end up with a longitudinal weight. Because the number of movers is relatively limited I'd like to also include cases provided by people until they stop responding for the first time. Is this possible and how should I create the weight? Can I for example just create a panel containing only person-years until people stop completing an interview for the 1st time and use their final longitudinal weight value in the analysis?</p>
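<p>The row selection described above (keeping each person's waves only until they first fail to respond) can be sketched as follows. The function name and the example response patterns are hypothetical. The sketch only shows which person-waves would be retained; whether a final longitudinal weight is appropriate for such an unbalanced panel is exactly the open question.</p>

```python
def waves_until_first_gap(responded_waves, first_wave=1):
    """Return the consecutive waves from first_wave up to the first missed wave."""
    kept = []
    expected = first_wave
    for wave in sorted(set(responded_waves)):
        if wave != expected:
            break  # first gap reached: stop keeping waves for this person
        kept.append(wave)
        expected += 1
    return kept


# Hypothetical response patterns keyed by a made-up person identifier.
responses = {
    "person_a": [1, 2, 3, 4, 5],   # responded at every wave
    "person_b": [1, 2, 3, 5],      # missed wave 4, returned at wave 5
    "person_c": [2, 3],            # not interviewed at wave 1
}

panel = {pid: waves_until_first_gap(waves) for pid, waves in responses.items()}
print(panel)
```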
<p>Thanks in advance. <br />Rory</p>
Understanding Society User Support - Support #456 (Closed): comparing across waves | https://iserredex.essex.ac.uk/support/issues/456 | 2015-11-27T16:55:32Z | Carolina Zuccotti | carolina.zuccotti@eui.eu
<p>Hello,<br />I wanted to know if it is possible to compare the effect of a variable in wave 1 with its effect in wave 5.<br />For example, does education have a stronger effect on the probability of employment in 2009/2010 than in 2013/14?<br />To the naked eye, there seems to be a difference in the effect across waves. However, do you know if there is a way to actually test this?<br />I assume I would need to pool waves. In that case, how should I weight the cases?<br />Many thanks in advance.<br />Carolina</p>
Understanding Society User Support - Support #453 (Closed): Zero value weight with nurse data com... | https://iserredex.essex.ac.uk/support/issues/453 | 2015-11-24T08:15:09Z | Gaelle Albertus
<p>Hi,</p>
<p>I am working on the nurse health assessment data combined with the blood sample data and the full main interview. I focus only on wave 2.<br />My aim is to describe the biomarkers and use them to construct a score representative of health condition according to gender and age.</p>
My problem is about the weighting strategy. I already posted a message on the forum (<a class="issue tracker-3 status-5 priority-5 priority-high2 closed" title="Support: Sample size and weight within blood sample w2 (Closed)" href="https://iserredex.essex.ac.uk/support/issues/430">#430</a>) and now I consider the <strong>indbdub_xw</strong> weight variable.
<ul>
<li>Why are there people with a weight of 0? I count 670 such individuals out of n=9882. In addition, most of them have no missing values on the biomarkers, so why are they not used?</li>
<li>I have to delete individuals who have missing values (because I intend to run a PCA on my data). Is it still correct to use the indbdub_xw variable?</li>
</ul>
<p>Thank you for your response.<br />Gaelle</p>
Understanding Society User Support - Support #440 (Closed): Longitudinal Regression Analysis Weights | https://iserredex.essex.ac.uk/support/issues/440 | 2015-11-01T18:55:41Z | Esther Afolalu | e.f.afolalu@warwick.ac.uk
<p>Hello. I am working with the Understanding Society data, looking specifically at the self-completion questionnaire data for the sleep and health questions. I am carrying out a longitudinal regression analysis to explore the association between change in individual sleep status and health outcomes from wave 1 – wave 4, controlling for a number of other variables. I just wanted to double-check which longitudinal weight I should apply to the regression analysis – I am thinking ‘d_indscus_lw’? And for descriptive statistics to describe the initial sample at wave one, would I just use the ‘a_indscus_xw’ weighting?</p>
<p>Also, if I wanted to incorporate nurse assessment CRP biomarker data at Wave 2 as a mediator or examine the association from Wave 1 sleep status to Wave 2 biomarker status, which weighting would I apply in this case 'b_indnsus_lw'? And lastly, is there a weighting that’s applicable perhaps to look at the association from Wave 2 biomarker status to Wave 4 sleep?</p>
<p>Thank you,<br />Esther.</p>
Understanding Society User Support - Support #430 (Closed): Sample size and weight within blood s... | https://iserredex.essex.ac.uk/support/issues/430 | 2015-10-09T08:25:29Z | Gaelle Albertus
<p>Hi,</p>
<p>I am working on Nurse Health Assessment database in order to use data related to biomarkers and health. I decided to use the wave 2 from the survey combined with the main sample (wave 2 - GPS sample).</p>
<p>I am contacting you because the user manual gives a number of subjects equal to 10,175, whereas when I import the database (with Stata) I only have 9,906 individuals for wave 2.<br />In addition, I looked at the variable "b_nuroutc" in the b_indresp file from the Nurse Assessment and found 9,957 individuals in the category "nurse visit conducted - blood sample sent to lab".<br />How can this difference be explained?</p>
<p>Next, I thought about using the cross-sectional weight for the GPS sample only (b_indnsus_xd), but I am wondering whether I can use it even though I don't have the correct number of subjects. Is this weight still accurate given that I am missing approximately 300 individuals?</p>
<p>Thank you for helping me.<br />Kind regards,</p>
<p>Gaelle</p>
Understanding Society User Support - Support #412 (Closed): Weights for BHPS and Understanding So... | https://iserredex.essex.ac.uk/support/issues/412 | 2015-09-09T13:04:23Z | Andreas Wiedemann | awiedem@mit.edu
<p>Hello,</p>
<p>I’ve merged the BHPS with the BHPS-subset of Understanding Society to create a longitudinal panel of BHPS respondents up until 2012 (i.e. I use the BHPS portion of Understanding Society). I am not entirely sure which weights I should use for the analysis. I’ve read the documentation of both datasets, but it is still not clear which weights are best for my purpose. My goal is to re-create the same underlying population in both datasets, either for the UK or GB. Most importantly, however, I want to be consistent across these two datasets in order to analyze trends in, e.g., income over a time span covering both. Most of my variables of interest are at the household level, but some are at the individual level. <br />Should I use the longitudinal BHPS weights (indin91_lw for individuals or the cross-sectional hhdenbh_xw for households)? And do I have to use weights only in the Understanding Society part or also in the BHPS part of my panel?</p>
<p>Many thanks for your help,<br />Andreas</p>
Understanding Society User Support - Support #261 (Closed): Weighting for wave 2 | https://iserredex.essex.ac.uk/support/issues/261 | 2014-05-09T12:29:48Z | Fatemeh Behzadnejad | fatemeh.behzadnejad@mail.mcgill.ca
<p>Dear support team,</p>
<p>I am using wave 2 of the US survey in a cross-sectional analysis. I need to account for all respondents, including those from the BHPS, but I am not quite sure about the weighting.<br />According to page 38 of the user guide, there are two weights for this wave: indinus_xw and indinbh_xw. Can I combine these two weights? (I mean that for each respondent I would use whichever of these two weights is available.)</p>
<p>I would appreciate it if you could answer my question.</p>
Understanding Society User Support - Support #245 (Closed): cross sectional hh weights in US w1/2/3 | https://iserredex.essex.ac.uk/support/issues/245 | 2014-02-26T13:37:33Z | Ian Alcock | ian.alcock01@btinternet.com
<p>I am confused by the differences in the cross-sectional household weights available in the US a_hhresp b_hhresp and c_hhresp files. My understanding is this: in a_hhresp is a_hhdenus_xw which weights the households originating with Understanding Society (which comprise all households in this wave); in b_hhresp are b_hhdenbh_xw which weights the households originating with BHPS (and is set to 0 for households originating with Understanding Society) and b_hhdenus_xw which weights the households originating with Understanding Society (and is set to 0 for households originating with BHPS); in c_hhresp is c_hhdenub_xw which weights all households together, i.e. weights across households originating with BHPS and US. My questions: 1) Is my understanding correct? 2) If my understanding is correct, how do I weight all households in b_hhresp together (as I can do for households in c_hhresp), and how do I weight only the households originating in BHPS in c_hhresp (as I can do for households in b_hhresp). I want to do both of these things; I want to produce weighted quintiles of income in the previous month for the bhps originating households (so that the weighting increases their UK representativeness) in both b_hhresp and c_hhresp, and I want to produce weighted quintiles of income in the previous month for all available households (so that the weighting increases their UK representativeness) in both b_hhresp and c_hhresp, but I appear to be able to do only the former in b_hhresp and only the latter in c_hhresp. 3) What accounts for the difference in the cross-sectional household weights available in b_ and c_ ? Big Thank you in advance!</p>
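<p>For reference, "weighted quintiles" as used above can be sketched as follows: sort households by income, accumulate their weights, and record the income at which the cumulative weight first reaches each fifth of the total. This is one of several common conventions for weighted quantiles. The function name and the example incomes and weights are hypothetical and are not drawn from the hhresp files; zero-weight households are excluded because they cannot affect the cutpoints.</p>

```python
def weighted_quantile_cutpoints(values, weights, n_groups=5):
    """Values at which cumulative weight first reaches k/n_groups of the total."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    targets = [total * k / n_groups for k in range(1, n_groups)]
    cutpoints = []
    cumulative = 0.0
    next_target = 0
    for value, weight in pairs:
        cumulative += weight
        # One heavily weighted household can cross several targets at once.
        while next_target < len(targets) and cumulative >= targets[next_target]:
            cutpoints.append(value)
            next_target += 1
    return cutpoints


# Hypothetical monthly household incomes and cross-sectional weights.
incomes = [100, 200, 300, 400, 500]
equal_weights = [1, 1, 1, 1, 1]
uneven_weights = [2, 0, 1, 1, 1]   # the second household has a zero weight

print(weighted_quantile_cutpoints(incomes, equal_weights))
print(weighted_quantile_cutpoints(incomes, uneven_weights))
```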