Understanding Society User Support: Issues | https://iserredex.essex.ac.uk/support/ | 2017-12-04T17:45:15Z
Support #886 (Closed): Zero weights and statistical power | https://iserredex.essex.ac.uk/support/issues/886 | 2017-12-04T17:45:15Z | Eric Emerson (eric.emerson@lancaster.ac.uk)
<p>Hi</p>
<p>I'm interested in data contained in the harassment modules (in Waves 1, 3, 5 and 7), but am concerned about the significant reduction in statistical power arising from the increasing proportion of respondents who are assigned values of 0 in w_ind5mus_xw. I understand from a previous thread (<a class="issue tracker-3 status-3 priority-5 priority-high2" title="Support: weights for pooled cross-sections over waves (a)-(f) (Resolved)" href="https://iserredex.essex.ac.uk/support/issues/877">#877</a>) that ..... 'The provision of weights requires the ability to estimate probabilities of continuing to respond over multiple waves. This is true of cross-sectional weights as well as longitudinal ones, as they are derived from the longitudinal ones (how this was done is described in section 3.8.3.10 of the User Guide). In consequence, a person in a household where there is no person who has been enumerated at every wave up to wave w will get a weight of zero. Such people should not be given a weight, as the weights for all other sample members are calculated in a way that compensates for these "missing" people.'</p>
<p>However, the 'compensation' also appears to result in a significant loss of statistical power. Taking as base the unweighted number of respondents who provide a valid answer to the 'attacked' items, the weighted population size has reduced from 92% of actual respondents in W1 (7418/8072) to just 27% in W7 (2711/9973). The resulting reduction in power is of concern and, given the rationale outlined above, will continue to increase over time as the % of households in which someone has been enumerated at every wave continues to diminish. It also seems rather wasteful of people's time that the responses of the majority of participants are, through the weighting process, assigned to a statistical waste bin!</p>
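<p>As a rough illustration of the arithmetic in the paragraph above, the following minimal Python sketch recomputes the two quoted shares and shows one common way of gauging power loss under unequal weights, the Kish approximate effective sample size. The function names and the toy weight vector are hypothetical and are not taken from the Understanding Society data; the Kish formula is an assumption about how one might quantify the loss, not the method used by the survey team.</p>

```python
def weighted_share(weighted_n, unweighted_n):
    """Weighted count expressed as a fraction of unweighted respondents."""
    return weighted_n / unweighted_n


def kish_effective_n(weights):
    """Kish approximation: (sum of w)^2 / (sum of w^2).

    Cases with a weight of zero add nothing to either sum, so they do not
    contribute to the effective sample size.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / total_sq if total_sq else 0.0


# Figures quoted in the post above.
share_w1 = weighted_share(7418, 8072)
share_w7 = weighted_share(2711, 9973)
print(f"W1: {share_w1:.0%}  W7: {share_w7:.0%}")

# Hypothetical toy weights: six respondents, two of them with zero weight.
toy_weights = [0.0, 0.0, 1.5, 0.5, 1.0, 1.0]
toy_effective_n = kish_effective_n(toy_weights)
print(f"Effective n for toy weights: {toy_effective_n:.2f} of {len(toy_weights)}")
```

<p>With real data, the weight column (for example w_ind5mus_xw) would replace the toy list.</p>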
<p>I would be very grateful if you could suggest any ways round this problem.</p>
<p>Many thanks</p>
<p>Eric</p>
Support #881 (Closed): weighting values of zero | https://iserredex.essex.ac.uk/support/issues/881 | 2017-11-21T11:51:17Z | Andrew Brown
<p>Hi</p>
<p>Could I ask a question related to issue <a class="issue tracker-3 status-3 priority-5 priority-high2" title="Support: weights for pooled cross-sections over waves (a)-(f) (Resolved)" href="https://iserredex.essex.ac.uk/support/issues/877">#877</a>?</p>
<p>For those cases not assigned a weight because 'a person in a household where there is no person who has been enumerated at every wave up to wave w will get a weight of zero. Such people should not be given a weight, as the weights for all other sample members are calculated in a way that compensates for these "missing" people'</p>
<p>How could/should they be included in any analysis, as SPSS 'makes them invisible'? Could they be assigned the mean weight of 1? Or should they be excluded from any analysis, as the weighting for the other cases takes account of this?</p>
<p>Many thanks</p>
<p>Andrew</p>
Support #878 (Closed): employment histories | https://iserredex.essex.ac.uk/support/issues/878 | 2017-11-10T17:01:37Z | marco tosi (M.Tosi@lse.ac.uk)
<p>Hello,<br />I am trying to reconstruct employment histories in wave 1 to analyse implications for health in the later waves. Since employment histories are collected for three quarters of the sample in wave 5, I am using both a_empstat (for the first quarter of the sample) and e_empstat (for the remaining sample). Some employment histories of the second group (which should be present in e_empstat) are missing because some people drop out between waves 1 and 5. Am I correct? If yes, would it be better to use only the first quarter of the sample (which is randomly selected)?</p>
Support #872 (Closed): Weights of the continuation of the BHPS sample into UKHLS | https://iserredex.essex.ac.uk/support/issues/872 | 2017-10-26T13:23:54Z | Lawrence Sacco (lawrence.sacco@kcl.ac.uk)
<p>To whom it may concern,</p>
<p>I am using the BHPS sample, from its first wave (1991) through to the most recent wave of the UKHLS (selecting only BHPS members). I read about the BHPS and UKHLS weights in the documentation and training courses, but I still was not sure which weight I should use when restricting the analysis to only the BHPS using also its continuation. Furthermore, I could not find any other previous studies that have used the continuation of the BHPS sample into the UKHLS applying weights for the estimates. Therefore I was wondering whether there is a recommended procedure to ensure the analysis is representative of the UK or Great Britain, when following the BHPS sample into the UKHLS.</p>
<p>Thanks,<br />Lawrence</p>
Support #848 (Closed): Clinical Depression H_COND variables | https://iserredex.essex.ac.uk/support/issues/848 | 2017-09-04T15:55:51Z | Luca Bernardi (luca.bernardi@uab.cat)
<p>Dear Support group,</p>
<p>I am measuring clinical depression and I would kindly need your advice on a couple of questions. I apologise sincerely for putting immediate priority on this, but your answer might also have implications for a paper I am co-authoring within the Understanding Society EU Referendum project and we have a deadline shortly for submitting the paper.</p>
<p>As I am interested in objective depression, I was using the questions H_COND17 and H_CONDS17 to create a measure of depression. What I was doing is to assign value 1 to respondents who replied that they still have depression in H_CONDS17=Yes (as I am interested in the effects of depression, I do not care much if the person was diagnosed with depression at some point in his/her life - i.e. H_COND17=Yes - but rather it is important that the person is depressed at the time of the interview). I assign value 0 if the respondent mentioned that he/she has never been diagnosed with depression in H_COND17=No.</p>
<p>So far I was using data from waves 1, and 3 to 6 as I noticed that these two variables are available in all waves but wave 2 (<a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/2/datafile/b_indresp">https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/2/datafile/b_indresp</a>), where instead a slightly different question is asked: H_CONDN17. In turn, this question is not available in all waves and sometimes is asked together with the previous two questions (e.g., <a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/4/datafile/d_indresp">https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/4/datafile/d_indresp</a>).</p>
<p>My questions are thus the following. Do you know the reason for this variation and, more importantly, can I "maximise" my number of depressives by creating a measure of depression that combines both sets of questions (i.e., H_COND17 and H_CONDS17, and H_CONDN17) and makes use of all available waves (i.e. 1 to 6)?</p>
<p>My idea was to do the following:</p>
<p>gen depression = .</p>
<p>replace depression = 1 if hconds17==1 | hcondn17==1</p>
<p>replace depression = 0 if hcond17==0 | hcondn17==0</p>
<p>However, I wonder how problematic it is to mix questions that are not available in all waves, as this is certainly a point that reviewers will raise. I would really appreciate your thoughts on this.</p>
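<p>To make the behaviour of the three Stata lines above easier to inspect, here is a minimal Python sketch that mirrors them for a single respondent. It assumes, purely for illustration, that each variable is coded 1 for yes, 0 for no, and None for any missing value; the function names are hypothetical. One point it makes visible: because the second <code>replace</code> runs last, a respondent with hconds17 equal to 1 but hcondn17 equal to 0 ends up coded 0. The alternative function, where a "yes" takes precedence, is only one possible choice and not a recommendation from the support team.</p>

```python
def depression_as_posted(hcond17, hconds17, hcondn17):
    """Mirror the posted Stata logic: the later replace overrides the earlier one."""
    value = None                          # gen depression = .
    if hconds17 == 1 or hcondn17 == 1:    # replace depression = 1 if ...
        value = 1
    if hcond17 == 0 or hcondn17 == 0:     # replace depression = 0 if ...
        value = 0
    return value


def depression_yes_wins(hcond17, hconds17, hcondn17):
    """Hypothetical alternative: any current 'yes' takes precedence over a 'no'."""
    if hconds17 == 1 or hcondn17 == 1:
        return 1
    if hcond17 == 0 or hcondn17 == 0:
        return 0
    return None


# A conflicting case: still has depression (hconds17 = 1) but hcondn17 = 0.
conflict = dict(hcond17=1, hconds17=1, hcondn17=0)
print(depression_as_posted(**conflict), depression_yes_wins(**conflict))
```

<p>Tabulating how many respondents fall into such conflicting combinations in each wave would show whether the order of the two replacements matters in practice.</p>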
<p>Many thanks and best wishes,<br />Luca</p>
Support #847 (Closed): Complex samples suite in SPSS | https://iserredex.essex.ac.uk/support/issues/847 | 2017-09-04T15:52:54Z | Carla Ayrton
<p>Hello,</p>
<p>I am carrying out an analysis of people whose income changed between waves 5 and 6. I have followed the 'Introduction to Understanding Society using SPSS' course, but I am not sure how to weight the data for my analysis. I am using the weight f_indpxus_lw and I have used f_strata and f_psu to account for the complex survey design. However, in the SPSS complex samples suite I cannot find a way to generate medians or quartiles. Can I calculate these using just the longitudinal weight (f_indpxus_lw)?</p>
<p>Thank you,</p>
<p>Carla Ayrton</p>
Support #839 (Closed): Pooling data from all waves, 1-6, using all subsamples of USoc (GPS, EMBS,... | https://iserredex.essex.ac.uk/support/issues/839 | 2017-08-21T10:54:34Z | Nico Ochmann (nico.ochmann@postgrad.manchester.ac.uk)
<p>Dear Peter,</p>
<p>Sorry to bother you again, but we had that exchange in support <a class="issue tracker-3 status-5 priority-4 priority-default closed" title="Support: weights for pooled cross-sections over waves (a)-(f) (Closed)" href="https://iserredex.essex.ac.uk/support/issues/758">#758</a>. <br />I have been implementing your suggestions as we discussed. I am contacting you again because I would really appreciate your <br />having another quick look at our previous exchange and this current issue. With the weighting scheme, my results change quite a bit (point estimates and standard errors), so I really want to make sure that I am doing things right. I think I am, but I would rather double-check. <br />I am using all subsamples of all six waves and I generate a new weighting variable accordingly (newwgt). I use these observations across waves as if they were repeated cross-sections. For various reasons, I cannot utilize the panel data structure. The time dimension of the pooled cross-sections is not the wave variable but the year variable, istrtdaty, the start of the individual interview. Given this information and the discussion we had in support <a class="issue tracker-3 status-5 priority-4 priority-default closed" title="Support: weights for pooled cross-sections over waves (a)-(f) (Closed)" href="https://iserredex.essex.ac.uk/support/issues/758">#758</a>, I would highly appreciate your verification that I am doing things correctly. <br />Thank you very much in advance! <br />Nico</p>
Support #836 (Closed): Weighting for general population | https://iserredex.essex.ac.uk/support/issues/836 | 2017-08-15T15:31:19Z | Ray Storry (raystorry@incomesdataresearch.co.uk)
<p>I have run a query on employed persons who expect to move in the next year (Yes/No). I have weighted the data in Wave 6 with f_indpxui_xw. The total population I am getting, including don't knows etc., is 23557.862. But this looks to me like the total number of employed respondents in the sample. There are around 31.4 million employed persons in the UK.</p>
<p>So how do I get from the weighted figures shown above to anywhere near the real-world figures?</p>
Support #825 (Closed): BHPS weights for BHPS components in UKHLS wave 6 | https://iserredex.essex.ac.uk/support/issues/825 | 2017-07-28T08:40:20Z | Sook Kim (cps500@gmail.com)
<p>Our team is analysing gender pay gap using BHPS samples. We require two separate weights, A and B, and we exclude Northern Ireland and IEMBS samples.</p>
<p>A: We'd like to use UKHLS wave 6, adult cross-sectional data, BHPS sample only. I don't seem to find a weighting variable for BHPS sub-sample. Could you advise me? Currently, we use "indpxui_xw" but I'm wondering if there's a weight for BHPS sample only? (In the wage regression, we're entering work-life history beginning from 1991 to 2015, in order to analyse the long-term impact of work-life history)</p>
<p>B: We'd like to use UKHLS wave 6, cross-sectional data, UKHLS with BHPS sub-sample. (In the wage regression, we're entering work-life history beginning from 2010 to 2015.) Again, we currently specify "indpxui_xw" as the weight, but this includes IEMBS.</p>
<p>Please let me know if this is clear and any advice would be greatly appreciated.</p>
Support #823 (Closed): BHPS Weights | https://iserredex.essex.ac.uk/support/issues/823 | 2017-07-26T15:44:58Z | Liviu Nafornita
<p>I spoke briefly on the phone to one of your ISER colleagues about getting some information about weights in the BHPS. I was then directed to this page.</p>
<p>The issue I am having is that, having inspected the longitudinal weights (wLRWGHT), I am a little confused over how to use them properly!</p>
<p>So I notice in volume A of the documentation, it states the means of these weights as having a value of 1.00 – except for wave R, where the individual respondent longitudinal weight has a value of 2. When I tried to replicate these means, I found that the weights had a mean value of ~0.85 at wave B, down to ~0.35 at wave R – however when excluding values with a 0 weight, the mean for the weights aligned themselves to the value stated in Volume A – So, 1.00.</p>
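<p>The relationship between the two means described above can be checked with simple arithmetic: if the non-zero weights average 1.00 and the mean over all cases is about 0.35, then roughly 35% of cases carry a non-zero weight. The minimal Python sketch below illustrates this with a hypothetical toy weight vector (not BHPS data); the function name is an assumption made for the example.</p>

```python
from statistics import mean


def weight_summary(weights):
    """Return the mean over all cases, the mean over non-zero cases,
    and the share of cases with a non-zero weight."""
    nonzero = [w for w in weights if w > 0]
    return {
        "mean_all": mean(weights),
        "mean_nonzero": mean(nonzero),
        "share_nonzero": len(nonzero) / len(weights),
    }


# Hypothetical toy data: 20 cases, 13 with zero weight and 7 whose
# non-zero weights sum to 7.0 (so they average 1.0).
toy = [0.0] * 13 + [0.6, 0.8, 1.0, 1.0, 1.2, 1.2, 1.2]
summary = weight_summary(toy)
print(summary)
```

<p>In this toy case the mean over all cases equals the share of non-zero cases multiplied by the mean of the non-zero weights, which is the pattern reported for wave R.</p>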
<p>The documentation also states that “The longitudinal respondent weights (wLRWGHT) selects out cases who gave a full interview at all waves in the BHPS files.” And then “At each wave these cases are re-weighted to take account of previous wave respondents lost through refusal at the current wave or through some other form of sample attrition.”</p>
<p>The issue the weight values pose is that when analysing later waves, e.g. merging wave Q to wave R (and only merging those who have responded in both waves), and doing some longitudinal analysis, because the weight at wave R is close to 0.35, we only get roughly 35% of the sample as usable. This seemed strange to me and hence wanted to know why this was the case.</p>
<p>I mean I understand if one is engaging in longitudinal analysis from Wave A to Wave R, where you indeed might expect 35% of original respondents at Wave A, to still be on the survey at Wave R. However when doing two wave analysis and merging just those who have stayed on in both waves, I don’t see why my sample is still being restricted.</p>
<p>Hope my confusion is clear enough, but don’t hesitate to contact me for clarification.</p>
Support #820 (Closed): Longitudinal vs cross-sectional weight choice | https://iserredex.essex.ac.uk/support/issues/820 | 2017-07-25T09:01:19Z | Oliver Southwick (oliver.southwick@oliverwyman.com)
<p>I am performing a regression analysis, assessing which characteristics in one wave influence outcomes in the next.</p>
<p>Is the appropriate choice of weighting for this cross-sectional or longitudinal? The majority of variables come from a single wave, but the outcome in the next wave would suggest use of a longitudinal weight?</p>
<p>Thanks very much,</p>
<p>Olly</p>
Support #814 (Closed): Longitudinal weight in R with - wide datset | https://iserredex.essex.ac.uk/support/issues/814 | 2017-07-18T09:44:59Z | Eleonora Iob (eleonora.iob@icloud.com)
<p>Dear UKHLS User Support team,</p>
<p>I'm conducting a latent growth curve analysis using waves 2-6 in R. The dataset is in the wide format. I was wondering whether I can use the longitudinal weight provided in the last wave as my weighting variable or if any further manipulation is needed.</p>
<p>Kind regards, <br />Eleonora Iob</p>
Support #785 (Closed): Weighting for a complex sub-merged dataset | https://iserredex.essex.ac.uk/support/issues/785 | 2017-05-16T10:29:00Z | Emily Lowthian (lowthianem@Cardiff.ac.uk)
<p>Hi there,</p>
<p>I have read the user guide 1-6 section on weighting (pages 56 - 83); however, I am still slightly lost as to which weights I can apply to this dataset.</p>
<p>I hope to fit a fixed/random effects longitudinal model to understand how traumatic events, i.e. parental fighting and physical and verbal abuse towards children, reported by the parents (in both the hh and individual questionnaires) affect young people's physical and mental well-being (using the youth self-completion questionnaires); parents' answers will be sub-merged onto the young people's cases. After I have understood this relationship, I intend to examine how measures of affluence (material deprivation, socioeconomic group, income etc.) moderate it.</p>
<p>I'm slightly confused about which weighting to use, as I will be using hh, individual and youth data, and I will be using the data from Waves 1, 2, 4 and 6 (data available).</p>
<p>At the moment, following your naming convention w_xxxyyzz_aa, I would use 1_ythscus_lw; however, there are no longitudinal weights for youth data.</p>
<p>Please could you advise me on the appropriate decision to take on this matter?</p>
Support #758 (Closed): weights for pooled cross-sections over waves (a)-(f) | https://iserredex.essex.ac.uk/support/issues/758 | 2017-03-29T17:32:21Z | Nico Ochmann (nico.ochmann@postgrad.manchester.ac.uk)
<p>Hi there,</p>
<p>I am regressing hourly wage (constructed with w_paygu_dv) on a number of regressors in a pooled cross-section over all six waves. So far, I am using the whole sample based on GPS, EMBS, BHPS, IEMBS. I am not sure what kind of weights to use in this context, given that I want to use all four samples. f_indinui_xw is available for all four for wave 6, so do I just go ahead and use that one? <br />Any piece of advice would be terrific. <br />Thanks a lot!</p>
Support #752 (Closed): Weights | https://iserredex.essex.ac.uk/support/issues/752 | 2017-03-21T12:47:05Z | Nicola Spencer Godfrey (n.spencergodfrey@surrey.ac.uk)
<p>Hello,</p>
<p>I notice from a previous posting on the User Support site (<a class="issue tracker-3 status-5 priority-4 priority-default closed" title="Support: Cross-sectional weight for self-completion questionnaire in wave 6 (Closed)" href="https://iserredex.essex.ac.uk/support/issues/736">#736</a>), that the weight f_indscub_xw for the adult self-completion questionnaire excluding the IEMB, has now been calculated for Wave 6. Would it be at all possible to provide me with this weight please, for SPSS?</p>
<p>Thank you very much.<br />Nicola Spencer Godfrey</p>