Understanding Society User Support: Issues https://iserredex.essex.ac.uk/support/ 2023-12-14T15:03:18Z
Understanding Society User Support - Support #2012 (Resolved): longitudinal weight https://iserredex.essex.ac.uk/support/issues/2012 2023-12-14T15:03:18Z Margherita Agnoletto
<p>Dear Understanding Society Team,</p>
<p>I am currently examining the relationship between flexible work arrangements (FWA) and some employees' outcomes.</p>
<p>Given that questions about FWA are asked every two waves, I have chosen to conduct a longitudinal analysis (FE) using waves 2, 4, 6, 8, and 10. Some of my outcomes come from the self-completion questionnaire. <br />As I understand, it is recommended to use the appropriate longitudinal weight from the last wave in my analysis (i.e. i_indinus_lw). However, I observe a significant loss of observations. <br />Given that my panel is unbalanced, could I use the corresponding longitudinal weight from the last available wave for each individual? For instance, if an individual 'i' has information until wave 8, I propose imputing the appropriate longitudinal weight from wave 8. Similarly, if individual 'k' has information until wave 6, I suggest imputing the weight from wave 6.</p>
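<p>To make the proposal concrete, here is a minimal Python sketch of the bookkeeping it implies: for each person, take the longitudinal weight from the last wave in which they are observed. The toy records and the helper name <code>last_wave_weight</code> are hypothetical, and the sketch says nothing about whether mixing weights from different waves is statistically appropriate, which is the question being asked.</p>

```python
# Hypothetical toy data: (pidp, wave number, longitudinal weight at that wave).
# Wave numbers follow the post (2, 4, 6, 8, 10); the weight values are invented.
records = [
    (1, 2, 1.10), (1, 4, 1.20), (1, 6, 1.30), (1, 8, 1.45),
    (2, 2, 0.90), (2, 4, 0.95), (2, 6, 1.05),
    (3, 2, 1.00), (3, 4, 1.02), (3, 6, 1.08), (3, 8, 1.15), (3, 10, 1.25),
]


def last_wave_weight(rows):
    """Return {pidp: (last_wave, weight)} using each person's latest observed wave."""
    latest = {}
    for pidp, wave, weight in rows:
        if pidp not in latest or wave > latest[pidp][0]:
            latest[pidp] = (wave, weight)
    return latest


weights = last_wave_weight(records)
for pidp in sorted(weights):
    wave, weight = weights[pidp]
    print(f"pidp={pidp}: weight taken from wave {wave} = {weight}")
```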
<p>Thank you for your attention.</p>
<p>Kind regards</p> Understanding Society User Support - Support #1982 (Resolved): reference person weights https://iserredex.essex.ac.uk/support/issues/1982 2023-10-12T11:19:12Z Amelia Watts amelia.watts678@outlook.com
<p>Dear Olena/support team,</p>
<p>I'm selecting reference persons from households across waves to form a panel. Can the individual longitudinal weights for these respondents in the last wave be used as suboptimal weights in the analysis?</p>
<p>Many thanks, <br />Amelia</p> Understanding Society User Support - Support #1902 (Resolved): weights individual files waves 10 ... https://iserredex.essex.ac.uk/support/issues/1902 2023-05-15T13:20:37Z Aelen Valen
<p>Hi,</p>
<p>I am trying to merge individual files across waves 10 and 11 into wide format to create a 2019 calendar year dataset.<br />I used this method from "Box 1: Example syntax for pooled analysis for cross-sectional estimation relating <br />to calendar year 2011, with weight re-scaling" in <a class="external" href="https://www.understandingsociety.ac.uk/sites/default/files/downloads/documentation/user-guides/mainstage/weighting_faqs.pdf">https://www.understandingsociety.ac.uk/sites/default/files/downloads/documentation/user-guides/mainstage/weighting_faqs.pdf</a></p>
<p>ge wts=0 <br />replace wts=indpxui_xw if month>=13 & month<=24 <br />ge ind=1 <br />sum ind [aw=indpxui_xw] if month>=1 & month<=12 <br />gen jwtdtot=r(sum_w) <br />sum ind [aw=indpxui_xw] if month>=1 & month<=12 <br />gen kwtdtot=r(sum_w) <br />replace wts=indpxui_xw*(jwtdtot/kwtdtot) if month>=1 & month<=12</p>
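<p>As posted, the two <code>sum</code> commands use the same condition (months 1 to 12), so <code>jwtdtot/kwtdtot</code> equals 1 and the last line rescales nothing. The Python sketch below illustrates the rescaling arithmetic under the assumption that the intent is to scale the wave 11 (k) first-year weights so that they sum to the weighted total of the wave 10 (j) second-year sample. The toy rows and the helper <code>weighted_total</code> are hypothetical; Box 1 of the FAQ remains the reference for the exact conditions.</p>

```python
# Hypothetical toy rows: (wave letter, interview month within the wave, weight).
rows = [
    ("j", 13, 2.0), ("j", 18, 3.0), ("j", 24, 5.0),  # wave j, months 13-24
    ("k", 1, 1.0), ("k", 6, 1.5), ("k", 12, 2.5),    # wave k, months 1-12
]


def weighted_total(data, wave, first_month, last_month):
    """Sum of weights for one wave within a range of interview months."""
    return sum(w for wv, m, w in data if wv == wave and first_month <= m <= last_month)


# Assumption: the two totals come from different groups, unlike the snippet as posted.
jwtdtot = weighted_total(rows, "j", 13, 24)
kwtdtot = weighted_total(rows, "k", 1, 12)
scale = jwtdtot / kwtdtot

wts = []
for wave, month, w in rows:
    if wave == "j" and 13 <= month <= 24:
        wts.append(w)           # wave j weights are kept as they are
    elif wave == "k" and 1 <= month <= 12:
        wts.append(w * scale)   # wave k weights are rescaled to the wave j total
    else:
        wts.append(0.0)         # anything outside the calendar year gets weight 0

print("scale factor:", scale)
print("rescaled wave k total:", sum(wts[3:]))
```

<p>After rescaling, the two half-samples contribute the same weighted total, which is the purpose of the ratio in the last <code>replace</code> line.</p>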
<p>For the purpose of the research I am working on, I am using the equivalised household income and other variables referring to parental occupation, education and place of birth.</p>
<p>Since I am using it together with EUSILC 2019 for different EU countries, I compared the weights with those in EUSILC. While the sum of the weights in the latter equals on average 80% of the real population in each country, the sum of the weights in the dataset I created for the UK in 2019 (by merging waves 10 and 11) is far lower than the census 2019 UK population.</p>
<p>Could you please help me understand how those weights are constructed, which characteristics of the population they take into account, whether they are comparable to the ones in EUSILC, and whether the procedure I followed to merge the two waves is correct. <br />Many thanks in advance for the support!</p> Understanding Society User Support - Support #1859 (Resolved): sample size loss due to weighting https://iserredex.essex.ac.uk/support/issues/1859 2023-02-20T14:26:07Z Caroline Kienast von Einem
<p>Hi,</p>
<p>I am aware that weighting will affect and alter the sample size of the analysis. However, I am working with a pooled sample of participants from waves 3-6, and when I specify a weighted model my sample drops from ~45k to 27k. This seems quite significant, particularly once I start to investigate subgroup characteristics.</p>
<p>Would you be able to confirm whether a drop of ~20k is normal once weighting is applied (I am using the longitudinal wave f weight "f_indinub_lw"), or whether the Stata code below suggests that it is instead an error in my coding?</p>
<p>STATA CODE:</p>
<p>//Open wave 6:<br />use f_hidp f_psu f_strata pidp f_sex_dv f_age_dv f_indinub_lw using "$inpath\f_indresp", clear</p>
<p>save "test", replace</p>
<p>foreach w in c d e {</p>
<pre><code>// Extract the variables needed<br /> use "$inpath/`w'_indresp", clear<br /> isvar pidp `w'_addrmov_dv `w'_adcts `w'_distmov_dv `w'_mvyr `w'_mvever `w'_plnowy4 <br /> keep `r(varlist)'</code></pre>
<pre><code>// save each wave specific file<br /> save `w'junk.dta, replace<br />}</code></pre>
<p>// Open the file for wave f and then add the rest of the wave specific files<br />use "test", clear<br />foreach w in c d e {<br /> merge 1:1 pidp using `w'junk.dta<br /> drop _merge<br /> }</p>
<p>save "test", replace</p>
<p>// get rid of unwanted temporary files<br />foreach w in c d e {<br /> erase `w'junk.dta<br />}</p>
<p>mvdecode _all, mv(-9/-1)</p>
<p>//I only want those with data at wave 6 <br />drop if f_hidp==.</p>
<p>tabulate f_sex_dv // -> n= 45,186</p>
<p>svyset f_psu , strata(f_strata) singleunit(scaled)|| pidp, weight(f_indinub_lw)<br />svy: tabulate f_sex_dv, count col // -> n=27,094</p> Understanding Society User Support - Support #1696 (Resolved): random effects logistic regression... https://iserredex.essex.ac.uk/support/issues/1696 2022-05-09T12:57:51Z Zohra Ansari-Thomas
<p>Hello,</p>
<p>I am attempting to run a random effects logistic regression model using waves 1-10 of the UKHLS, and am running into some issues with how to take into account the longitudinal weighting, strata and psu, as well as clustering by PIDP or allowing for random intercepts by PIDP to account for the longitudinal design of the study. I am using Stata.</p>
<p>I can svy set my data to account for the longitudinal weights (indinus_lw), the psu, and the strata, but I am not sure how to account for the clustering by PIDP. I am using the svy: melogit command.</p>
<p>Any advice would be much appreciated, thank you!</p> Understanding Society User Support - Support #1100 (Resolved): weights and 'svy set' using wave 2... https://iserredex.essex.ac.uk/support/issues/1100 2018-11-26T10:33:04Z Per-Ola Sundin perola.sundin@regionorebrolan.se
<p>I am evaluating associations between kidney function (serum creatinine) determined in blood samples at the wave 2/3 nurse health assessment and all-cause mortality up to wave 7. I also use self-reported diagnoses from the main stage interview at wave 2/3 for adjustment in the models.</p>
<p>I have found what I believe is the correct weight to use (indbdub_xw from xlabblood_ns.dta).</p>
<p>I do not use data from the following waves, although I use the flag for deceased individuals provided (which is based on information from the following waves).</p>
<p>Besides wanting to be certain that I use the most appropriate weights, I am also a bit confused about which variables I should use when I 'svy set' my data. In xlabblood_ns.dta, where I find the weight variable, I do not find any 'psu' or 'strata' variables.</p>
<p>Yours sincerely<br />Per-Ola Sundin PhD-student</p> Understanding Society User Support - Support #886 (Closed): Zero weights and statistical power https://iserredex.essex.ac.uk/support/issues/886 2017-12-04T17:45:15Z Eric Emerson eric.emerson@lancaster.ac.uk
<p>Hi</p>
<p>I'm interested in data contained in the harassment modules (in Waves 1, 3, 5 and 7), but am concerned about the significant reduction in statistical power arising from the increasing proportion of respondents who are assigned values of 0 in w_ind5mus_xw. I understand from a previous thread (<a class="issue tracker-3 status-3 priority-5 priority-high2" title="Support: weights for pooled cross-sections over waves (a)-(f) (Resolved)" href="https://iserredex.essex.ac.uk/support/issues/877">#877</a>) that ..... 'The provision of weights requires the ability to estimate probabilities of continuing to respond over multiple waves. This is true of cross-sectional weights as well as longitudinal ones, as they are derived from the longitudinal ones (how this was done is described in section 3.8.3.10 of the User Guide). In consequence, a person in a household where there is no person who has been enumerated at every wave up to wave w will get a weight of zero. Such people should not be given a weight, as the weights for all other sample members are calculated in a way that compensates for these "missing" people.'</p>
<p>However, the 'compensation' appears to also result in a significant loss of statistical power. Taking as base the unweighted number of respondents who provide a valid answer to the 'attacked' items, the weighted population size has reduced from 92% of actual respondents in W1 (7418/8072) to just 27% in W7 (2711/9973). The resulting reduction in power is of concern and, given the rationale outlined above, will continue to increase over time as the % of households in which someone has been enumerated at every wave will continue to diminish. It also seems rather wasteful of people's time that the responses of the majority of participants are, through the weighting process, assigned to a statistical waste bin!</p>
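<p>The ratios quoted above can be reproduced directly, and the loss of precision can be summarised with Kish's approximate effective sample size, n_eff = (sum of w)^2 / (sum of w^2). The Python sketch below uses the counts from this post plus a small invented weight vector. The helpers <code>share_positive</code> and <code>kish_effective_n</code> are illustrative names, and Kish's formula is an added assumption here rather than something taken from the User Guide.</p>

```python
def share_positive(weights):
    """Proportion of cases with a strictly positive weight."""
    return sum(1 for w in weights if w > 0) / len(weights)


def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / total_sq if total_sq > 0 else 0.0


# Ratios quoted in the post: weighted size over unweighted respondents.
w1_ratio = 7418 / 8072
w7_ratio = 2711 / 9973
print(f"W1: {w1_ratio:.1%}, W7: {w7_ratio:.1%}")

# Invented weights: two zero-weight cases and four cases with unequal weights.
toy_weights = [0.0, 0.0, 0.5, 1.0, 1.5, 2.0]
print("share with positive weight:", share_positive(toy_weights))
print("Kish effective n:", kish_effective_n(toy_weights))
```

<p>Zero weights drop out of both sums, so they lower the effective sample size only through the unequal weights of the remaining cases.</p>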
<p>Be very grateful if you could suggest any ways round this problem.</p>
<p>Many thanks</p>
<p>Eric</p> Understanding Society User Support - Support #513 (Closed): e_indscus_lw https://iserredex.essex.ac.uk/support/issues/513 2016-02-26T12:35:08Z Orla McBride
<p>Hello,</p>
<p>I'm trying to use this longitudinal weight variable for analyzing data across waves 2-5 from the self-completion questionnaire. I'm trying to apply the weight variable in my analysis but given that ~21,000 cases have been assigned a value of 0, this means that the weight is viewed as missing in software such as Mplus.</p>
<p>Any recommendations for how to get around this? I was thinking that it might be possible to make the 0 values non-zero (e.g. 0.0000000000000001) and was wondering what your thoughts would be on this? Any other suggestions would be welcomed.</p>
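<p>To illustrate what the substitution would do to a point estimate, the Python sketch below compares a weighted mean computed with the zeros kept, with the zeros replaced by 1e-16, and with the zero-weight cases dropped. The data are invented and the helper <code>weighted_mean</code> is a hypothetical name. It shows only the arithmetic and makes no claim about standard errors or about how Mplus treats such cases.</p>

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(v * w) / sum(w)."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Invented data: two of the four cases carry a zero weight.
values = [10.0, 20.0, 30.0, 40.0]
weights = [0.0, 1.5, 0.0, 0.5]

eps = 1e-16  # the tiny replacement value suggested in the post
patched = [w if w > 0 else eps for w in weights]
kept = [(v, w) for v, w in zip(values, weights) if w > 0]

mean_original = weighted_mean(values, weights)
mean_patched = weighted_mean(values, patched)
mean_dropped = weighted_mean([v for v, _ in kept], [w for _, w in kept])

print("zeros kept:    ", mean_original)
print("zeros patched: ", mean_patched)
print("zeros dropped: ", mean_dropped)
```

<p>In this toy case the three estimates agree to within floating-point noise, because a zero or near-zero weight contributes almost nothing to either sum.</p>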
<p>Kind regards<br />Orla</p> Understanding Society User Support - Support #506 (Closed): weights and design variables query Wa... https://iserredex.essex.ac.uk/support/issues/506 2016-02-19T11:40:25Z Orla McBride
<p>Hello,</p>
<p>I was wondering if you could answer a question I have about weights and psu/strata variables for analysing Understanding Society data.</p>
<p>If I want to analyse data from the BHPS, GPS, and EMB samples from Waves 2-5, which come from both the self-completion questionnaire and the main survey, should I use:</p>
<p>e_indscus_lw: Longitudinal adult self-completion questionnaire weight<br />e_strata: Sampling strata<br />e_psu: Primary sampling unit</p>
<p>Many thanks for your time.</p>
<p>Kind regards<br />Orla</p> Understanding Society User Support - Support #498 (Closed): weight youth self-completion + adult https://iserredex.essex.ac.uk/support/issues/498 2016-02-04T10:02:59Z Carolina Zuccotti carolina.zuccotti@eui.eu
<p>Hello,<br />I would like to follow individuals (14-15 yrs) who completed the self-completion youth questionnaire into the adult questionnaire (16+). I am interested in the questions on parental involvement and how this affects their adult outcomes.<br />How should I weight this?<br />Let's say that I consider 14-15 yrs individuals in wave 1 and I follow them in wave 2 (and/or 3).<br />Many thanks,<br />Carolina</p> Understanding Society User Support - Support #456 (Closed): comparing across waves https://iserredex.essex.ac.uk/support/issues/456 2015-11-27T16:55:32Z Carolina Zuccotti carolina.zuccotti@eui.eu
<p>Hello,<br />I wanted to know if it is possible to compare the effect of a variable in wave 1 with its effect in wave 5.<br />For example, does education have a stronger effect on the probability of employment in 2009/2010 than in 2013/14?<br />To the naked eye, there seems to be a difference in the effect across waves. However, do you know if there might be a way to actually test this?<br />I would need to pool waves, I assume. In that case, how should I weight the cases?<br />Many thanks in advance.<br />Carolina</p> Understanding Society User Support - Support #448 (Closed): weights https://iserredex.essex.ac.uk/support/issues/448 2015-11-16T00:17:28Z Vernon Hedge vernonhedge@hotmail.co.uk
<p>I am looking at data exclusively at Wave C of Understanding Society (c_indresp.sav). I am planning to employ model-based inference, which may (as needs be) incorporate weight, strata and PSU into the model.</p>
<p>I am having difficulty finding out how the weights were computed. I was hoping to include the variables by which the weights were calculated within the model and specify PSU as level 2 random effects. I just cannot seem to find how the weights were calculated from the Understanding Society documentation.</p>
<p>All the variables are from the c_indresp file. The 12 are listed here as name, “label”, [position number in variable view of c_indresp.sav]</p>
<p>c_sex_cr “sex (corrected)” [2292],<br />c_age_cr “age (corrected)” [2294],<br />c_birthy “year of birth” [2771], <br />c_big5c_dv “Conscientiousness” [2896],<br />c_big5o_dv “Openness” [2899],<br />c_hiqual_dv “Highest qualification” [2904], <br />c_gwri_dv “Cognitive ability: Immediate word recall: Number of correct items” [2915], <br />c_cgvfc_dv “Cognitive ability: Verbal fluency: Count of correct answers” [2932],<br />c_cgna_dv “Cognitive ability: Numeric ability: Count of items answered correctly”[2935], <br />c_jbnssec8_dv “Current job: Eight Class NS-SEC” [2947],</p>
<p>I am also having difficulty identifying which weight variable would be most appropriate to my analysis according to the w_xxxyyzz_aa scheme (p67 of the User Manual).</p>
<p>I can fill in this much c_indyyzz_xw – i.e., I know I am dealing with wave c only (so c_ and xw) and only with adult (16+) respondents (so ind).</p>
<p>I have identified 4 weight variables relevant to a cross-sectional design in the c_indresp file,</p>
<p>1. c_indpxub_xw “combined cross-sectional adult main or proxy interview weight” [3002], <br />2. c_indinub_xw “combined cross-sectional adult main interview weight” [3003], <br />3. c_indscub_xw “combined cross-sectional adult self-completion interview weight” [3004], <br />4. c_ind5mus_xw “cross-sectional extra 5 minute interview person weight” [3005].</p>
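<p>The w_xxxyyzz_aa scheme can be applied mechanically to the four names above. The Python sketch below splits a weight name into its components with a regular expression. The pattern and the function name <code>parse_weight_name</code> are assumptions based only on the names quoted in this thread, and the yy labels are copied from the variable labels listed above, so treat it as an illustration rather than an official decoder.</p>

```python
import re

# Assumed layout: wave letter, 3-character xxx, 2-character yy, 2-character zz, then xw or lw.
WEIGHT_NAME = re.compile(
    r"^(?P<wave>[a-z])_(?P<xxx>[a-z0-9]{3})(?P<yy>[a-z0-9]{2})(?P<zz>[a-z0-9]{2})_(?P<aa>xw|lw)$"
)

# Labels taken from the variable labels quoted in the post.
YY_LABELS = {
    "px": "main or proxy interview",
    "in": "main interview",
    "sc": "self-completion interview",
    "5m": "extra 5 minute interview",
}


def parse_weight_name(name):
    """Split a weight variable name into wave, xxx, yy, zz and aa components."""
    match = WEIGHT_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised weight name: {name!r}")
    return match.groupdict()


for name in ["c_indpxub_xw", "c_indinub_xw", "c_indscub_xw", "c_ind5mus_xw"]:
    parts = parse_weight_name(name)
    print(name, parts, YY_LABELS.get(parts["yy"], "unknown"))
```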
<p>The yy component must be either px, in, sc, or 5m. I think I can exclude 5m, as none of the variables on my list is on the list on Table 25 (p56) of the User Manual. Likewise, viewing Table 24 (p53), I think sc can be excluded.</p>
<p>As for the zz component it is tempting to just use "us" (for “understanding society”?). The user guide advises me that the "us" designation refers to “GPS [General Population Sample] and EMB samples” – is this what is meant by “Mainstage”?</p>
<p>Looking at the “Levels of Analysis” in Table 28 (p62), I think I can exclude level 4 “Adult or youth self-completion”. I cannot, however, seem to find information on whether the c_indresp variables I am using are level 3 “Adult proxy and main interview” or level 2 “Adult main interview only (no proxy)”. Using the Understanding Society website to search each variable name, they all return “Mainstage Variable”. I cannot tell from this which level of 1 to 4 is the most appropriate to select a weighting variable.</p>
<p>So the two problems I have are 1) identifying which variables were used to calculate survey weights and 2) identifying which “xw” survey weight variable is most appropriate to my analysis.</p>
<p>I would be enormously grateful for any clarification.</p> Understanding Society User Support - Support #437 (Closed): sample design and community establish... https://iserredex.essex.ac.uk/support/issues/437 2015-10-28T10:36:42Z Phil Jones phil.jones@sheffield.ac.uk
<p>Hello,</p>
<p>I'm trying to establish if residents in communal accommodation are sampled in US. I've read the user guide and technical notes which refer to samples being taken of 'residential' addresses using the PAF. Do such residential addresses include residents in communal establishments like nursing homes, student halls of residence, etc.?</p>
<p>I notice from the technical notes that communal establishments/institutions are considered ineligible. In this case, what constitutes a communal establishment? Does this mean residents in nursing homes etc. are ineligible, or does such an establishment refer to non-residential establishments, e.g. guest houses, hotels, etc.?</p>
<p>Sorry if my question appears confused. I'm simply trying to find out exactly what is included in a 'residential' address so I can match US to appropriate census records, which include residents of nursing homes, student halls, prisons etc. I suspect it refers only to private households but just want to confirm.</p>
<p>Many thanks</p> Understanding Society User Support - Support #253 (Closed): general population sample https://iserredex.essex.ac.uk/support/issues/253 2014-03-28T16:09:03Z peter tammes
<p>Dear Sir/Madam,<br />We would like to use only the General Population Comparison sample. Which of the weight variables should we use in our analysis?<br />Thank you <br />Peter</p> Understanding Society User Support - Support #245 (Closed): cross sectional hh weights in US w1/2/3 https://iserredex.essex.ac.uk/support/issues/245 2014-02-26T13:37:33Z Ian Alcock ian.alcock01@btinternet.com
<p>I am confused by the differences in the cross-sectional household weights available in the US a_hhresp, b_hhresp and c_hhresp files. My understanding is this:</p>
<p>in a_hhresp is a_hhdenus_xw, which weights the households originating with Understanding Society (which comprise all households in this wave);<br />in b_hhresp are b_hhdenbh_xw, which weights the households originating with BHPS (and is set to 0 for households originating with Understanding Society), and b_hhdenus_xw, which weights the households originating with Understanding Society (and is set to 0 for households originating with BHPS);<br />in c_hhresp is c_hhdenub_xw, which weights all households together, i.e. weights across households originating with BHPS and US.</p>
<p>My questions:</p>
<p>1) Is my understanding correct?<br />2) If my understanding is correct, how do I weight all households in b_hhresp together (as I can do for households in c_hhresp), and how do I weight only the households originating in BHPS in c_hhresp (as I can do for households in b_hhresp)? I want to do both of these things: I want to produce weighted quintiles of income in the previous month for the BHPS-originating households (so that the weighting increases their UK representativeness) in both b_hhresp and c_hhresp, and I want to produce weighted quintiles of income in the previous month for all available households (so that the weighting increases their UK representativeness) in both b_hhresp and c_hhresp, but I appear to be able to do only the former in b_hhresp and only the latter in c_hhresp.<br />3) What accounts for the difference in the cross-sectional household weights available in b_ and c_?</p>
<p>Big thank you in advance!</p>