Understanding Society User Support: Issues https://iserredex.essex.ac.uk/support/ 2024-03-13T15:36:15Z
Redmine Understanding Society User Support - Support #2075 (Feedback): Using UKHLS to look at trends acro... https://iserredex.essex.ac.uk/support/issues/2075 2024-03-13T15:36:15Z James Laurence
<p>Hi there,</p>
<p>I am interested in looking at calendar month trends in whether someone wants to move home or not (which is available in every wave): lkmove. Ideally, I would like to look at trends using all waves (1-13). However, if it is easier to look at trends from some other start point, e.g. 2016 or 2017, then I am flexible. I am also flexible as to whether the BHPS sample is included or not. This will be cross-sectional analysis, so I hope to treat each calendar month as a cross-section (I won’t be doing any longitudinal analysis).</p>
<p>I have been reading the helpful notes on ‘Running analysis on a calendar year or month’ (<a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/how-to-use-weights-analysis-guidance-for-weights-psu-strata/">https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/how-to-use-weights-analysis-guidance-for-weights-psu-strata/</a>). However, I have some questions and was hoping to check whether what I have done so far looks right.</p>
<p>I have been using the w_month and wave variables to generate a new date variable of year-month. To capture calendar year, I have used the wave and w_month variables in the following manner:</p>
<p>gen year = 2009 if wave==1 & (month>0 & month<13)<br />replace year = 2010 if wave==1 & (month>12 & month<25)<br />replace year = 2010 if wave==2 & (month>0 & month<13)<br />replace year = 2011 if wave==2 & (month>12 & month<25)<br />replace year = 2011 if wave==3 & (month>0 & month<13)<br />…<br />replace year = 2021 if wave==13 & (month>0 & month<13)<br />replace year = 2022 if wave==13 & (month>12 & month<25)</p>
<p>To measure calendar month, I have recoded the w_month variable, combining the two monthly measures into one. The w_month variable tells us whether someone was sampled in January in the year 1 sample or January in the year 2 sample. I’ve now combined these into a single category of whether someone was sampled in January. For example, ‘jan yr1’ and ‘jan yr2’ are now just ‘jan’; ‘feb yr1’ and ‘feb yr2’ are now just ‘feb’, etc.</p>
<p>With these new calendar year and calendar month variables, I have now created a new measure of calendar year-month, which looks like this (I hope this is correct so far):</p>
<pre><code>2009 Jan = 1<br /> 2009 Feb = 2<br /> 2009 Mar = 3<br /> 2009 Apr = 4<br /> 2009 May = 5<br /> 2009 June = 6<br /> 2009 July = 7<br />…<br /> 2022 June = 162<br /> 2022 July = 163<br /> 2022 Aug = 164<br /> 2022 Sep = 165<br /> 2022 Oct = 166<br /> 2022 Nov = 167<br /> 2022 Dec = 168</code></pre>
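<p>The year and year-month coding described above can also be written arithmetically rather than wave by wave. This is only a sketch under the post's own assumptions: wave runs from 1 to 13 and month holds the 1-24 value of w_month. The names calmonth, yearmonth and ym_date are introduced here purely for illustration. The sketch takes the month variable at face value, exactly as the post does, and leaves any other value (including missing codes) unassigned.</p>
<pre><code>* Sketch only: wave (1-13) and month (1-24) are the variable names used in the post;
* calmonth, yearmonth and ym_date are hypothetical names.

* Calendar year: months 1-12 fall in year 2008 + wave, months 13-24 in the year after
gen year = 2008 + wave + inrange(month, 13, 24) if inrange(month, 1, 24)

* Calendar month 1-12, collapsing the "yr1" and "yr2" versions of each month
gen calmonth = mod(month - 1, 12) + 1 if inrange(month, 1, 24)

* Running index with 2009 Jan = 1
gen yearmonth = (year - 2009) * 12 + calmonth

* Optional: the same information as a Stata monthly date for labelled output
gen ym_date = ym(year, calmonth)
format ym_date %tm
</code></pre>
<p>Under this mapping, wave 1 with month 1 gives yearmonth 1 (2009 Jan) and wave 13 with month 24 gives yearmonth 168 (2022 Dec).</p>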
<p>I understand that whatever weight I choose to use, I need to correct it because Northern Ireland is only sampled in issue months 1-12 (and not 13-24). Therefore, I will apply the following adjustment to the weight (gen adj=1, replace adj=0.5 if w_country==4, gen weight=w_xxxyyus_lw*adj) as outlined in the online notes.</p>
<p>However, where I’ve become a little lost is which weights to use initially. The notes state that, due to exceptions in sample selection, ‘we recommend use of the us_lw weight in analysis’. Given my intention to look at calendar months up to wave 13, does this mean I should use the m_indpxus_lw weight? Is this the case even if I just want to look at the data cross-sectionally (treat every calendar month as a cross-sectional picture of lkmove)? It seems that using m_indpxus_lw substantially reduces the sample size (because these longitudinal weights require someone to have participated in every wave). Is it possible to use the cross-sectional weights for my aims, while excluding the BHPS and IEMB, as the online notes suggest one needs to do for this kind of calendar month analysis? Or do I need to use longitudinal weights for my intended analysis?</p>
<p>I was also just trying to get my head around the issue of scaling discussed in the online notes: ‘The weights provided are not designed directly for pooling data across waves as they are scaled to a mean value of 1.0 within each wave, and therefore produce different weighted sample sizes in each wave’, under the section ‘Pooling data from different waves for cross-sectional analysis.’ Firstly, I just wanted to confirm this applies to my case of doing monthly trends?</p>
<p>And secondly, if so, from what I can see, the syntax kindly provided is intended to produce an accurate weight to look at the variable jbstat for the calendar year 2011, using months 13-24 of wave 2 and 1-12 of wave 3. At the end, we get the weight variable weight2011, to use for weighting calendar year 2011. In my situation, I would like to do a longer running trend of values of lkmove by months. Would I need to create these weights for each calendar year I look at? So, for 2014, I would need to create a new cross-sectional weight using e_indpxub_xw and f_indpxub_xw (waves 5 and 6). For 2015, I would need to create a new cross-sectional weight using f_indpxub_xw and g_indpxub_xw (waves 6 and 7). For 2016, I would need to create a new cross-sectional weight using g_indpxub_xw and h_indpxub_xw (waves 7 and 8). And to follow this all the way to my last calendar year. Then, to look at monthly trends, treating the data as pooled cross-sectional, I would have my data in long-format and have a new weight variable made up of all these new calendar year weights I’ve created?</p>
<p>I was also wondering if it would be possible to include monthly lkmove data from the calendar year 2022 (using wave 13 of the UKHLS mainstage). As I understand things, previous calendar years (e.g., 2018) are composed of samples from two waves (waves 9 and 10 of the mainstage). However, the calendar year 2022 is only composed of the sample from wave 13. Is it still possible to look at calendar month trends in lkmove for 2022? If so, would I need to make other sample restrictions to the other calendar years, for example, drop the IEMB sample from the trends? And would I need to make other adjustments to the weights? Or is it not yet possible to look at monthly trends until wave 14 comes out? I think the online notes mention this: ‘The analysis sample is only representative when all 24 monthly samples are combined in equal measure.’ Does this point refer to my question?</p>
<p>I am also interested in potentially looking at quarterly trends (Jan-Mar, Apr-Jun, etc.), instead of monthly trends (using the x_quarter variable). To do so, can I take the same approach as above? So, create a new time variable which is years divided into quarters (e.g., 2013 Jan-Mar, 2013 Apr-Jun, 2013 July-Sep, 2013 Oct-Dec, 2014 Jan-Mar, 2014 Apr-June…2022 Jul-Sep, 2022 Oct-Dec). Do I need to do anything different with the weights?</p>
<p>I hope this all makes sense.</p>
<p>Thanks so much in advance.</p>
<p>James</p> Understanding Society User Support - Support #2074 (In Progress): Longitudinal weights https://iserredex.essex.ac.uk/support/issues/2074 2024-03-09T16:03:06Z Joe Mattock
<p>Hi,</p>
<p>I'm conducting an analysis specifically over waves 2, 3, 6 and 9 for Understanding Society, as relating to the voteintent variable which is only included in these waves. I would just like to ask about the weighting procedure for this case. I am examining how an independent variable (gentrification, as measured by an index) affects voting intention at the LSOA-level.</p>
<p>My understanding is that I need to take the longitudinal weight from the final wave used in my analysis and apply it to all respondents (i_indscub_lw - I believe). However, given that my dependent variable of interest is not observed in consecutive waves, I wanted to ask whether this principle applies in the same way.</p>
<p>I also wanted to ask how this weighting would be applied in practice. I am slightly confused about the order of things. For example, would you remove all wave-specific prefixes, merge LSOA indicators with the Understanding Society data, and then apply the relevant weight for each respondent?</p>
<p>Much appreciated,</p>
<p>Joe</p> Understanding Society User Support - Support #2042 (Feedback): Survey Weights for Multi-Wave Pool... https://iserredex.essex.ac.uk/support/issues/2042 2024-01-29T09:47:05Z Lisa Waddell
<p>Hello,</p>
<p>I have constructed an unconventional sample by pooling tab-delimited data files for SN 6614-Understanding Society: Waves 1-13, 2009-2022 and Harmonised BHPS: Waves 1-18, 1991-2009. I request your advice regarding these weights.</p>
<p>Sample Construction: Using the family matrix, I identify everyone in the sample with both a mother and a father pidp identified. Using all waves of data, I keep participants whose mother and father both responded when the participant was aged 10 or younger. I then filter by participants who responded at the age of 21 or older. These two filtering functions leave me with a sample of around 2000 people aged 21-41, from BHPS and USoc samples. Due to my pooling of BHPS and USoc samples, when I follow the steps for constructing a tailored sample weight, I lose a substantial portion of my sample. For example, if I choose a base weight from Wave 1 of USoc, I lose the entire BHPS sample. If I choose a base weight from Wave 2 of USoc, I lose a substantial portion of the USoc sample.</p>
<p>Given how I construct my sample, do you have any advice on how I should be applying survey weights?</p>
<p>All the best,<br />Lisa</p> Understanding Society User Support - Support #2036 (Feedback): Understanding Society - weights https://iserredex.essex.ac.uk/support/issues/2036 2024-01-22T12:50:29Z Valentina Di Iasio
<p>Good morning,</p>
<p>After reading the user guide and watching the short YouTube video, I am still confused about which weights I should select for my pooled cross-sectional analysis using Understanding Society.</p>
<p>I am using waves 6 and 9 for a pooled cross-section analysis. I would therefore be inclined to use the cross-sectional weights. However, the user guide says that cross-sectional weights should only be used when the analysis includes one wave only. I also read the paragraph on re-scaling the weights to use more waves to conduct cross-sectional analysis. However, I am not sure whether the described procedure would apply to my case since I don't have a year overlapping over the two waves (wave 6 goes from January 2014 to May 2016 while wave 9 goes from January 2017 to May 2019). Therefore I am not sure whether I should simply use cross-sectional weights, re-scale the cross-sectional weights somehow (maybe for the first 6 months of 2016 and 2019 only?), or exclude the first 6 months of the years 2016 and 2019. Or, if I am missing something and I should use longitudinal weights (in that case, since I am doing a pooled cross-section analysis, how should I deal with 0 weights?)</p>
<p>Thank you in advance</p>
<p>Valentina Di Iasio</p> Understanding Society User Support - Support #2031 (Feedback): Cross-Sectional Weighting Questions. https://iserredex.essex.ac.uk/support/issues/2031 2024-01-18T11:20:26Z Ifraz Hussain
<p>Hi, I'm currently working on a cross-sectional study across waves to examine the proportion of children who live in couple-parent families where one parent reports any form of relationship distress.</p>
<p>I have three questions relating to weighting:</p>
<ul>
<li>From this analysis, I've seen changes to weighting across all previous waves and I would like to know what specifically led to the revisions?</li>
<li>Since I'm looking at participants across waves, I'm also interested in whether there is any attempt to mitigate attrition bias (e.g. changes to weighting)?</li>
<li>Given that I'm working with w_psnenui_xw weights for my study, do you think this weighting is appropriate for examining this area of the USOC Survey data?</li>
</ul> Understanding Society User Support - Support #1890 (In Progress): Extracting PSU and Individual-L... https://iserredex.essex.ac.uk/support/issues/1890 2023-04-11T16:35:50Z Laurence Rowley-Abel
<p>Dear Understanding Society team,<br />I am running a multilevel model using individuals nested within census areas (LSOAs) in Waves 9, 10, 11 and 12. To account for clustering I am using the following levels in my multilevel model: individuals at the first level, PSUs at the second level and LSOAs at the third level. Therefore, from the provided weights, I need to extract separate weights for individuals and for PSUs. Having read your response here [[<a class="external" href="https://iserredex.essex.ac.uk/support/issues/1572">https://iserredex.essex.ac.uk/support/issues/1572</a>]], I am wondering if the below would be the correct approach:</p>
<p>- For the individual level, I would divide l_psnenus_xw by a_psnenus_xd (from the l_indall.dta and the a_indall.dta files respectively)<br />- For the PSU level, I would use a_psnenus_xd (from the a_indall.dta file)<br />- For the LSOA level, I would not be able to calculate a weight as it is not part of the sampling design. I would set this weight to 1 for all respondents.</p>
<p>Would this be correct? Additionally, would this mean I could only include respondents who were included at Wave 1, since I need to use the design weight (a_psnenus_xd) from Wave 1?</p>
<p>Many thanks for your help.</p>
<p>Best wishes,<br />Laurence</p> Understanding Society User Support - Support #1865 (Resolved): Changes to USOC wave data download... https://iserredex.essex.ac.uk/support/issues/1865 2023-02-23T16:42:31Z William Shufflebottom
<p>Hi,</p>
<p>QUESTIONS</p>
<p>Q1: The indscub_xw weight from wave 6 of USOC is present in our historical download of the wave 6 data but appears to be missing in the version of wave 6 we downloaded from the UK Data Service a few months ago, and is also not listed as being in wave 6 on the USOC variable search page. Can you confirm why only the indscui_xw weight is in the latest Wave 6 version, confirm it was in the original release, and if/when (and if so why) it was removed?</p>
<p>Q2: Our estimates run on the latest download of waves 1 to 12 of USOC are producing different numbers from the estimates we ran at the time of the previous waves' releases. Has there been a change to the data or weights (beyond wave 6 having a different weight), or to how the weights work, that could explain the difference we are seeing for all waves (bar wave 1 and wave 12) in a recent download of the data from all the waves? We are using the same weight (bar wave 6) and the same variable (sclfsat_7 in this case - but we use a range of USOC variables in our analysis).</p>
<p>BACKGROUND</p>
<p>We are producing estimates for the OECD and just discovered some differences for the estimates and CIs for the sclfsat7 variable when we re-ran historical estimates for all USOC waves 1 to 12. We run breakdowns for this variable (and others) by various domains when we update our publications and a new USOC wave has been released so we have the estimates from previous runs made at the time of USOC wave data release. We only ran the sclfsat7 variable again recently so there may be other changes.</p>
<p>We have a document for the weights to use for each variable which states that the indscub_xw weight is the correct weight to use for the sclfsat_7 variable in wave 6 but we noticed it was "missing" in the wave 6 data we downloaded around November from UK Data service (instead indscui_xw is present). As we are getting differences in our estimates and CIs for all waves (bar wave 1 and 12), this has prompted us to check with you if there have been changes made to the versions of the USOC main study wave data currently on the UK Data Service compared to what would have been available at the time each wave's data was released which could explain the differences we are seeing.</p>
<p>Your help is greatly appreciated as this has the potential to impact a lot of our publications and the current ad hoc we are working on.</p> Understanding Society User Support - Support #1827 (Resolved): Correct weights to use https://iserredex.essex.ac.uk/support/issues/1827 2022-12-07T09:40:16Z Amelia Watts amelia.watts678@outlook.com
<p>Dear Olena,</p>
<p>I have two questions regarding using weights. I’m trying to conduct a cross-sectional analysis using data from a UKHLS wave.</p>
<p>1) If I select a sub-sample using respondents interviewed in certain years/months within a wave, can I still use the existing cross-sectional weights, or will I need to make adjustments to the cross-sectional weights?</p>
<p>2) All the dependent and independent variables are from one wave (eg wave 5), apart from one independent variable which was measured at an earlier wave (eg wave 2). I will match respondents from wave 5 and wave 2 to obtain the values of this independent variable. In this case, can I still use the cross-sectional weights in wave 5, or should I use the longitudinal weights in wave 5?</p>
<p>Thank you for your help.</p> Understanding Society User Support - Support #1779 (Resolved): Calculating persistent poverty https://iserredex.essex.ac.uk/support/issues/1779 2022-10-08T19:09:52Z Facundo Herrera
<p>Dear Team,<br />I need to estimate persistent poverty rates following the standard definition that someone is persistently poor if she/he has been poor in the current year and in two of the last three years. The first question I have is regarding attrition: if the poor were more prone to attrition, are the longitudinal weights able to account for that? And my second question is about balanced panels: if I keep only those with income data in the last 4 waves, would I be affecting the representativeness of the sample? <br />Thanks a lot for your support,</p>
<p>Facundo</p> Understanding Society User Support - Support #1777 (Resolved): Creating Longitudinal Weights for ... https://iserredex.essex.ac.uk/support/issues/1777 2022-10-06T16:31:47Z JoAnn Tan
<p>I have a question similar to <a class="issue tracker-3 status-3 priority-4 priority-default" title="Support: Weight for unbalanced UKHLS panel data (Resolved)" href="https://iserredex.essex.ac.uk/support/issues/1257">#1257</a>.</p>
<p>In <a class="issue tracker-3 status-3 priority-4 priority-default" title="Support: Weight for unbalanced UKHLS panel data (Resolved)" href="https://iserredex.essex.ac.uk/support/issues/1257">#1257</a>, Alita mentioned that we can create longitudinal weights for an unbalanced panel. How exactly can I do that? I am quite certain that my analysis (exploring the probability of being in temporary employment) is nothing complicated and hence does not require creating my own weights. However, I really want to run a longitudinal analysis with an UNBALANCED panel. Please help, thanks! (P/S: I have read all previous posts on creating weights for an unbalanced panel but I am still not sure how creating longitudinal weights for an unbalanced panel could be done.)</p> Understanding Society User Support - Support #1743 (Resolved): Averaging regional data to obtain ... https://iserredex.essex.ac.uk/support/issues/1743 2022-08-05T10:14:30Z Carolin Schmidt cs2100@cam.ac.uk
<p>Hi there,</p>
<p>I am using wave 6 to study household heads' homeownership probabilities. I am looking at native Brits and immigrants (I came up with an immigrant dummy for every household head).</p>
<p>I would now like to generate a control variable for each of my household heads: the variable should reflect the proportion of immigrants in the UK region where the person resides (that is, every household head in e.g. London will have the same immigrant share attached, etc.). I am wondering how I should calculate that average: does it have to be weighted (i.e. egen immishare = wtmean(immigrant), weight(indscui_xw) by(region) using the gwtmean package which calculates weighted statistics)? I would think so, because without weighting it, I would have an average immigrant share based on the (not-per-se representative) raw data. However, if I calculate a weighted mean, then I would effectively double-weight the data because the regression itself would be weighted too, no?</p>
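<p>For reference, the weighted regional share described above can also be computed with built-in egen functions instead of a user-written package. This is only a sketch: immigrant (coded 0/1), region and indscui_xw are the names used in the post, while w_valid, wsum_imm and wsum_all are hypothetical helper variables. It shows the calculation only and does not settle whether a weighted or unweighted share is the right control.</p>
<pre><code>* Sketch only: immigrant (0/1), region and indscui_xw are the names used in the post;
* w_valid, wsum_imm and wsum_all are hypothetical helper variables.

* Keep the weight only where both the weight and the dummy are observed
gen double w_valid = indscui_xw if !missing(immigrant, indscui_xw)

* Weighted count of immigrants and total weight within each region
* (egen total() treats missing values as zero)
bysort region: egen double wsum_imm = total(w_valid * immigrant)
bysort region: egen double wsum_all = total(w_valid)

* Weighted share of immigrants, constant within region
gen double immishare = wsum_imm / wsum_all
</code></pre>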
<p>I am unsure how to proceed and would appreciate any help.</p>
<p>Best wishes,<br />Carolin</p> Understanding Society User Support - Support #1726 (Resolved): BHPS and Understanding Society - w... https://iserredex.essex.ac.uk/support/issues/1726 2022-07-13T16:58:28Z Maria Petrillo
<p>Hi,<br />I am using both the BHPS (wave 1-18) and Understanding Society (wave 1-11) to conduct a descriptive analysis of episodes of caring over time. I would like to know which weights I should use for both a cross-sectional analysis and a longitudinal one. For a cross-sectional analysis, it seems to me that I can use xrwtuk1 for waves BH12 to BH18 and indinub_xw from wave 2 to 11. But what about all the other waves? Could you please let me know what the best approach is?</p> Understanding Society User Support - Support #1632 (Resolved): Correct weighting for mental health https://iserredex.essex.ac.uk/support/issues/1632 2022-01-17T14:23:19Z Joe Lillis
<p>Hello all,</p>
<p>Thanks for taking the time to read my post.</p>
<p>I've recently carried out an analysis on adolescent mental health, across three waves.</p>
<p>Study design as follows: Information on 16-21 year olds at Wave 6 (n=1,748) from indresp.dta, and again at Wave 9.</p>
<p>Covariates on previous mental health and bullying were included from waves 1, 3 and 5 (n=1,073, or 59% of original sample) of the youth survey.</p>
<p>The outcome measure was GHQ-12 scores at waves 6 and 9. I have included the PDF of the paper as is for further information.</p>
<p>My question is, what weight to use? Do USoc weights account for attrition by mental health (GHQ-12)?</p>
<p>Happy to give further detail if needed!</p>
<p>All the best,<br />Joe</p> Understanding Society User Support - Support #1622 (Resolved): Creating my own longitudinal weight https://iserredex.essex.ac.uk/support/issues/1622 2021-12-14T16:38:36Z Kate Dotsikas
<p>I am running a linear regression using participants who have responded in waves 9 and 10. I understand that using the wave 10 longitudinal weight will drop individuals from my analysis who haven't responded to all of the preceding waves. From the weighting FAQ, I understand I can adjust a cross-sectional weight myself to account for the non-response in my analysis. However I'm wondering how to go about this - which cross-sectional weight do I take as a base, and from what population do I derive the weight? As I am dropping anyone not responding with a full interview in wave 9 and 10, how can I estimate the probability of non-response between these waves as everyone in my sample has responded to both? Thank you in advance for your help.</p> Understanding Society User Support - Support #848 (Closed): Clinical Depression H_COND variables https://iserredex.essex.ac.uk/support/issues/848 2017-09-04T15:55:51Z Luca Bernardi luca.bernardi@uab.cat
<p>Dear Support group,</p>
<p>I am measuring clinical depression and I would kindly need your advice on a couple of questions. I apologise sincerely for putting immediate priority on this, but your answer might also have implications for a paper I am co-authoring within the Understanding Society EU Referendum project and we have a deadline shortly for submitting the paper.</p>
<p>As I am interested in objective depression, I was using the questions H_COND17 and H_CONDS17 to create a measure of depression. What I was doing is to assign value 1 to respondents who replied that they still have depression in H_CONDS17=Yes (as I am interested in the effects of depression, I do not care much if the person was diagnosed with depression at some point in his/her life - i.e. H_COND17=Yes - but rather it is important that the person is depressed at the time of the interview). I assign value 0 if the respondent mentioned that he/she has never been diagnosed with depression in H_COND17=No.</p>
<p>So far I was using data from waves 1, and 3 to 6 as I noticed that these two variables are available in all waves but wave 2 (<a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/2/datafile/b_indresp">https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/2/datafile/b_indresp</a>), where instead a slightly different question is asked: H_CONDN17. In turn, this question is not available in all waves and sometimes is asked together with the previous two questions (e.g., <a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/4/datafile/d_indresp">https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/4/datafile/d_indresp</a>).</p>
<p>My questions are thus the following. Do you know what the reason for this variation is and, more importantly, can I "maximise" my number of depressives by creating a measure of depression that combines both sets of questions (i.e., H_COND17 and H_CONDS17, and H_CONDN17) and makes use of all available waves (i.e. 1 to 6)?</p>
<p>My idea was to do the following:</p>
<p>gen depression = .</p>
<p>replace depression = 1 if hconds17==1 | hcondn17==1</p>
<p>replace depression = 0 if hcond17==0 | hcondn17==0</p>
<p>However, I wonder how problematic it is to mix questions that are not available in all waves, as this is certainly a point that reviewers will raise. I would really appreciate your thoughts on this.</p>
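<p>One way to make the mixing visible is to record which question set supplied each value and tabulate it by wave. This is only a sketch under the post's own coding (1 = yes, 0 = no): dep_source and wave are hypothetical names, with wave assumed to be a wave identifier in a long-format file. Assigning 0 first and 1 second is a choice made here so that a report of current depression is never overwritten; it is not taken from the documentation.</p>
<pre><code>* Sketch only: hcond17, hconds17 and hcondn17 follow the post;
* dep_source and wave are hypothetical names (wave = wave identifier in long format).

gen byte depression = .
* Never diagnosed, from either question set
replace depression = 0 if hcond17 == 0 | hcondn17 == 0
* Currently depressed or newly diagnosed; assigned last so it takes precedence
replace depression = 1 if hconds17 == 1 | hcondn17 == 1

* Flag the question set behind each value: 1 = hcond17/hconds17, 2 = hcondn17 only
gen byte dep_source = cond(hconds17 == 1 | hcond17 == 0, 1, cond(inlist(hcondn17, 0, 1), 2, .))

* Show how the source of the measure varies across waves
tabulate wave dep_source, missing
</code></pre>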
<p>Many thanks and best wishes,<br />Luca</p>