Understanding Society User Support: Issues https://iserredex.essex.ac.uk/support/ 2024-03-13T15:36:15Z
Understanding Society User Support - Support #2075 (Feedback): Using UKHLS to look at trends acro... https://iserredex.essex.ac.uk/support/issues/2075 2024-03-13T15:36:15Z James Laurence
<p>Hi there,</p>
<p>I am interested in looking at calendar month trends in whether someone wants to move home or not (lkmove, which is available in every wave). Ideally, I would like to look at trends using all waves (1-13). However, if it is easier to look at trends from some other start point, e.g. 2016 or 2017, then I am flexible. I am also flexible as to whether the BHPS sample is included or not. This will be cross-sectional analysis, so I hope to treat each calendar month as a cross-section (I won’t be doing any longitudinal analysis).</p>
<p>I have been reading the helpful notes on ‘Running analysis on a calendar year or month’ (<a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/how-to-use-weights-analysis-guidance-for-weights-psu-strata/">https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/how-to-use-weights-analysis-guidance-for-weights-psu-strata/</a>). However, I have some questions and would like to check whether what I have done so far looks right.</p>
<p>I have been using the w_month and wave variables to generate a new date variable of year-month. To capture calendar year, I have used the wave and w_month variables in the following manner:</p>
<p>gen year = 2009 if wave==1 & (month>0 & month<13)<br />replace year = 2010 if wave==1 & (month>12 & month<25)<br />replace year = 2010 if wave==2 & (month>0 & month<13)<br />replace year = 2011 if wave==2 & (month>12 & month<25)<br />replace year = 2011 if wave==3 & (month>0 & month<13)<br />…<br />replace year = 2021 if wave==13 & (month>0 & month<13)<br />replace year = 2022 if wave==13 & (month>12 & month<25)</p>
<p>To measure calendar month, I have recoded the w_month variable, combining the two monthly measures into one. The w_month variable tells us whether someone was sampled in January in the year 1 sample or January in the year 2 sample. I’ve now combined these into a single category of whether someone was sampled in January. For example, ‘jan yr1’ and ‘jan yr2’ are now just ‘jan’; ‘feb yr1’ and ‘feb yr2’ are now just ‘feb’, etc.</p>
<p>With these new calendar year and calendar month variables, I have now created a new measure of calendar year-month, which looks like this (I hope this is correct so far):</p>
<pre><code>2009 Jan = 1<br /> 2009 Feb = 2<br /> 2009 Mar = 3<br /> 2009 Apr = 4<br /> 2009 May = 5<br /> 2009 June = 6<br /> 2009 July = 7<br />…<br /> 2022 June = 162<br /> 2022 July = 163<br /> 2022 Aug = 164<br /> 2022 Sep = 165<br /> 2022 Oct = 166<br /> 2022 Nov = 167<br /> 2022 Dec = 168</code></pre>
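<p>The recoding described above can be sketched as follows. This is a minimal illustration in Python rather than Stata, and it assumes that the sample month runs from 1 to 24 within each wave and that wave 1 starts in 2009, as in the post. The function names are hypothetical and are not taken from the Understanding Society documentation.</p>

```python
def calendar_year(wave, sample_month):
    """Calendar year for a wave (1-13) and a 24-month sample month (1-24)."""
    if not 1 <= sample_month <= 24:
        raise ValueError("sample_month must be between 1 and 24")
    # Sample months 1-12 fall in the wave's first year, 13-24 in its second.
    return 2008 + wave + (1 if sample_month > 12 else 0)


def calendar_month(sample_month):
    """Collapse 'jan yr1' (1) and 'jan yr2' (13) into calendar month 1, and so on."""
    return (sample_month - 1) % 12 + 1


def year_month_index(wave, sample_month):
    """Running year-month index with 2009 Jan = 1 and 2022 Dec = 168."""
    year = calendar_year(wave, sample_month)
    return (year - 2009) * 12 + calendar_month(sample_month)
```

<p>For example, wave 13 with sample month 18 maps to June 2022, which is index 162 in the list above.</p>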
<p>I understand that, whichever weight I choose, I need to correct it because Northern Ireland is only sampled in issue months 1-12 (and not 13-24). Therefore, I will apply the following adjustment to the weight (gen adj=1, replace adj=0.5 if w_country==4, gen weight=w_xxxyyus_lw*adj) as outlined in the online notes.</p>
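<p>The adjustment quoted above amounts to halving the weight for Northern Ireland cases and leaving all others unchanged. A minimal Python sketch of that arithmetic follows; it assumes the country code 4 identifies Northern Ireland, as in the Stata fragment in the post, and the function name is hypothetical.</p>

```python
NORTHERN_IRELAND = 4  # assumption: same coding as w_country==4 in the post


def adjusted_weight(weight, country):
    """Multiply the chosen weight by 0.5 for Northern Ireland, by 1 otherwise."""
    adj = 0.5 if country == NORTHERN_IRELAND else 1.0
    return weight * adj
```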
<p>However, where I’ve become a little lost is what weights to initially use. In the notes, it states due to exceptions in sample selection ‘we recommend use of the us_lw weight in analysis’. Given my intention to look at calendar months up to wave 13, does this mean I should use the m_indpxus_lw weight? Is this the case, even if I just want to look at the data cross-sectionally (treat every calendar month as a cross-sectional picture of lkmove)? Because it seems that if I use m_indpxus_lw then it substantially reduces the sample size (due to these longitudinal weights requiring someone to have participated in every wave). Is it possible to use the cross-sectional weights for my aims, while excluding the BHPS and IEMB, as is suggested that one needs to do for this kind of calendar month analysis in the online notes? Or, do I need to use longitudinal weights for my intended analysis?</p>
<p>I was also just trying to get my head around the issue of scaling discussed in the online notes: ‘The weights provided are not designed directly for pooling data across waves as they are scaled to a mean value of 1.0 within each wave, and therefore produce different weighted sample sizes in each wave’, under the section ‘Pooling data from different waves for cross-sectional analysis.’ Firstly, I just wanted to confirm this applies to my case of doing monthly trends?</p>
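<p>The quoted point about scaling can be illustrated with a little arithmetic: if weights are scaled to a mean of 1.0 within a wave, their sum equals the number of respondents in that wave, so waves with different numbers of respondents have different weighted sample sizes. The sketch below uses made-up weights purely for illustration and is not the rescaling syntax from the online notes.</p>

```python
def scale_to_mean_one(weights):
    """Rescale a list of weights so that their mean is 1.0."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


# Made-up weights for two hypothetical waves of different size.
wave_a = scale_to_mean_one([0.5, 1.0, 1.5, 3.0])  # 4 respondents
wave_b = scale_to_mean_one([0.8, 1.2])            # 2 respondents

# The weighted sample size of each wave is the sum of its weights,
# which equals the number of respondents once the mean is 1.0.
weighted_n_a = sum(wave_a)
weighted_n_b = sum(wave_b)
```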
<p>And secondly, if so, from what I can see, the syntax kindly provided is intended to produce an accurate weight to look at the variable jbstat for the calendar year 2011, using months 13-24 of wave 2 and 1-12 of wave 3. At the end, we get the weight variable weight2011, to use for weighting calendar year 2011. In my situation, I would like to do a longer running trend of values of lkmove by months. Would I need to create these weights for each calendar year I look at? So, for 2014, I would need to create a new cross-sectional weight using e_indpxub_xw and f_indpxub_xw (waves 5 and 6). For 2015, I would need to create a new cross-sectional weight using f_indpxub_xw and g_indpxub_xw (waves 6 and 7). For 2016, I would need to create a new cross-sectional weight using g_indpxub_xw and h_indpxub_xw (waves 7 and 8). And to follow this all the way to my last calendar year. Then, to look at monthly trends, treating the data as pooled cross-sectional, I would have my data in long-format and have a new weight variable made up of all these new calendar year weights I’ve created?</p>
<p>I was also wondering if it would be possible to include monthly lkmove data from the calendar year 2022 (using wave 13 of the UKHLS mainstage). As I understand things, previous calendar years (e.g., 2018) are composed of samples from two waves (waves 9 and 10 of the mainstage). However, the calendar year 2022 is only composed of the sample from wave 13. Is it still possible to look at calendar month trends in lkmove for 2022? If so, would I need to make other sample restrictions to the other calendar years, for example, drop the IEMB sample from the trends? And would I need to make other adjustments to the weights? Or is it not yet possible to look at monthly trends for 2022 until wave 14 comes out? I think this is mentioned in the online notes: ‘The analysis sample is only representative when all 24 monthly samples are combined in equal measure.’ Does this point refer to my question?</p>
<p>I am also interested in potentially looking at quarterly trends (Jan-Mar, Apr-Jun, etc.), instead of monthly trends (using the x_quarter variable). To do so, can I take the same approach as above? So, create a new time variable which is years divided into quarters (e.g., 2013 Jan-Mar, 2013 Apr-Jun, 2013 July-Sep, 2013 Oct-Dec, 2014 Jan-Mar, 2014 Apr-June…2022 Jul-Sep, 2022 Oct-Dec). Do I need to do anything different with the weights?</p>
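<p>The quarterly time variable described above could be built along these lines. This is a sketch in Python that assumes a calendar year and calendar month (1-12) have already been derived and that the series starts in 2013, as in the example in the post. The function names and the label format are hypothetical.</p>

```python
QUARTER_LABELS = {1: "Jan-Mar", 2: "Apr-Jun", 3: "Jul-Sep", 4: "Oct-Dec"}


def quarter_of(calendar_month):
    """Map calendar month 1-12 to quarter 1-4."""
    return (calendar_month - 1) // 3 + 1


def year_quarter_index(year, calendar_month, start_year=2013):
    """Running quarter index with start_year Jan-Mar = 1."""
    return (year - start_year) * 4 + quarter_of(calendar_month)


def year_quarter_label(year, calendar_month):
    """Readable label such as '2013 Jan-Mar'."""
    return f"{year} {QUARTER_LABELS[quarter_of(calendar_month)]}"
```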
<p>I hope this all makes sense.</p>
<p>Thanks so much in advance.</p>
<p>James</p>
Understanding Society User Support - Support #2074 (In Progress): Longitudinal weights https://iserredex.essex.ac.uk/support/issues/2074 2024-03-09T16:03:06Z Joe Mattock
<p>Hi,</p>
<p>I'm conducting an analysis specifically over waves 2, 3, 6 and 9 for Understanding Society, as relating to the voteintent variable which is only included in these waves. I would just like to ask about the weighting procedure for this case. I am examining how an independent variable (gentrification, as measured by an index) affects voting intention at the LSOA-level.</p>
<p>My understanding is that I need to take the longitudinal weight from the final wave used in my analysis and apply it to all respondents (i_indscub_lw - I believe). However, given that my dependent variable of interest is not observed in consecutive waves, I wanted to ask whether this principle applies in the same way.</p>
<p>I also wanted to ask how this weighting would be applied in practice. I am slightly confused about the order of things. For example, would you remove all wave-specific prefixes, merge LSOA indicators with the Understanding Society data, and then apply the relevant weight for each respondent?</p>
<p>Much appreciated,</p>
<p>Joe</p>
Understanding Society User Support - Support #2060 (Resolved): Design weights taken account of in... https://iserredex.essex.ac.uk/support/issues/2060 2024-02-27T13:21:37Z Rosie Cornish
<p>I think the answer to this is yes, but can you confirm that the household enumeration weights (e.g. a_hhdenus_xw) take account of the design weights - i.e. they are the product of the design weight and a household response weight?</p>
Understanding Society User Support - Support #2058 (Resolved): Using longitudinal weights when co... https://iserredex.essex.ac.uk/support/issues/2058 2024-02-22T16:48:24Z James Laurence
<p>Hi there,</p>
<p>I was just hoping to get some more advice regarding correctly weighting my analysis combining the mainstage and Covid-19 waves of the UKHLS. You kindly helped with a previous weighting issue I had for treating the data as repeated cross-sections. However, I am also hoping to conduct some fixed effects panel data analysis of the combined mainstage and Covid-19 waves (web survey only).</p>
<p>As a basic set-up, I am combining wave 9 of the UKHLS mainstage survey (the last mainstage survey that doesn’t cover the pandemic) with waves 1 to 9 of the COVID-19 survey. The data are in long format. As I would like to do some fixed effects longitudinal analysis, I believe I need to use the longitudinal weights. From my reading, I need to choose the longitudinal weight from the last wave of the survey I will be using – in this case wave 9 of the Covid-19 survey: ci_betaindin_lw</p>
<p>Applying this weight [ci_betaindin_lw] will give me a balanced panel, restricting the sample to everyone who participated in all 9 waves of the Covid-19 survey. However, I would also like to analyse wave 9 of the mainstage survey as part of a longitudinal, fixed effects analysis covering mainstage wave 9 and Covid survey waves 1-9. Is this possible? If so, is one approach to feed back the ci_betaindin_lw weight so that the people who were in wave 9 of the mainstage survey and who were also present in all 9 waves of the Covid-19 survey have the weight value of ci_betaindin_lw? The ci_betaindin_lw weight would then cover the mainstage wave 9 sample and the Covid-19 sample.</p>
<p>In case it’s not clear, here is a made-up example of the data in long format, containing wave 9 of the mainstage survey and waves 1-9 of the Covid survey. Pidp no. 111111 was present in wave 9 of the mainstage survey and all 9 waves of the Covid survey and had a value of 1.5 for their longitudinal weight at wave 9 of the Covid survey (ci_betaindin_lw). So, my data would look like this:</p>
<p><strong>[PIDP]</strong> <strong>[WAVE] [Value of ci_betaindin_lw]</strong><br />111111 Mainstage wave 9 <em>Missing Value</em><br />111111 COVID wave 1 1.5<br />111111 COVID wave 2 1.5<br />111111 COVID wave 3 1.5<br />111111 COVID wave 4 1.5<br />111111 COVID wave 5 1.5<br />111111 COVID wave 6 1.5<br />111111 COVID wave 7 1.5<br />111111 COVID wave 8 1.5<br />111111 COVID wave 9 1.5</p>
<p>Is just feeding back the value of ci_betaindin_lw (1.5) what I need to do? So, it would now look like:</p>
<p><strong>[PIDP]</strong> <strong>[WAVE] [Value of ci_betaindin_lw]</strong><br />111111 Mainstage wave 9 <strong>1.5</strong><br />111111 COVID wave 1 1.5<br />111111 COVID wave 2 1.5<br />111111 COVID wave 3 1.5<br />111111 COVID wave 4 1.5<br />111111 COVID wave 5 1.5<br />111111 COVID wave 6 1.5<br />111111 COVID wave 7 1.5<br />111111 COVID wave 8 1.5<br />111111 COVID wave 9 1.5</p>
<p>If so, could this method apply if I wanted to include more mainstage waves of data? So, if I wanted to include waves 6, 7, 8 and 9 of the mainstage survey alongside waves 1-9 of the Covid survey, would I just feed back an individual's weight value for ci_betaindin_lw so that the individual has that weight value for mainstage waves 6, 7, 8 and 9?</p>
<p>I may be completely misunderstanding how to use the longitudinal weights, or have missed something crucial that means you can't apply the Covid longitudinal weights to the pre-Covid mainstage waves. If so, apologies in advance, and any advice would be hugely appreciated.</p>
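<p>The "feeding back" step asked about above is, mechanically, copying each person's weight from the final wave onto all of that person's rows in the long-format file. The Python sketch below shows only that data manipulation, using the made-up pidp and weight from the example. It says nothing about whether the approach is statistically appropriate, and the row layout and function name are hypothetical.</p>

```python
def feed_back_weight(rows, weight_wave="COVID wave 9"):
    """Copy each person's weight at weight_wave onto all of their rows.

    rows is a list of dicts with keys 'pidp', 'wave' and 'weight'.
    People with no weight at weight_wave end up with weight None.
    The input rows are not modified; new dicts are returned.
    """
    final = {
        r["pidp"]: r["weight"]
        for r in rows
        if r["wave"] == weight_wave and r["weight"] is not None
    }
    return [dict(r, weight=final.get(r["pidp"])) for r in rows]


# Made-up long-format data, following the example in the post.
rows = [
    {"pidp": 111111, "wave": "Mainstage wave 9", "weight": None},
    {"pidp": 111111, "wave": "COVID wave 1", "weight": 1.5},
    {"pidp": 111111, "wave": "COVID wave 9", "weight": 1.5},
    {"pidp": 222222, "wave": "Mainstage wave 9", "weight": None},
]
filled = feed_back_weight(rows)
```

<p>Here the mainstage wave 9 row for pidp 111111 receives the value 1.5, while the hypothetical pidp 222222, who has no weight at the final Covid wave, stays missing.</p>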
<p>Best wishes,</p>
<p>James</p>
Understanding Society User Support - Support #2042 (Feedback): Survey Weights for Multi-Wave Pool... https://iserredex.essex.ac.uk/support/issues/2042 2024-01-29T09:47:05Z Lisa Waddell
<p>Hello,</p>
<p>I have constructed an unconventional sample by pooling tab-delimited data files for SN 6614-Understanding Society: Waves 1-13, 2009-2022 and Harmonised BHPS: Waves 1-18, 1991-2009. I would like your advice on which weights to use.</p>
<p>Sample Construction: Using the family matrix, I identify everyone in the sample with both a mother and a father pidp identified. Using all waves of data, I keep participants whose mother and father both responded when the participant was aged 10 or younger. I then keep only participants who responded at the age of 21 or older. These two filters leave me with a sample of around 2000 people aged 21-41, from the BHPS and USoc samples. Because I pool the BHPS and USoc samples, I lose a substantial portion of my sample when I follow the steps for constructing a tailored sample weight. For example, if I choose a base weight from Wave 1 of USoc, I lose the entire BHPS sample. If I choose a base weight from Wave 2 of USoc, I lose a substantial portion of the USoc sample.</p>
<p>Given how I construct my sample, do you have any advice on how I should be applying survey weights?</p>
<p>All the best,<br />Lisa</p>
Understanding Society User Support - Support #2040 (Resolved): Survey Weights https://iserredex.essex.ac.uk/support/issues/2040 2024-01-25T10:27:40Z Martha Tindall
<p>Hi</p>
<p>I am conducting an analysis and am struggling to determine the best weights to use, and was hoping you could give me some guidance. My analysis uses data from the years 2018 to 2021 (inclusive) to fit a TWFE linear model. My model includes main effects and an interaction term involving a binary variable for pre-pandemic and during-pandemic. I have the following questions regarding weighting.</p>
<p>1. Currently my pandemic cut off is March 2020, given the term of interest involves time, is it necessary to start the year 2018 in March and extend the data to March 2022 to ensure there is equal representation of sample months in each group, or is it okay to just go January 2018 to December 2021 (keeping the cut off in March 2020)?</p>
<p>2. I wish to use an unbalanced panel design because, when I use longitudinal weights, my sample becomes just 12% of what it would be using an unbalanced panel. My question is how do I choose these weights? Guidance on the Understanding Society website is for creating balanced panels and using _lw weights; however, in my situation this is not possible. Is it appropriate to apply the cross-sectional weight for each observation in a given wave, or is there something else I should be doing?</p>
<p>3. On the Understanding Society website you mention rescaling of weights for analysing by calendar year. First, is this required in my situation? Second, do you provide guidance for doing so in R, as the only advice available is for Stata, which I am not familiar with?</p>
<p>Thank you in advance for your time and please let me know if you need any more information from me.</p>
<p>Martha</p>
Understanding Society User Support - Support #2036 (Feedback): Understanding Society - weights https://iserredex.essex.ac.uk/support/issues/2036 2024-01-22T12:50:29Z Valentina Di Iasio
<p>Good morning,</p>
<p>After reading the user guide and watching the short YouTube video, I am still confused about which weights I should select for my pooled cross-sectional analysis using Understanding Society.</p>
<p>I am using waves 6 and 9 for a pooled cross-section analysis. I would therefore be inclined to use the cross-sectional weights. However, the user guide says that cross-sectional weights should only be used when the analysis includes one wave only. I also read the paragraph on re-scaling the weights to use more waves to conduct cross-sectional analysis. However, I am not sure whether the described procedure would apply to my case, since I don't have a year overlapping the two waves (wave 6 goes from January 2014 to May 2016 while wave 9 goes from January 2017 to May 2019). Therefore I am not sure whether I should simply use cross-sectional weights, re-scale the cross-sectional weights somehow (maybe for the first 6 months of 2016 and 2019 only?), or exclude the first 6 months of the years 2016 and 2019. Or, if I am missing something and I should use longitudinal weights, how should I deal with 0 weights, given that I am doing a pooled cross-section analysis?</p>
<p>Thank you in advance</p>
<p>Valentina Di Iasio</p>
Understanding Society User Support - Support #2031 (Feedback): Cross-Sectional Weighting Questions. https://iserredex.essex.ac.uk/support/issues/2031 2024-01-18T11:20:26Z Ifraz Hussain
<p>Hi, I'm currently working on a cross-sectional study across waves to examine the proportion of children who live in couple-parent families where one parent reports any form of relationship distress.</p>
<p>I have three questions relating to weighting:</p>
<ul>
<li>From this analysis, I've seen changes to weighting across all previous waves and I would like to know what specifically led to the revisions?</li>
<li>Since I'm looking at participants across waves, I'm also interested in whether there is any attempt to mitigate attrition bias (e.g. changes to weighting)?</li>
<li>Given that I'm working with w_psnenui_xw weights for my study, do you think this weighting is appropriate for examining this area of the USOC Survey data?</li>
</ul>
Understanding Society User Support - Support #2012 (Resolved): longitudinal weight https://iserredex.essex.ac.uk/support/issues/2012 2023-12-14T15:03:18Z Margherita Agnoletto
<p>Dear Understanding Society Team,</p>
<p>I am currently examining the relationship between flexible work arrangements (FWA) and some employees' outcomes.</p>
<p>Given that questions about FWA are asked every two waves, I have chosen to conduct a longitudinal analysis (FE) using waves 2, 4, 6, 8, and 10. Some of my outcomes come from the self-completion questionnaire. <br />As I understand, it is recommended to use the appropriate longitudinal weight from the last wave in my analysis (i.e. i_indinus_lw). However, I observe a significant loss of observations. <br />Given that my panel is unbalanced, could I use the corresponding longitudinal weight from the last available wave for each individual? For instance, if an individual 'i' has information until wave 8, I propose imputing the appropriate longitudinal weight from wave 8. Similarly, if individual 'k' has information until wave 6, I suggest imputing the weight from wave 6.</p>
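<p>The proposal above, taking each individual's longitudinal weight from the last wave in which they are observed, can be sketched as a simple lookup. The Python below only illustrates the mechanics of that proposal on made-up values. It is not a recommendation from the user guide, and the function name and tuple layout are hypothetical.</p>

```python
def last_wave_weight(observations):
    """Return {pidp: (last_wave, weight)} from (pidp, wave, weight) tuples.

    For each person, keep the weight attached to the highest wave number
    in which they are observed.
    """
    last = {}
    for pidp, wave, weight in observations:
        if pidp not in last or wave > last[pidp][0]:
            last[pidp] = (wave, weight)
    return last


# Made-up observations: 'i' is seen up to wave 8, 'k' up to wave 6.
observations = [
    ("i", 2, 1.1), ("i", 8, 1.4),
    ("k", 6, 0.9), ("k", 4, 0.7),
]
chosen = last_wave_weight(observations)
```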
<p>Thank you for your attention.</p>
<p>Kind regards</p>
Understanding Society User Support - Support #2006 (Resolved): Longitudinal analysis using calend... https://iserredex.essex.ac.uk/support/issues/2006 2023-12-12T13:52:21Z Marina Kousta
<p>Hello,</p>
<p>I am reaching out to kindly request help on how to conduct longitudinal analysis using calendar year datasets.<br />1) Although online you state that the published calendar year data are meant to be used for cross-sectional analysis, does that also apply when we create our own calendar year datasets? Or is it meant as guidance only for the pre-made calendar year data that you release? If it applies regardless, is there some way for us to still conduct longitudinal analysis after creating our own calendar year data?<br />2) Although you recommend using w_month (sample month) to create calendar year data, would it still be ok to use the interview date instead, when the exact date is of great importance to the research question itself (i.e. when testing the introduction or removal of a social policy)?</p>
<p>Many thanks in advance for your time and consideration.</p>
<p>Best wishes,<br />Marina</p>
Understanding Society User Support - Support #2004 (Resolved): Selection of weights https://iserredex.essex.ac.uk/support/issues/2004 2023-12-11T16:11:12Z Joanna Clifton-Sprigg J.M.Clifton-Sprigg@bath.ac.uk
<p>Hello,</p>
<p>I am looking to use information on new parents (newmum/newdad), specifically dates of leave taken when child was born, in a difference in difference approach around the shared parental leave reform (2015).</p>
<p>Essentially, I will be comparing cohorts of parents who had a child before & after the reform. I will not be following specific parents longitudinally, at least not for the first part of the project.</p>
<p>I would like to run this analysis in calendar years, not waves, given that the reform happened in April 2015 & I will be comparing those with children born pre-April 2015 and post.</p>
<p>I have pooled waves 2-12 data and set this up in a long format. Now I am wondering what weights to apply.</p>
<p>1) Am I correct in thinking in this scenario cross-sectional weights will work? I would like to preserve as big a sample as possible as even without weighting sample size is a challenge.</p>
<p>2) If I can use cross-sectional weights, how can I apply them to this pooled data file, which includes waves 2-12? It is not clear to me from the user guide.</p>
<p>3) At which stage do I adjust for the calendar year analysis?</p>
<p>Thank you.</p>
Understanding Society User Support - Support #1985 (Resolved): Representativeness of housing tenu... https://iserredex.essex.ac.uk/support/issues/1985 2023-10-24T13:13:42Z Eoghan O'Brien
<p>I am looking at wave 11 responses in the hhresp table for the breakdown of housing tenure (tenure_dv) at the household level.</p>
<p>The screenshots attached include the % of each category (unweighted and weighted using "hhdenui_xw").</p>
<p>Comparing these figures with census results for tenure status in England and Wales (% of households by tenure), it appears that private renters (in USoc, "Rented private unfurnished" and "Rented private furnished") are under-represented (11.7% when weighted) relative to the census figures for England and Wales in 2021 (20.3%). I have tried limiting the USoc sample to just England and Wales households, but it does not materially change the results.</p>
<p>Link to census data here: <a class="external" href="https://www.ons.gov.uk/peoplepopulationandcommunity/housing/bulletins/housingenglandandwales/census2021">https://www.ons.gov.uk/peoplepopulationandcommunity/housing/bulletins/housingenglandandwales/census2021</a></p>
<p>Any info on why I may be finding this discrepancy would be very much appreciated.</p>
Understanding Society User Support - Support #1982 (Resolved): reference person weights https://iserredex.essex.ac.uk/support/issues/1982 2023-10-12T11:19:12Z Amelia Watts amelia.watts678@outlook.com
<p>Dear Olena/support team,</p>
<p>I'm selecting reference persons from households across waves to form a panel. Can the individual longitudinal weights for these respondents in the last wave be used as suboptimal weights in the analysis?</p>
<p>Many thanks, <br />Amelia</p>
Understanding Society User Support - Support #1975 (Resolved): Weights - Cross-sectional Analysis... https://iserredex.essex.ac.uk/support/issues/1975 2023-09-19T09:55:04Z Caitlin Schmid
<p>Good morning,</p>
<p>Using the main survey, I aim to run a cross-sectional analysis of a number of variables to analyse sex differences between adults and their variation across Local Authority Districts. To increase the sample sizes, I want to pool UKHLS Waves 11 and 12. Do I require tailored weights, or can I proceed with the two provided cross-sectional adult weights of the respective waves (_indinui_xw)?</p>
<p>Many thanks and best wishes,</p>
<p>Caitlin</p>
Understanding Society User Support - Support #1908 (Resolved): Weights using the BHPS Consolidate... https://iserredex.essex.ac.uk/support/issues/1908 2023-05-26T21:16:07Z Natalia Carralero
<p>Hello. I am studying differences between single and partnered parents. To do so, I am using the British Household Panel Survey Consolidated Marital, Cohabitation and Fertility Histories (1991-2009) to identify my sample of single/non-single parents, and then merging it with the BHPS individual questionnaire to get the relevant variables.<br />My question is, which weights should I be using? I was thinking of indin91_lw, but I am not entirely sure. <br />Also, which type of weights are they? Frequency or analytic weights? <br />Thank you!</p>