Understanding Society User Support: Issues (https://iserredex.essex.ac.uk/support/, 2024-03-13T15:36:15Z)
Understanding Society User Support - Support #2075 (Feedback): Using UKHLS to look at trends acro... (https://iserredex.essex.ac.uk/support/issues/2075, 2024-03-13T15:36:15Z, James Laurence)
<p>Hi there,</p>
<p>I am interested in looking at calendar month trends in whether someone wants to move home or not (which is available in every wave): lkmove. Ideally, I would like to look at trends using all waves (1-13). However, if it is easier to look at trends from some other start point, e.g. 2016 or 2017, then I am flexible. I am also flexible as to whether the BHPS sample is included or not. This will be a cross-sectional analysis, so I hope to treat each calendar month as a cross-section (I won’t be doing any longitudinal analysis).</p>
<p>I have been reading the helpful notes on ‘Running analysis on a calendar year or month’ (<a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/how-to-use-weights-analysis-guidance-for-weights-psu-strata/">https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/how-to-use-weights-analysis-guidance-for-weights-psu-strata/</a>). However, I have some questions and was hoping to check whether what I’ve done so far looks right.</p>
<p>I have been using the w_month and wave variables to generate a new date variable of year-month. To capture calendar year, I have used the wave and w_month variables in the following manner:</p>
<pre><code>gen year = 2009 if wave==1 & (month>0 & month<13)
replace year = 2010 if wave==1 & (month>12 & month<25)
replace year = 2010 if wave==2 & (month>0 & month<13)
replace year = 2011 if wave==2 & (month>12 & month<25)
replace year = 2011 if wave==3 & (month>0 & month<13)
…
replace year = 2021 if wave==13 & (month>0 & month<13)
replace year = 2022 if wave==13 & (month>12 & month<25)</code></pre>
<p>To measure calendar month, I have recoded the w_month variable, combining the two monthly measures into one. The w_month variable tells us whether someone was sampled in January in the year 1 sample or January in the year 2 sample. I’ve now combined these into a single category of whether someone was sampled in January. For example, ‘jan yr1’ and ‘jan yr2’ are now just ‘jan’; ‘feb yr1’ and ‘feb yr2’ are now just ‘feb’, etc.</p>
<p>With these new calendar year and calendar month variables, I have now created a new measure of calendar year-month, which looks like this (I hope this is correct so far):</p>
<pre><code>2009 Jan = 1<br /> 2009 Feb = 2<br /> 2009 Mar = 3<br /> 2009 Apr = 4<br /> 2009 May = 5<br /> 2009 June = 6<br /> 2009 July = 7<br />…<br /> 2022 June = 162<br /> 2022 July = 163<br /> 2022 Aug = 164<br /> 2022 Sep = 165<br /> 2022 Oct = 166<br /> 2022 Nov = 167<br /> 2022 Dec = 168</code></pre>
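<p>A minimal sketch of the mapping described above, written in Python rather than Stata so that the arithmetic can be checked. It assumes that wave w covers calendar year 2008 + w in sample months 1-12 and 2009 + w in sample months 13-24, which is what the gen/replace lines imply, and that the sample month stands in for the interview month. The function names are illustrative only.</p>

```python
# Sketch: map (wave, sample month 1-24) to calendar year, calendar month,
# and a running year-month index in which 2009 Jan = 1.
# Assumption: sample months 1-12 of wave w fall in 2008 + w and
# sample months 13-24 fall in 2009 + w.


def calendar_year(wave: int, sample_month: int) -> int:
    if not 1 <= sample_month <= 24:
        raise ValueError("sample_month must be between 1 and 24")
    return 2008 + wave + (1 if sample_month > 12 else 0)


def calendar_month(sample_month: int) -> int:
    # 'jan yr1' (1) and 'jan yr2' (13) both become 1, and so on.
    if not 1 <= sample_month <= 24:
        raise ValueError("sample_month must be between 1 and 24")
    return (sample_month - 1) % 12 + 1


def year_month_index(wave: int, sample_month: int) -> int:
    year = calendar_year(wave, sample_month)
    return (year - 2009) * 12 + calendar_month(sample_month)
```

<p>Under these assumptions, months 13-24 of one wave and months 1-12 of the next wave map to the same index values, which is what allows the two samples to be pooled within a calendar year.</p>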
<p>I understand that whatever weight I choose to use, I need to correct it because Northern Ireland is only sampled in issue months 1-12 (and not 13-24). Therefore, I will apply the following adjustment to the weight (gen adj=1; replace adj=0.5 if w_country==4; gen weight=w_xxxyyus_lw*adj), as outlined in the online notes.</p>
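<p>The adjustment quoted above amounts to halving the weight for Northern Ireland cases and leaving all others unchanged. A small Python sketch of that rule, assuming country code 4 denotes Northern Ireland as in the quoted syntax (the function name is illustrative):</p>

```python
# Sketch of the adjustment quoted above: weight * 0.5 for Northern Ireland
# (country code 4 in the quoted syntax), weight * 1 otherwise.
NORTHERN_IRELAND = 4


def adjusted_weight(weight: float, country: int) -> float:
    adj = 0.5 if country == NORTHERN_IRELAND else 1.0
    return weight * adj
```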
<p>However, where I’ve become a little lost is what weights to initially use. In the notes, it states due to exceptions in sample selection ‘we recommend use of the us_lw weight in analysis’. Given my intention to look at calendar months up to wave 13, does this mean I should use the m_indpxus_lw weight? Is this the case, even if I just want to look at the data cross-sectionally (treat every calendar month as a cross-sectional picture of lkmove)? Because it seems that if I use m_indpxus_lw then it substantially reduces the sample size (due to these longitudinal weights requiring someone to have participated in every wave). Is it possible to use the cross-sectional weights for my aims, while excluding the BHPS and IEMB, as is suggested that one needs to do for this kind of calendar month analysis in the online notes? Or, do I need to use longitudinal weights for my intended analysis?</p>
<p>I was also just trying to get my head around the issue of scaling discussed in the online notes: ‘The weights provided are not designed directly for pooling data across waves as they are scaled to a mean value of 1.0 within each wave, and therefore produce different weighted sample sizes in each wave’, under the section ‘Pooling data from different waves for cross-sectional analysis.’ Firstly, I just wanted to confirm this applies to my case of doing monthly trends?</p>
<p>And secondly, if so, from what I can see, the syntax kindly provided is intended to produce an accurate weight to look at the variable jbstat for the calendar year 2011, using months 13-24 of wave 2 and 1-12 of wave 3. At the end, we get the weight variable weight2011, to use for weighting calendar year 2011. In my situation, I would like to do a longer running trend of values of lkmove by months. Would I need to create these weights for each calendar year I look at? So, for 2014, I would need to create a new cross-sectional weight using e_indpxub_xw and f_indpxub_xw (waves 5 and 6). For 2015, I would need to create a new cross-sectional weight using f_indpxub_xw and g_indpxub_xw (waves 6 and 7). For 2016, I would need to create a new cross-sectional weight using g_indpxub_xw and h_indpxub_xw (waves 7 and 8). And to follow this all the way to my last calendar year. Then, to look at monthly trends, treating the data as pooled cross-sectional, I would have my data in long-format and have a new weight variable made up of all these new calendar year weights I’ve created?</p>
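<p>The pairing of waves with calendar years described above follows a fixed pattern, sketched here in Python. It assumes the same wave-to-year mapping as the examples in the text (2014 from waves 5 and 6, 2016 from waves 7 and 8), that wave prefixes run a, b, c, ..., and that wave 13 is the latest wave available. The weight stem is taken from the examples above; the function names are illustrative.</p>

```python
# Sketch: which two waves contribute to a calendar year, and the names of
# their cross-sectional weights.
# Assumptions: year Y draws on months 13-24 of wave Y - 2009 and months 1-12
# of wave Y - 2008; prefixes are a, b, c, ...; wave 13 is the latest wave.


def waves_for_year(year: int, last_wave: int = 13) -> tuple:
    earlier, later = year - 2009, year - 2008
    if earlier < 1 or later > last_wave:
        raise ValueError(f"{year} is not covered by two released waves")
    return (earlier, later)


def wave_prefix(wave: int) -> str:
    return chr(ord("a") + wave - 1)


def weight_names(year: int, stem: str = "indpxub_xw") -> tuple:
    earlier, later = waves_for_year(year)
    return (f"{wave_prefix(earlier)}_{stem}", f"{wave_prefix(later)}_{stem}")
```

<p>For example, weight_names(2015) gives the wave 6 and wave 7 weights, while 2009 and 2022 raise an error because only one wave contributes to each of them.</p>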
<p>I was also wondering if it would be possible to include monthly lkmove data from the calendar year 2022 (using wave 13 of the UKHLS mainstage). As I understand things, previous calendar years (e.g., 2018) are composed of samples from two waves (waves 9 and 10 of the mainstage). However, the calendar year 2022 is composed only of the sample from wave 13. Is it still possible to look at calendar month trends in lkmove for 2022? If so, would I need to make other sample restrictions to the other calendar years, for example, drop the IEMB sample from the trends? And would I need to make other adjustments to the weights? Or is it not yet possible to look at monthly trends for 2022 until wave 14 comes out? I think this is mentioned in the online notes: ‘The analysis sample is only representative when all 24 monthly samples are combined in equal measure.’ Does this point refer to my question?</p>
<p>I am also interested in potentially looking at quarterly trends (Jan-Mar, Apr-Jun, etc.), instead of monthly trends (using the x_quarter variable). To do so, can I take the same approach as above? So, create a new time variable which is years divided into quarters (e.g., 2013 Jan-Mar, 2013 Apr-Jun, 2013 July-Sep, 2013 Oct-Dec, 2014 Jan-Mar, 2014 Apr-June…2022 Jul-Sep, 2022 Oct-Dec). Do I need to do anything different with the weights?</p>
<p>I hope this all makes sense.</p>
<p>Thanks so much in advance.</p>
<p>James</p>
Understanding Society User Support - Support #2074 (In Progress): Longitudinal weights (https://iserredex.essex.ac.uk/support/issues/2074, 2024-03-09T16:03:06Z, Joe Mattock)
<p>Hi,</p>
<p>I'm conducting an analysis specifically over waves 2, 3, 6 and 9 for Understanding Society, as relating to the voteintent variable which is only included in these waves. I would just like to ask about the weighting procedure for this case. I am examining how an independent variable (gentrification, as measured by an index) affects voting intention at the LSOA-level.</p>
<p>My understanding is that I need to take the longitudinal weight from the final wave used in my analysis and apply it to all respondents (i_indscub_lw - I believe). However, given that my dependent variable of interest is not observed in consecutive waves, I wanted to ask whether this principle applies in the same way.</p>
<p>I also wanted to ask how this weighting would be applied in practice. I am slightly confused about the order of things. For example, would you remove all wave-specific prefixes, merge LSOA indicators with the Understanding Society data, and then apply the relevant weight for each respondent?</p>
<p>Much appreciated,</p>
<p>Joe</p>
Understanding Society User Support - Support #2042 (Feedback): Survey Weights for Multi-Wave Pool... (https://iserredex.essex.ac.uk/support/issues/2042, 2024-01-29T09:47:05Z, Lisa Waddell)
<p>Hello,</p>
<p>I have constructed an unconventional sample by pooling tab-delimited data files for SN 6614-Understanding Society: Waves 1-13, 2009-2022 and Harmonised BHPS: Waves 1-18, 1991-2009. I would appreciate your advice on the weights for this sample.</p>
<p>Sample Construction: Using the family matrix, I identify everyone in the sample with both a mother and a father pidp identified. Using all waves of data, I keep participants whose mother and father both responded when the participant was aged 10 or younger. I then filter by participants who responded at the age of 21 or older. These two filtering functions leave me with a sample of around 2,000 people aged 21-41, from BHPS and USoc samples. Due to my pooling of BHPS and USoc samples, when I follow the steps for constructing a tailored sample weight, I lose a substantial portion of my sample. For example, if I choose a base weight from Wave 1 of USoc, I lose the entire BHPS sample. If I choose a base weight from Wave 2 of USoc, I lose a substantial portion of the USoc sample.</p>
<p>Given how I construct my sample, do you have any advice on how I should be applying survey weights?</p>
<p>All the best,<br />Lisa</p>
Understanding Society User Support - Support #2036 (Feedback): Understanding Society - weights (https://iserredex.essex.ac.uk/support/issues/2036, 2024-01-22T12:50:29Z, Valentina Di Iasio)
<p>Good morning,</p>
<p>After reading the user guide and watching the short YouTube video, I am still unsure which weights I should select for my pooled cross-sectional analysis using Understanding Society.</p>
<p>I am using waves 6 and 9 for a pooled cross-section analysis. I would therefore be inclined to use the cross-sectional weights. However, the user guide says that cross-sectional weights should only be used when the analysis includes one wave only. I also read the paragraph on re-scaling the weights to use more waves to conduct cross-sectional analysis. However, I am not sure whether the described procedure would apply to my case, since no calendar year overlaps the two waves (wave 6 goes from January 2014 to May 2016 while wave 9 goes from January 2017 to May 2019). Therefore I am not sure whether I should simply use cross-sectional weights, re-scale the cross-sectional weights somehow (maybe for the first 6 months of 2016 and 2019 only?), or exclude the first 6 months of the years 2016 and 2019. Or, if I am missing something and I should use longitudinal weights (in that case, since I am doing a pooled cross-section analysis, how should I deal with 0 weights?)</p>
<p>Thank you in advance</p>
<p>Valentina Di Iasio</p>
Understanding Society User Support - Support #2031 (Feedback): Cross-Sectional Weighting Questions. (https://iserredex.essex.ac.uk/support/issues/2031, 2024-01-18T11:20:26Z, Ifraz Hussain)
<p>Hi, I'm currently working on a cross-sectional study across waves to examine the proportion of children who live in couple-parent families where one parent reports any form of relationship distress.</p>
<p>I have three questions relating to weighting:</p>
<ul>
<li>From this analysis, I've seen changes to weighting across all previous waves and I would like to know what specifically led to the revisions?</li>
<li>Since I'm looking at participants across waves, I'm also interested in whether there is any attempt to mitigate attrition bias (e.g. changes to weighting)?</li>
<li>Given that I'm working with w_psnenui_xw weights for my study, do you think this weighting is appropriate for examining this area of the USOC Survey data?</li>
</ul>
Understanding Society User Support - Support #1982 (Resolved): reference person weights (https://iserredex.essex.ac.uk/support/issues/1982, 2023-10-12T11:19:12Z, Amelia Watts, amelia.watts678@outlook.com)
<p>Dear Olena/support team,</p>
<p>I'm selecting reference persons from households across waves to form a panel. Can the individual longitudinal weights for these respondents in the last wave be used as suboptimal weights in the analysis?</p>
<p>Many thanks,<br />Amelia</p>
Understanding Society User Support - Support #1975 (Resolved): Weights - Cross-sectional Analysis... (https://iserredex.essex.ac.uk/support/issues/1975, 2023-09-19T09:55:04Z, Caitlin Schmid)
<p>Good morning,</p>
<p>Using the main survey, I aim to run a cross-sectional analysis on a number of variables to analyse sex differences between adults and their variation across Local Authority Districts. To increase the sample sizes, I want to pool UKHLS Waves 11 and 12. Do I require tailored weights, or can I proceed with the two provided cross-sectional adult weights of the respective waves (_indinui_xw)?</p>
<p>Many thanks and best wishes,</p>
<p>Caitlin</p>
Understanding Society User Support - Support #1902 (Resolved): weights individual files waves 10 ... (https://iserredex.essex.ac.uk/support/issues/1902, 2023-05-15T13:20:37Z, Aelen Valen)
<p>Hi,</p>
<p>I am trying to merge individual files across waves 10 and 11 into wide format to create a 2019 calendar year dataset.<br />I used this method from "Box 1: Example syntax for pooled analysis for cross-sectional estimation relating to calendar year 2011, with weight re-scaling" in <a class="external" href="https://www.understandingsociety.ac.uk/sites/default/files/downloads/documentation/user-guides/mainstage/weighting_faqs.pdf">https://www.understandingsociety.ac.uk/sites/default/files/downloads/documentation/user-guides/mainstage/weighting_faqs.pdf</a></p>
<pre><code>ge wts=0
replace wts=indpxui_xw if month>=13 & month<=24
ge ind=1
sum ind [aw=indpxui_xw] if month>=1 & month<=12
gen jwtdtot=r(sum_w)
sum ind [aw=indpxui_xw] if month>=1 & month<=12
gen kwtdtot=r(sum_w)
replace wts=indpxui_xw*(jwtdtot/kwtdtot) if month>=1 & month<=12</code></pre>
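<p>For comparison, here is a Python sketch of the re-scaling idea that this syntax is based on. It assumes that the two totals are meant to come from different waves (months 13-24 of the earlier wave and months 1-12 of the later wave), since the wave prefixes appear to be missing from the lines as posted. The record layout and function name are illustrative and not part of the official syntax.</p>

```python
# Sketch: pool months 13-24 of the earlier wave with months 1-12 of the later
# wave, scaling the later wave's weights so that its weighted total matches
# the earlier wave's weighted total. Cases outside the calendar year get 0.
# Assumption: each record is a dict with 'wave', 'month' (1-24) and 'weight'.


def rescale_pooled_weights(records, first_wave, second_wave):
    def in_year(rec):
        if rec["wave"] == first_wave:
            return 13 <= rec["month"] <= 24
        if rec["wave"] == second_wave:
            return 1 <= rec["month"] <= 12
        return False

    first_total = sum(
        r["weight"] for r in records if r["wave"] == first_wave and in_year(r)
    )
    second_total = sum(
        r["weight"] for r in records if r["wave"] == second_wave and in_year(r)
    )
    if second_total == 0:
        raise ValueError("no weighted cases in months 1-12 of the second wave")

    factor = first_total / second_total
    pooled = []
    for r in records:
        if not in_year(r):
            pooled.append(0.0)
        elif r["wave"] == first_wave:
            pooled.append(r["weight"])
        else:
            pooled.append(r["weight"] * factor)
    return pooled
```

<p>The resulting weights are still relative weights: they balance the two halves of the calendar year against each other but do not gross up to population totals.</p>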
<p>For the purpose of the research I am working on, I am using the equivalised household income and other variables referring to parental occupation, education and place of birth.</p>
<p>Since I am using it together with EUSILC 2019 for different EU countries, I was comparing the weights with the weights in EUSILC. While the sum of the weights in the latter equals on average about 80% of the actual population in each country, the sum of weights of the dataset I created for UK 2019 (by merging waves 10 and 11) gives a number far lower than the 2019 UK population.</p>
<p>Could you please help me understand how those weights are constructed, which characteristics of the population they consider, whether they are comparable to those in EUSILC, and whether the procedure I followed to merge the two waves is correct?<br />Many thanks in advance for the support!</p>
Understanding Society User Support - Support #1890 (In Progress): Extracting PSU and Individual-L... (https://iserredex.essex.ac.uk/support/issues/1890, 2023-04-11T16:35:50Z, Laurence Rowley-Abel)
<p>Dear Understanding Society team,<br />I am running a multilevel model using individuals nested within census areas (LSOAs) in Waves 9, 10, 11 and 12. To account for clustering I am using the following levels in my multilevel model: individuals at the first level, PSUs at the second level and LSOAs at the third level. Therefore, from the provided weights, I need to extract separate weights for individuals and for PSUs. Having read your response here (<a class="external" href="https://iserredex.essex.ac.uk/support/issues/1572">https://iserredex.essex.ac.uk/support/issues/1572</a>), I am wondering if the below would be the correct approach:</p>
<p>- For the individual level, I would divide l_psnenus_xw by a_psnenus_xd (from the l_indall.dta and the a_indall.dta files respectively)<br />- For the PSU level, I would use a_psnenus_xd (from the a_indall.dta file)<br />- For the LSOA level, I would not be able to calculate a weight as it is not part of the sampling design. I would set this weight to 1 for all respondents.</p>
<p>Would this be correct? Additionally, would this mean I could only include respondents who were included at Wave 1, since I need to use the design weight (a_psnenus_xd) from Wave 1?</p>
<p>Many thanks for your help.</p>
<p>Best wishes,<br />Laurence</p>
Understanding Society User Support - Support #1827 (Resolved): Correct weights to use (https://iserredex.essex.ac.uk/support/issues/1827, 2022-12-07T09:40:16Z, Amelia Watts, amelia.watts678@outlook.com)
<p>Dear Olena,</p>
<p>I have two questions regarding using weights. I’m trying to conduct a cross-sectional analysis using data from a UKHLS wave.</p>
<p>1) If I select a sub-sample using respondents interviewed in certain years/months within a wave, can I still use the existing cross-sectional weights, or will I need to make adjustments to the cross-sectional weights?</p>
<p>2) All the dependent and independent variables are from one wave (eg wave 5), apart from one independent variable which was measured at an earlier wave (eg wave 2). I will match respondents from wave 5 and wave 2 to obtain the values of this independent variable. In this case, can I still use the cross-sectional weights in wave 5, or should I use the longitudinal weights in wave 5?</p>
<p>Thank you for your help.</p>
Understanding Society User Support - Support #1794 (Resolved): Weights (https://iserredex.essex.ac.uk/support/issues/1794, 2022-10-27T09:34:20Z, Caroline Kienast von Einem)
<p>Hi team,</p>
<p>I have a question about applying survey weights:<br />When I am applying the weights, do I have to do this via specific commands (e.g. the svy prefix in Stata), or is it also possible to multiply my variable of interest by the weight to create a new weighted variable that I could then use alongside commands that cannot be combined with survey weights directly?</p>
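<p>To make the two options concrete, here is a small Python sketch contrasting a weighted mean with the plain mean of a pre-multiplied variable. It is only an arithmetic illustration with made-up numbers and says nothing about standard errors; the function names are illustrative.</p>

```python
# Sketch: a weighted mean divides by the sum of the weights, whereas the
# plain mean of (value * weight) divides by the number of cases. The two
# point estimates coincide only when the weights sum to the number of cases.


def weighted_mean(values, weights):
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def mean_of_products(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / len(values)
```

<p>With values 10 and 20 and weights 1 and 3, the weighted mean is 17.5 while the mean of the multiplied variable is 35.</p>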
<p>Thank you for your help.</p>
<p>Best wishes,<br />Caroline</p>
Understanding Society User Support - Support #1743 (Resolved): Averaging regional data to obtain ... (https://iserredex.essex.ac.uk/support/issues/1743, 2022-08-05T10:14:30Z, Carolin Schmidt, cs2100@cam.ac.uk)
<p>Hi there,</p>
<p>I am using wave 6 to study household heads' homeownership probabilities. I am looking at native Brits and immigrants (I came up with an immigrant dummy for every household head).</p>
<p>I would now like to generate a control variable for each of my household heads: the variable should reflect the proportion of immigrants in the UK region where the person resides (that is, every household head in e.g. London will have the same immigrant share attached, etc.). I am wondering how I should calculate that average: does it have to be weighted (i.e. egen immishare = wtmean(immigrant), weight(indscui_xw) by(region), using the gwtmean package, which calculates weighted statistics)? I would think so, because without weighting it, I would have an average immigrant share based on the (not necessarily representative) raw data. However, if I calculate a weighted mean, then I would effectively double-weight the data because the regression itself would be weighted too, no?</p>
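<p>For clarity about what the weighted version computes, here is a Python sketch of a weighted share by group: within each region, the sum of weight times the immigrant dummy divided by the sum of the weights. The row layout and function name are illustrative, and the numbers in the usage note are made up.</p>

```python
from collections import defaultdict

# Sketch: weighted share of a 0/1 indicator within each group.
# Assumption: each row is a (group, indicator, weight) tuple.


def weighted_share_by_group(rows):
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for group, indicator, weight in rows:
        numerator[group] += indicator * weight
        denominator[group] += weight
    # Groups whose weights sum to zero are left out rather than divided by 0.
    return {g: numerator[g] / denominator[g] for g in denominator if denominator[g] > 0}
```

<p>For instance, two London cases with weights 2 and 2, of whom one is an immigrant, give a London share of 0.5; the same value would then be attached to every household head in London.</p>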
<p>I am unsure how to proceed and would appreciate any help.</p>
<p>Best wishes,<br />Carolin</p>
Understanding Society User Support - Support #1624 (Resolved): Weights for subsample (https://iserredex.essex.ac.uk/support/issues/1624, 2022-01-06T14:49:27Z, Ashley Burdett)
<p>Hello,</p>
<p>I am trying to estimate the fraction of people that transition to their first relationship (cohabitation or marriage) by age using the BHPS.</p>
<p>To do this I have constructed an unbalanced panel containing observations for individuals who have never had a relationship (marriage or cohabitation) before. Specifically, I use observations for individuals who did not report a relationship in the marital history datasets but provided a full response to the wave 2 main survey. I also include observations for individuals who aged into the sample during the panel, to increase my sample size.</p>
<p>I include observations for these individuals up until either they form their first relationship, they have a missing observation or the survey ends (2008).</p>
<p>Using this sample, I simply calculate the fraction of individuals observed at each age that transition to their first relationship at that given age.</p>
<p>My question is how do I appropriately incorporate weights into this analysis? I have tried numerous ways of approaching this problem and get very different results each time.</p>
<p>Many thanks in advance for your help.</p>
<p>All the best,</p>
<p>Ashley</p>
Understanding Society User Support - Support #1239 (Resolved): Using weights on a subsample of UKHLS (https://iserredex.essex.ac.uk/support/issues/1239, 2019-09-07T12:31:55Z, Amanda Moorghen)
<p>Hi,<br />I am running analysis (logit) on a subsample of UKHLS - wave 6 only, people under the age of 30.</p>
<p>I am using the following weights: <br />svyset f_psu [pweight=f_indinui_xw], strata (f_strata) singleunit(centered)</p>
<p>I wanted to check that this was the correct approach? I am unsure whether the weights should be used in the same way for a subsample of UKHLS as if you were analysing the full sample.</p>
<p>Thanks<br />AM</p>
Understanding Society User Support - Support #987 (Resolved): Weighting of sub-sample (https://iserredex.essex.ac.uk/support/issues/987, 2018-06-26T22:51:48Z, Ante Bab2242@cam.ac.uk)
<p>Dear Sir or Madam,</p>
<p>I would like to compare the means of several variables (e.g. income, education) in a sub-sample after data cleansing with those of the initial sample, to test the representativeness of the sub-sample. If all variables are from the same wave (i.e. wave 4 of the UKHLS), cross-sectional weights can be applied. However, the sub-sample contains two variables that were not surveyed in wave 4, so they were carried forward from waves 1 and 3. In this case, should the variables for the comparison be weighted with the longitudinal weights of the last wave (i.e. wave 4), or should cross-sectional weights be used (i.e. cross-sectional weights from waves 1 and 3 for the two carried-forward variables and, for the remaining variables, cross-sectional weights from wave 4)? The variables are from household level questionnaires and self-completion interviews, so that the lowest level of hierarchy is 1, which would suggest using d_indscus_lw if longitudinal weights are appropriate. Do you agree?</p>
<p>Thank you for your help.</p>
<p>Best regards<br />Ante</p>