Understanding Society User Support: Issues | https://iserredex.essex.ac.uk/support/ | 2024-02-27T13:21:37Z
Understanding Society User Support - Support #2060 (Resolved): Design weights taken account of in... | https://iserredex.essex.ac.uk/support/issues/2060 | 2024-02-27T13:21:37Z | Rosie Cornish
<p>I think the answer to this is yes, but can you confirm that the household enumeration weights (e.g. a_hhdenus_xw) take account of the design weights - i.e. they are the product of the design weight and a household response weight?</p>
Understanding Society User Support - Support #1902 (Resolved): weights individual files waves 10 ... | https://iserredex.essex.ac.uk/support/issues/1902 | 2023-05-15T13:20:37Z | Aelen Valen
<p>Hi,</p>
<p>I am trying to merge the individual files from waves 10 and 11 into wide format to create a dataset for calendar year 2019.<br />I used the method from "Box 1: Example syntax for pooled analysis for cross-sectional estimation relating to calendar year 2011, with weight re-scaling" in <a class="external" href="https://www.understandingsociety.ac.uk/sites/default/files/downloads/documentation/user-guides/mainstage/weighting_faqs.pdf">https://www.understandingsociety.ac.uk/sites/default/files/downloads/documentation/user-guides/mainstage/weighting_faqs.pdf</a></p>
<p>ge wts=0 <br />replace wts=indpxui_xw if month>=13 & month<=24 <br />ge ind=1 <br />sum ind [aw=indpxui_xw] if month>=13 & month<=24 <br />gen jwtdtot=r(sum_w) <br />sum ind [aw=indpxui_xw] if month>=1 & month<=12 <br />gen kwtdtot=r(sum_w) <br />replace wts=indpxui_xw*(jwtdtot/kwtdtot) if month>=1 & month<=12</p>
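<p>As a minimal sketch of the re-scaling idea behind the syntax above, the toy example below makes the weighted total of the records from months 1-12 equal to that of the records from months 13-24. It assumes that this equalisation is the intent of Box 1; the variable <code>w</code> stands in for the cross-sectional weight, and the data values and scalar names are made up for illustration.</p>

```stata
* Toy data (hypothetical): month is the sample month (1-24) and
* w stands in for the cross-sectional weight of each pooled record.
clear
input month w
 3 1
 8 2
11 1
14 1.5
20 0.5
23 1
end

* Weighted total of the records from months 13-24
summarize w if inrange(month, 13, 24), meanonly
scalar tot_late = r(sum)

* Weighted total of the records from months 1-12
summarize w if inrange(month, 1, 12), meanonly
scalar tot_early = r(sum)

* Months 13-24 keep their weight; months 1-12 are rescaled to the same total
generate double wts = w
replace wts = w * (scalar(tot_late) / scalar(tot_early)) if inrange(month, 1, 12)

* Compare the weighted totals of the two halves after rescaling
generate byte early = inrange(month, 1, 12)
tabstat wts, by(early) statistics(sum)
```

<p>The re-scaling only balances the two halves against each other. It does not change what total the weights were scaled to in the first place, so on its own it would not make the sum of weights match a population count.</p>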
<p>For the purpose of the research I am working on, I am using the equivalised household income and other variables referring to parental occupation, education and place of birth.</p>
<p>Since I am using it together with EUSILC 2019 for different EU countries, I compared its weights with the weights in EUSILC. While the sum of the weights in the latter equals on average 80% of the actual population in each country, the sum of the weights in the dataset I created for the UK in 2019 (by merging waves 10 and 11) is far lower than the UK population in 2019.</p>
<p>Could you please help me understand how those weights are constructed, which characteristics of the population they take into account, whether they are comparable to the ones in EUSILC, and whether the procedure I followed to merge the two waves is correct?<br />Many thanks in advance for the support!</p>
Understanding Society User Support - Support #1894 (Resolved): Weight for unbalanced and merged U... | https://iserredex.essex.ac.uk/support/issues/1894 | 2023-04-21T14:30:20Z | Yanan Zhang | zhangyanan0918@gmail.com
<p>Dear Sir/Madam,</p>
<p>I hope this message finds you in good health and high spirits.</p>
<p>I am currently working with individual-level data from the merged Waves 1-18 of the BHPS and Waves 1-8 of the UKHLS datasets. I have a couple of questions regarding the use of weights in my analysis. I would appreciate any guidance you could provide.</p>
<p>1. In my study, I am employing fixed effects estimates to analyze the relationship between two variables, x and y. Given this approach, is it necessary to apply weights to the analysis?</p>
<p>2. I have followed the guidelines and used the longitudinal weight provided in Wave 8 of the UKHLS. However, I understand that this weight is applicable only to those who have participated in all waves. Since many individuals have only participated in parts of the waves, I am unsure how to generate weights for these participants. Could you please advise on the appropriate way to handle this situation?</p>
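<p>For orientation only, here is a minimal sketch of how a weight enters a fixed-effects model in Stata. It does not settle whether weighting is needed. The data and the names <code>pid</code>, <code>wave</code>, <code>y</code>, <code>x</code> and <code>w</code> are made up, and the weight is held constant within each person because <code>xtreg, fe</code> only accepts weights that do not vary within the panel unit.</p>

```stata
* Toy panel (hypothetical): 3 people observed in 3 waves each.
* w is a person-level weight, constant across that person's rows.
clear
input pid wave y x w
1 1 2 1 1.5
1 2 3 2 1.5
1 3 5 3 1.5
2 1 1 1 0.5
2 2 1 2 0.5
2 3 2 4 0.5
3 1 4 2 1
3 2 6 3 1
3 3 7 5 1
end

xtset pid wave

* Unweighted fixed-effects estimate
xtreg y x, fe

* Weighted fixed-effects estimate using the person-level weight
xtreg y x [pweight = w], fe
```

<p>Comparing the two sets of coefficients on real data is one informal way to see how much the weights matter for a given model.</p>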
<p>Thanks for your time!</p>
Understanding Society User Support - Support #1886 (Resolved): Weighting unemployment duration data | https://iserredex.essex.ac.uk/support/issues/1886 | 2023-03-29T13:18:44Z | Fergus Jimenez-England
<p>Dear Team,</p>
<p>I am investigating the relationship between online job search and unemployment durations by pooling waves 1 to 11 of the UKHLS. In order to calculate unemployment durations, I use employment status data from the wave of the relevant unemployment spell to identify the start date and I use employment status data from the wave immediately after to identify the end date of unemployment. Hence, a given observation of unemployment duration will contain data obtained from at least two waves.</p>
<p>Would I be able to use cross-sectional weights, since the treatment (online job search) is observed cross-sectionally, even though unemployment durations must use data from multiple waves to obtain an end date?</p>
<p>Thanks,<br />Fergus</p>
Understanding Society User Support - Support #1865 (Resolved): Changes to USOC wave data download... | https://iserredex.essex.ac.uk/support/issues/1865 | 2023-02-23T16:42:31Z | William Shufflebottom
<p>Hi,</p>
<p>QUESTIONS</p>
<p>Q1: The indscub_xw weight is present in our historical download of the USOC wave 6 data, but it appears to be missing from the version of wave 6 that we downloaded from the UK Data Service a few months ago, and the USOC variable search page does not list it for wave 6 either. Can you confirm that it was in the original release, explain why only the indscui_xw weight is in the latest wave 6 version, and tell us whether, when and why it was removed?</p>
<p>Q2: Our estimates run on the latest download of waves 1 to 12 of USOC differ from the estimates we ran at the time of the previous waves' releases. Has there been a change to the data, to the weights (beyond wave 6 having a different weight) or to how the weights work that could explain the differences we are seeing for all waves (bar waves 1 and 12) in a recent download of the data from all the waves? We are using the same weights (bar wave 6) and the same variable (sclfsat7 in this case, although we use a range of USOC variables in our analysis).</p>
<p>BACKGROUND</p>
<p>We are producing estimates for the OECD and just discovered some differences for the estimates and CIs for the sclfsat7 variable when we re-ran historical estimates for all USOC waves 1 to 12. We run breakdowns for this variable (and others) by various domains when we update our publications and a new USOC wave has been released so we have the estimates from previous runs made at the time of USOC wave data release. We only ran the sclfsat7 variable again recently so there may be other changes.</p>
<p>We have a document listing the weights to use for each variable, which states that indscub_xw is the correct weight to use for the sclfsat7 variable in wave 6, but we noticed that it was "missing" from the wave 6 data we downloaded around November from the UK Data Service (indscui_xw is present instead). As we are getting differences in our estimates and CIs for all waves (bar waves 1 and 12), this has prompted us to check with you whether the versions of the USOC main study wave data currently on the UK Data Service have changed from what was available when each wave's data was released, which could explain the differences we are seeing.</p>
<p>Your help is greatly appreciated, as this has the potential to affect a lot of our publications and the current ad hoc we are working on.</p>
Understanding Society User Support - Support #1852 (Resolved): Select the correct weighting values | https://iserredex.essex.ac.uk/support/issues/1852 | 2023-02-07T17:53:18Z | Yushi Bai
<p>Dear colleagues,</p>
<p>I'm a post-doc research associate at the University of Manchester. We're currently planning an analysis investigating how mental health problems spread within a family network using your data (thank you for providing such an excellent dataset!). However, we're confused about how to create the correct weighting on our data even after reading all the tutorial materials. So I sincerely hope we can have your support for our analysis. I will first brief you on our initial analytical plan:</p>
<p>1. Formulate an initial participant pool consisting of all data in waves 1, 3, 5, 7, 9, and 11, because the Strengths and Difficulties Questionnaire (SDQ) data are available in those waves.<br />2. Within this initial pool, compare the data quality for each family across the waves (e.g. compare the quality of SDQ data for family A in waves 1, 3, 5, 7, 9, and 11).<br />3. Select a particular dataset for each family if the dataset has the fewest missing values across the waves, and formulate a large cross-sectional dataset. For example, if SDQ data have the fewest missing values for family A in wave 1, and for family B in wave 3, we use data for family A from wave 1, and data for family B from wave 3 to formulate a cross-sectional dataset.</p>
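<p>The selection rule in step 3 can be sketched as follows. This is an illustration under simplifying assumptions, not a recommendation on whether the rule is sound: the data are made up, there is one row per family per wave, the names <code>famid</code>, <code>wave</code> and <code>sdq1</code>-<code>sdq3</code> are hypothetical, and ties are broken in favour of the earliest wave.</p>

```stata
* Toy data (hypothetical): one row per family per wave,
* with three SDQ items that may be missing (.).
clear
input famid wave sdq1 sdq2 sdq3
1 1 3 1 2
1 3 1 . 2
2 1 . . 1
2 3 2 2 1
end

* Number of missing SDQ items in each family-wave row
egen nmiss = rowmiss(sdq1 sdq2 sdq3)

* Within each family, keep the wave with the fewest missing items
* (the earliest such wave if several are tied)
bysort famid (nmiss wave): keep if _n == 1

list famid wave nmiss
```

<p>Because the retained wave depends on the pattern of missingness, the resulting file is not a sample from any single wave, which is why the weighting question that follows matters.</p>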
<p>By doing so, we hope to boost our sample size and the quality of the data, because our analytical approach (network analysis) relies heavily on data quality. However, we're aware that this participant selection approach may introduce bias. Therefore, we're wondering whether you could advise whether our participant selection plan is reasonable in the light of your research design and, if so, what materials we can use to create the correct weighting values for our data.</p>
<p>Thank you in advance for your time and help, and we're looking forward to hearing from you.</p>
<p>Kind regards,<br />Yushi</p>
Understanding Society User Support - Support #1820 (Resolved): Cross-sectional vs longitudinal we... | https://iserredex.essex.ac.uk/support/issues/1820 | 2022-11-29T08:17:45Z | Henrique Neves
<p>Dear Understanding Society support team,</p>
<p>Our research team is using data from the Understanding Society Main Annual Survey (waves 7 to 11) and the COVID-19 study (waves 1 to 9). In our analysis, we want to account for weights. However, we are unsure about which weights to use.</p>
<p>Our main goal is to analyze gaps in mental health ( <em>scghq1_dv</em> ) between a Muslim and a Non-Muslim population during COVID-19. We rely on a standard difference-in-differences design, comparing the average Muslim-Non-Muslim gaps in mental health during the pandemic (Covid Survey, waves 1 to 9) with the average pre-pandemic gaps (Main Survey, waves 7 to 11). Our treatment variable takes value 1 for Covid waves and 0 otherwise. Additionally, we run an event study design, comparing Muslim-Non-Muslim gaps in mental health in each wave (Waves 8 to 11 of the Main Survey and Covid waves) relative to wave 7 of the Main Annual Survey.</p>
<p>Unfortunately, the COVID-19 study does not ask about the participants' religion. To identify Muslims in the COVID-19 dataset we extract the last religion status reported on the Understanding Society Main Survey (based on the variable <em>oprlg1</em> ) and link it with the Covid Survey through person identifiers.</p>
<p>Given our study design would you recommend we use cross-sectional or longitudinal weights?</p>
<p>Thanks in advance for your help!</p>
<p>Kind regards,<br />Henrique</p>
Understanding Society User Support - Support #1777 (Resolved): Creating Longitudinal Weights for ... | https://iserredex.essex.ac.uk/support/issues/1777 | 2022-10-06T16:31:47Z | JoAnn Tan
<p>I have a question similar to <a class="issue tracker-3 status-3 priority-4 priority-default" title="Support: Weight for unbalanced UKHLS panel data (Resolved)" href="https://iserredex.essex.ac.uk/support/issues/1257">#1257</a>.</p>
<p>In <a class="issue tracker-3 status-3 priority-4 priority-default" title="Support: Weight for unbalanced UKHLS panel data (Resolved)" href="https://iserredex.essex.ac.uk/support/issues/1257">#1257</a>, Alita mentioned that we can create longitudinal weights for an unbalanced panel. How exactly can I do that? I am quite certain that my analysis (exploring the probability of being in temporary employment) is nothing complicated and hence does not require creating my own weights. However, I really want to run a longitudinal analysis with an UNBALANCED panel. Please help, thanks! (P.S. I have read all previous posts on creating weights for an unbalanced panel, but I am still not sure how longitudinal weights for an unbalanced panel could be created.)</p>
Understanding Society User Support - Support #1747 (Resolved): Weight problem when running regres... | https://iserredex.essex.ac.uk/support/issues/1747 | 2022-08-10T16:28:55Z | Parth Pandya
<p>When I try to run a regression, I get the error in the screenshot below. My weight values have to differ because of the way Understanding Society has set up the weights. How do I get around this problem? Thank you so much!</p>
Understanding Society User Support - Support #1726 (Resolved): BHPS and Understanding Society - w... | https://iserredex.essex.ac.uk/support/issues/1726 | 2022-07-13T16:58:28Z | Maria Petrillo
<p>Hi,<br />I am using both the BHPS (waves 1-18) and Understanding Society (waves 1-11) to conduct a descriptive analysis of episodes of caring over time. I would like to know which weights I should use for a cross-sectional analysis and for a longitudinal one. For the cross-sectional analysis, it seems to me that I can use xrwtuk1 for waves BH12 to BH18 and indinub_xw from wave 2 to 11. But what about all the other waves? Could you please let me know what the best approach is?</p>
Understanding Society User Support - Support #1715 (Resolved): Longitudinal Weighting of Non-Move... | https://iserredex.essex.ac.uk/support/issues/1715 | 2022-06-13T11:06:10Z | Sue Easton
<p>Hi, I have searched and can't find these key words in any posts.</p>
<p>Due to limitations of time I need to limit my analysis to individuals who have not changed location since they entered the survey in Wave 1 (UKHLS sample and any others in from Wave 1 with more than 1 wave).</p>
<p>This means some people's data will be right censored due to household moves.</p>
<p>How will this affect weighting?</p>
<p>Will I need to calculate new weights, given that variables such as age are highly likely to be correlated with the "risk" of moving home?</p>
<p>Thanks.</p>
<p>Sue Easton</p>
Understanding Society User Support - Support #1652 (Resolved): Should using longitudinal weights ... | https://iserredex.essex.ac.uk/support/issues/1652 | 2022-02-05T08:57:24Z | Lucas Auer
<p>Dear Olena,</p>
<p>I have a question concerning the correct use of longitudinal weights. I want to perform longitudinal analysis in Stata, starting in wave 6 and ending in wave 11, using information from the individual adult self-completion interview. I understand that I should therefore be using the weight k_indscui_lw.</p>
<p>When reading in my data, I create a panel data set in long format, removing the wave prefix from the variables and instead introducing a wave variable called UKHLSwave. Subsequently, I create a new weight variable corresponding to the respective individual’s value of k_indscui_lw for each observation (for later use in regression analysis) in the following way: <br />gen weight11_temp = indscui_lw if UKHLSwave == 11<br />bysort pidp: egen weight11 = max(weight11_temp)</p>
<p>My understanding from the weighting guidance (which I found very helpful – thanks a lot!) was that only individuals who gave a full interview at waves 6 through 11 should have a non-zero value of k_indscui_lw. Accordingly, I expected to find my panel balanced for waves 6 through 11 once I condition on my new variable weight11 being non-zero and non-missing. However, when I run:<br />gen nonzeroweight = (weight11 > 0 & weight11 != .)<br />tab nonzeroweight UKHLSwave if UKHLSwave >= 6<br />I get strictly increasing numbers of observations from wave 6 through wave 11.</p>
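<p>One way to narrow this down is to count, for each person with a positive wave 11 weight, the number of waves from 6 to 11 in which they have a row. The sketch below does that on made-up data. It reuses the variable names from the post, the values are hypothetical, and the new variable <code>nwaves</code> is introduced for illustration only. It shows the diagnostic, not an explanation of why such cases would exist in the real data.</p>

```stata
* Toy long-format panel (hypothetical values).
* Person 1: rows in waves 6-11, positive weight at wave 11.
* Person 2: rows in waves 9-11 only, positive weight at wave 11.
* Person 3: rows in waves 6-11, zero weight at wave 11.
clear
input pidp UKHLSwave indscui_lw
1  6 1
1  7 1
1  8 1
1  9 1.5
1 10 1.5
1 11 1.5
2  9 0
2 10 0
2 11 0.5
3  6 1
3  7 1
3  8 0
3  9 0
3 10 0
3 11 0
end

* Copy each person's wave 11 weight to all of their rows
generate double weight11_temp = indscui_lw if UKHLSwave == 11
bysort pidp: egen double weight11 = max(weight11_temp)
generate byte nonzeroweight = (weight11 > 0 & !missing(weight11))

* Number of waves 6-11 in which each person has a row
bysort pidp: egen byte nwaves = total(inrange(UKHLSwave, 6, 11))

* One row per person: anyone with nwaves < 6 makes the weighted panel unbalanced
tabulate nwaves if nonzeroweight & UKHLSwave == 11
list pidp nwaves if nonzeroweight & nwaves < 6 & UKHLSwave == 11
```

<p>Listing the flagged identifiers makes it possible to look up those individuals in the original wave files and see which interviews they actually gave.</p>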
<p>Can you please check if you can replicate my findings, and advise where I am doing/understanding something wrong? Otherwise, any insight into why it would be normal for the situation above to arise would be greatly appreciated.</p>
<p>Many thanks,<br />Lucas</p>
Understanding Society User Support - Support #1624 (Resolved): Weights for subsample | https://iserredex.essex.ac.uk/support/issues/1624 | 2022-01-06T14:49:27Z | Ashley Burdett
<p>Hello,</p>
<p>I am trying to estimate the fraction of people that transition to their first relationship (cohabitation or marriage) by age using the BHPS.</p>
<p>To do this, I have constructed an unbalanced panel containing observations for individuals who have never had a relationship (marriage or cohabitation) before. Specifically, I use observations for individuals who did not report a relationship in the marital history datasets but provided a full response to the wave 2 main survey. I also include observations for individuals who aged into the sample during the panel, to increase my sample size.</p>
<p>I include observations for these individuals up until either they form their first relationship, they have a missing observation or the survey ends (2008).</p>
<p>Using this sample, I simply calculate the fraction of individuals observed at each age that transition to their first relationship at that given age.</p>
<p>My question is how do I appropriately incorporate weights into this analysis? I have tried numerous ways of approaching this problem and get very different results each time.</p>
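<p>Mechanically, a weighted version of the age-specific fraction described above can be computed as a weighted mean of a 0/1 transition indicator. The sketch below shows only that mechanical step on made-up data. The names <code>pid</code>, <code>age</code>, <code>transition</code> and <code>w</code> are hypothetical, and which weight should play the role of <code>w</code> is exactly the open question.</p>

```stata
* Toy person-age records (hypothetical): transition = 1 if the person
* forms a first relationship at that age, w is a placeholder weight.
clear
input pid age transition w
1 20 0 1
2 20 1 2
3 20 0 1
4 21 1 0.5
5 21 0 1.5
end

* Weighted fraction transitioning at each age
collapse (mean) frac_w = transition [aweight = w], by(age)

list age frac_w
```

<p>Running the same <code>collapse</code> without the weight clause gives the unweighted fractions, so the two can be compared side by side.</p>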
<p>Many thanks in advance for your help.</p>
<p>All the best,</p>
<p>Ashley</p>
Understanding Society User Support - Support #1159 (Resolved): Weights for cross-sectional and lo... | https://iserredex.essex.ac.uk/support/issues/1159 | 2019-03-13T11:33:48Z | Luca Bernardi | luca.bernardi@uab.cat
<p>Dear Understanding Society Support Team,</p>
<p>I am using data from adult main interviews from all waves. I am estimating the effect of depression on party identification. I am analyzing the data both cross-sectionally and longitudinally. However, I am unsure about which weight(s) to use, also given the low number of clinically depressed individuals. Is it correct that in both cases, since I am using data from more than one wave, I should use a longitudinal weight? Also, according to the User Guide, the definition of the cross-sectional population represented changes in Wave 6. If this complicates the issue, I have no problem with analyzing data only from Wave 1 to Wave 5. Could you please give me some recommendations?</p>
<p>Many thanks and best wishes,<br />Luca</p>
Understanding Society User Support - Support #985 (Resolved): Weights for pooled cross-section ov... | https://iserredex.essex.ac.uk/support/issues/985 | 2018-06-22T10:51:51Z | Nhat An Trinh
<p>Hello,</p>
<p>Although this issue has already been discussed a couple of times, I would like to raise the selection and use of the appropriate weights when pooling across all waves of Understanding Society once again, to avoid any mistakes. I very much appreciate the guidance that has been provided so far, but I haven't found a clear answer to my question and would thus be extremely grateful if someone could help me out.</p>
<p>For my analysis of intergenerational social mobility across labour market entry cohorts, I am using all waves including all samples of Understanding Society in a pooled cross-section. Obviously, I have dropped all duplicates as I want to have each observation only once in my dataset and take the first interview in which the individual has indicated both her first occupation and year of leaving school/further education as my observation of interest. In line with [<a class="issue tracker-3 status-5 priority-4 priority-default closed" title="Support: weights for pooled cross-sections over waves (a)-(f) (Closed)" href="https://iserredex.essex.ac.uk/support/issues/758">#758</a>], I have constructed the individual cross-sectional weight as follows:</p>
<p>gen xweight = .</p>
<p>replace xweight = a_indpxus_xw if wave == 1<br />local w = 2<br />foreach x in b c d e {<br /> replace xweight = `x'_indpxub_xw if wave == `w'<br /> local ++w<br />}<br />replace xweight = f_indpxui_xw if wave == 6<br />replace xweight = g_indpxui_xw if wave == 7</p>
<p>Is this the correct way of selecting the cross-sectional weights? And do I need to do anything else such as rescaling to correctly apply them for my pooled cross-sectional analysis (i.e. calculating social mobility rates and proportions of class of origin and destination by labour market entry cohorts)?</p>
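<p>If re-scaling does turn out to be needed, one simple form it could take is to normalise the pooled weight so that it averages one within each wave. The sketch below illustrates that form on made-up data. It is an assumption about what re-scaling might mean here, not the documented procedure, and the names <code>wmean</code> and <code>xweight_rs</code> are hypothetical.</p>

```stata
* Toy pooled file (hypothetical): one row per person, with the wave of the
* retained interview and the cross-sectional weight taken from that wave.
clear
input wave xweight
1 0.5
1 1.5
2 2
2 4
2 6
end

* Mean weight within each wave
bysort wave: egen double wmean = mean(xweight)

* Rescaled weight: averages one within every wave
generate double xweight_rs = xweight / wmean

* Check the count, mean and sum of both weights by wave
tabstat xweight xweight_rs, by(wave) statistics(n mean sum)
```

<p>With this normalisation each wave contributes in proportion to its number of retained cases rather than to the scale of its original weights.</p>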
<p>Thank you very much!</p>
<p>Nhat An</p>