Understanding Society User Support: Issues https://iserredex.essex.ac.uk/support/ 2024-03-26T16:09:20Z
Redmine Support #2077 (Feedback): Using income variables https://iserredex.essex.ac.uk/support/issues/20772024-03-26T16:09:20ZMhairi Webster
<p>Hello,</p>
<p>I would like to use the derived income variables (w_fimnnet_dv). Could you let me know which access level they are available under on the UK Data Service? They don't appear to be available in the dataset I am using (Understanding Society: Waves 1-8, 2009-2017 and Harmonised BHPS: Waves 1-18, 1991-2009. 11th Edition. UK Data Service. SN: 6614, <a class="external" href="http://doi.org/10.5255/UKDA-SN6614-12">http://doi.org/10.5255/UKDA-SN6614-12</a>).</p>
<p>Many thanks, <br />Mhairi Webster</p> Support #2076 (Feedback): Issues with xx_hadcvvac variables in COVID-19 data collectionhttps://iserredex.essex.ac.uk/support/issues/20762024-03-13T21:01:15ZLaura L
<p>Good evening,</p>
<p>I am currently analysing data from the <em>xx_indresp_w</em> datasets of the COVID-19 data collection, specifically from wave 9 (ci), wave 8 (ch) and wave 7 (cg). According to the documentation, the <em>xx_hadcvvac</em> questions (about having received the COVID-19 vaccine in each survey wave) should be asked of respondents who have not already answered that they received 1 or 2 doses of vaccine in previous months (answer codes 1 and 2). However, by cross-tabulating the answers to the <em>xx_hadcvvac</em> questions for waves 7 and 9 for respondents present in waves 9 and 7 (left-joining the datasets by respondent ID <em>pidp</em>, i.e. matching all respondents in wave 9 with those who were also in wave 7):</p>
<pre><code>table(ci_hadcvvac = wave_9$ci_hadcvvac, cg_hadcvvac = wave_9$cg_hadcvvac)</code></pre>
<p>with <em>wave_9</em> the left-joined dataset, I obtain the following table:</p>
<pre><code>           cg_hadcvvac
ci_hadcvvac   -9   -8   -2    1    2    3    4
         -8    0   10    0  133    9  492 4835
         -2    2    0    2    0    0    0    4
          1    0    0    0    4    1    1  133
          2    0    3    1 <strong>1663  116</strong>   36 2538
          3    0    0    0    0    0    1    5
          4    0    0    0    2    0    3  322</code></pre>
<p>As you can see from the numbers in bold (taken as examples), some respondents who reported being vaccinated in wave 7 appear to have been asked the question again in wave 9. Am I missing some information?</p>
<p>Thank you very much in advance for the support.</p>
<p>Best regards, <br />Laura</p> Support #2075 (Feedback): Using UKHLS to look at trends across calendar months https://iserredex.essex.ac.uk/support/issues/20752024-03-13T15:36:15ZJames Laurence
<p>Hi there,</p>
<p>I am interested in looking at calendar month trends in whether someone wants to move home or not (which is available in every wave): lkmove. Ideally, I would like to look at trends using all waves (1-13). However, if it is easier to look at trends from some other start point, e.g. 2016 or 2017, then I am flexible. I am also flexible as to whether the BHPS sample is included or not. This will be cross-sectional analysis, so I hope to treat each calendar month as a cross-section (I won’t be doing any longitudinal analysis).</p>
<p>I have been reading the helpful notes on ‘Running analysis on a calendar year or month’ (<a class="external" href="https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/how-to-use-weights-analysis-guidance-for-weights-psu-strata/">https://www.understandingsociety.ac.uk/documentation/mainstage/user-guides/main-survey-user-guide/how-to-use-weights-analysis-guidance-for-weights-psu-strata/</a>). However, I have some questions and would like to check whether what I have done so far looks right.</p>
<p>I have been using the w_month and wave variables to generate a new date variable of year-month. To capture calendar year, I have used the wave and w_month variables in the following manner:</p>
<pre><code>gen year = 2009 if wave==1 & (month>0 & month<13)
replace year = 2010 if wave==1 & (month>12 & month<25)
replace year = 2010 if wave==2 & (month>0 & month<13)
replace year = 2011 if wave==2 & (month>12 & month<25)
replace year = 2011 if wave==3 & (month>0 & month<13)
…
replace year = 2021 if wave==13 & (month>0 & month<13)
replace year = 2022 if wave==13 & (month>12 & month<25)</code></pre>
<p>To measure calendar month, I have recoded the w_month variable, combining the two monthly measures into one. So, in the w_month variable, it tells us whether someone was sampled in January in the year 1 sample or January in the year 2 sample. I’ve now combined these into a single category of whether someone was sampled in January. For example, ‘jan yr1’ and ‘jan yr2’ are now just ‘jan’; ‘feb yr1’ and ‘feb yr2’ are now just ‘feb’, etc.</p>
<p>With these new calendar year and calendar month variables, I have now created a new measure of calendar year-month, which looks like this (I hope this is correct so far):</p>
<pre><code>2009 Jan = 1
2009 Feb = 2
2009 Mar = 3
2009 Apr = 4
2009 May = 5
2009 June = 6
2009 July = 7
…
2022 June = 162
2022 July = 163
2022 Aug = 164
2022 Sep = 165
2022 Oct = 166
2022 Nov = 167
2022 Dec = 168</code></pre>
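<p>The mapping described above can be sketched compactly as follows. This is only an illustration in Python, not official syntax: the function name is hypothetical, and it takes at face value the assumption in this post that sample months 1-12 of wave w fall in calendar year 2008 + w and months 13-24 in the following year.</p>

```python
# Illustrative sketch only. Assumes, as in the post above, that sample months
# 1-12 of wave w fall in calendar year 2008 + w and months 13-24 in 2009 + w.
# The function name is hypothetical.

def year_month_index(wave, sample_month):
    """Return (calendar_year, calendar_month, index) with 2009 Jan = 1."""
    if not 1 <= sample_month <= 24:
        raise ValueError("sample_month must be between 1 and 24")
    # Months 13-24 belong to the second calendar year of the wave.
    calendar_year = 2008 + wave + (sample_month - 1) // 12
    # Collapse 'jan yr1' and 'jan yr2' into a single calendar month 1..12.
    calendar_month = (sample_month - 1) % 12 + 1
    index = (calendar_year - 2009) * 12 + calendar_month
    return calendar_year, calendar_month, index


if __name__ == "__main__":
    print(year_month_index(1, 1))
    print(year_month_index(13, 24))
```

<p>Under these assumptions the first call corresponds to 2009 Jan and the second to the final month of 2022, which matches the two ends of the list above.</p>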
<p>I understand that whatever weight I choose to use, I need to correct it because Northern Ireland is only sampled in issue months 1-12 (and not 13-24). Therefore, I will apply the following adjustment to the weight (gen adj=1, replace adj=0.5 if w_country==4, gen weight=w_xxxyyus_lw*adj), as outlined in the online notes.</p>
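<p>For readers not using Stata, the adjustment quoted in the previous paragraph amounts to the following. This is a minimal Python sketch; the function name is hypothetical, and the country code 4 and the factor 0.5 are taken directly from the text above.</p>

```python
# Illustrative sketch of the adjustment quoted above: halve the weight for
# respondents with country code 4, leave all other weights unchanged.
# The function name is hypothetical.

def adjusted_weight(base_weight, country):
    """Return base_weight * 0.5 when country == 4, else base_weight."""
    adj = 0.5 if country == 4 else 1.0
    return base_weight * adj


if __name__ == "__main__":
    print(adjusted_weight(1.2, 4))
    print(adjusted_weight(1.2, 1))
```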
<p>However, where I’ve become a little lost is what weights to initially use. In the notes, it states due to exceptions in sample selection ‘we recommend use of the us_lw weight in analysis’. Given my intention to look at calendar months up to wave 13, does this mean I should use the m_indpxus_lw weight? Is this the case, even if I just want to look at the data cross-sectionally (treat every calendar month as a cross-sectional picture of lkmove)? Because it seems that if I use m_indpxus_lw then it substantially reduces the sample size (due to these longitudinal weights requiring someone to have participated in every wave). Is it possible to use the cross-sectional weights for my aims, while excluding the BHPS and IEMB, as is suggested that one needs to do for this kind of calendar month analysis in the online notes? Or, do I need to use longitudinal weights for my intended analysis?</p>
<p>I was also just trying to get my head around the issue of scaling discussed in the online notes: ‘The weights provided are not designed directly for pooling data across waves as they are scaled to a mean value of 1.0 within each wave, and therefore produce different weighted sample sizes in each wave’, under the section ‘Pooling data from different waves for cross-sectional analysis.’ Firstly, I just wanted to confirm this applies to my case of doing monthly trends?</p>
<p>And secondly, if so, from what I can see, the syntax kindly provided is intended to produce an accurate weight to look at the variable jbstat for the calendar year 2011, using months 13-24 of wave 2 and 1-12 of wave 3. At the end, we get the weight variable weight2011, to use for weighting calendar year 2011. In my situation, I would like to do a longer running trend of values of lkmove by months. Would I need to create these weights for each calendar year I look at? So, for 2014, I would need to create a new cross-sectional weight using e_indpxub_xw and f_indpxub_xw (waves 5 and 6). For 2015, I would need to create a new cross-sectional weight using f_indpxub_xw and g_indpxub_xw (waves 6 and 7). For 2016, I would need to create a new cross-sectional weight using g_indpxub_xw and h_indpxub_xw (waves 7 and 8). And to follow this all the way to my last calendar year. Then, to look at monthly trends, treating the data as pooled cross-sectional, I would have my data in long-format and have a new weight variable made up of all these new calendar year weights I’ve created?</p>
<p>I was also wondering if it would be possible to include monthly lkmove data from the calendar year 2022 (using wave 13 of the UKHLS mainstage). As I understand things, previous calendar years (e.g., 2018) are composed of samples from two waves (waves 9 and 10 of the mainstage). However, the calendar year 2022 is only composed of the sample from wave 13. Is it still possible to look at calendar month trends in lkmove for 2022? If so, would I need to make other sample restrictions to the other calendar years, for example, drop the IEMB sample from the trends? And would I need to make other adjustments to the weights? Or is it not yet possible to look at monthly trends for 2022 until wave 14 comes out? I think this is mentioned in the online notes: ‘The analysis sample is only representative when all 24 monthly samples are combined in equal measure.’ Does this point refer to my question?</p>
<p>I am also interested in potentially looking at quarterly trends (Jan-Mar, Apr-Jun, etc.), instead of monthly trends (using the x_quarter variable). To do so, can I take the same approach as above? So, create a new time variable which is years divided into quarters (e.g., 2013 Jan-Mar, 2013 Apr-Jun, 2013 Jul-Sep, 2013 Oct-Dec, 2014 Jan-Mar, 2014 Apr-Jun … 2022 Jul-Sep, 2022 Oct-Dec). Do I need to do anything different with the weights?</p>
<p>I hope this all makes sense.</p>
<p>Thanks so much in advance.</p>
<p>James</p> Support #2074 (In Progress): Longitudinal weights https://iserredex.essex.ac.uk/support/issues/20742024-03-09T16:03:06ZJoe Mattock
<p>Hi,</p>
<p>I'm conducting an analysis using waves 2, 3, 6 and 9 of Understanding Society, because the voteintent variable is only included in these waves. I would like to ask about the weighting procedure for this case. I am examining how an independent variable (gentrification, as measured by an index) affects voting intention at the LSOA level.</p>
<p>My understanding is that I need to take the longitudinal weight from the final wave used in my analysis and apply it to all respondents (i_indscub_lw - I believe). However, given that my dependent variable of interest is not observed in consecutive waves, I wanted to ask whether this principle applies in the same way.</p>
<p>I also wanted to ask how this weighting would be applied in practice. I am slightly confused about the order of things. For example, would you remove all wave-specific prefixes, merge LSOA indicators with the Understanding Society data, and then apply the relevant weight for each respondent?</p>
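<p>To make the question about ordering concrete, here is a minimal Python sketch of the mechanics as I currently understand them: stack the waves in long format, then attach the final-wave longitudinal weight to every row by person ID. The function and field names are hypothetical, dropping people with a missing or non-positive weight is my assumption, and whether this approach is appropriate for non-consecutive waves is exactly what I am asking.</p>

```python
# Illustrative sketch only. Field and function names are hypothetical.
# Assumption: people without a positive final-wave weight are excluded.

def attach_final_wave_weight(rows, final_weights):
    """Attach each person's final-wave weight to all of their rows.

    rows: list of dicts in long format, each with a 'pidp' key.
    final_weights: dict mapping pidp -> longitudinal weight from the last wave.
    """
    weighted = []
    for row in rows:
        weight = final_weights.get(row["pidp"])
        if weight is None or weight <= 0:
            continue  # not in the weighted longitudinal sample
        weighted.append({**row, "weight": weight})
    return weighted


if __name__ == "__main__":
    long_rows = [
        {"pidp": 1, "wave": 2, "voteintent": 1},
        {"pidp": 1, "wave": 9, "voteintent": 2},
        {"pidp": 2, "wave": 2, "voteintent": 1},
    ]
    print(attach_final_wave_weight(long_rows, {1: 0.8, 2: 0.0}))
```

<p>In this toy example only the two rows for person 1 are kept, both carrying the same weight.</p>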
<p>Much appreciated,</p>
<p>Joe</p> Support #2073 (Feedback): Data filehttps://iserredex.essex.ac.uk/support/issues/20732024-03-08T16:20:51ZLuisa Edwards
<p>What is the date period for the Wave 13 indresp file, and was this post-Covid?<br />I.e. would it be possible to compare a monthly COVID data set to the whole of Wave 13, looking at the post-Covid environment?</p> Support #2072 (Feedback): tracking spouse after the household dissolutionshttps://iserredex.essex.ac.uk/support/issues/20722024-03-06T20:57:10ZSeok Woo Kwonkwonsw@gmail.com
<p>Hi, I am wondering if the study follows ALL household members after household dissolutions and not just the household head.<br />Thanks for your help in advance.</p> Support #2071 (Feedback): Choosing weight where some variables are self-completion and others are...https://iserredex.essex.ac.uk/support/issues/20712024-03-06T17:23:35ZMolly Rowe
<p>Hi,</p>
<p>I am trying to select the correct weight, and am having some difficulty with what to choose for the instrument/questionnaire part. I believe that some of my variables were part of the self-completion section of the questionnaire, while others were obtained by interview (if that's possible?). Therefore, I'm not sure whether to select self-completion (sc) or interview (in) for the Yy part of the weight name.</p>
<p>Any help would be greatly appreciated!</p> Support #2070 (Feedback): Creating Chronology when using COVID-19 and main panel datahttps://iserredex.essex.ac.uk/support/issues/20702024-03-06T11:29:09ZIsaac Hance
<p>I want to create a variable in Stata that allows me to ensure I am viewing each individual's responses in order, when using the COVID panel and main panel merged together. Because of the overlap in waves (for example, some COVID waves fall within the data collection period of multiple main survey waves), this is complex. I am planning to combine _intdatem and _intdatey, but cannot seem to do it in a way that lets me sort.</p> Support #2069 (Feedback): Match children information with parental informationhttps://iserredex.essex.ac.uk/support/issues/20692024-03-06T08:17:49ZGiovanni Greco
<p>Good morning Users.<br />For my Master's thesis, I am using data on children, and I need parental information (their household ID and their household income) to be matched into the children's data. The family matrix is probably of great help, but I am struggling to figure out how to do it in Stata.<br />Has anyone had a similar challenge?<br />Thank you in advance.</p> Support #2068 (Feedback): Wave 16 biological datahttps://iserredex.essex.ac.uk/support/issues/20682024-03-05T12:49:48ZEleanor Winpennyew470@cam.ac.uk
<p>Hi,<br />Do you have any estimate when the wave 16 biological data is likely to be released? This would be really helpful to know whether I can include it in a grant application.</p> Support #2067 (Feedback): data accesshttps://iserredex.essex.ac.uk/support/issues/20672024-03-04T17:02:55Zmike polkey
<p>Dear UKHLS</p>
<p>I am writing from the Royal Brompton Hospital (now part of GSTT) and Imperial College in London.</p>
<p>We would very much like to access the UKHLS database to extend a research theme that began 2 years ago. In the first stage we developed a model which predicts the likelihood of having obstructive sleep apnoea. In the next stage we want to relate it to economic activity. HSE data only gives this at an occupation or industry level, but our reading of your published papers is that you have individual-level data. The other data we would need are PSQI, height, weight, history of cardiovascular disease, age, gender and occupation.</p>
<p>I reviewed the FAQ; the lead student here did his MSc with us at Imperial but has now returned to Japan, so would not be able to analyse the data in the UK.</p>
<p>Please do reply by phone if easier; 07801553468</p> Support #2066 (Resolved): Code creator E-Mailhttps://iserredex.essex.ac.uk/support/issues/20662024-03-04T16:25:06ZLeo Haentjes
<p>Hello, after multiple tries and hours of waiting time I still do not get an e-mail containing my code when I try to use the code creator. I would be very thankful if you could look into it and help me solve the problem.<br />Best regards</p> Support #2065 (Feedback): How to manage longitudinal data analysis after excluding sample based o...https://iserredex.essex.ac.uk/support/issues/20652024-03-04T15:09:36ZMarina Kousta
<p>I am conducting a (longitudinal) diff-in-diff analysis for a policy evaluation where the date of policy introduction is important. I have a few questions below:</p>
<p>1) As my date of interest falls in the middle of a single wave, I could split up wave X into two parts indicating the before and after. Is this enough, so that I can use only a single wave for the analysis, or would it be preferable to also use more waves to represent the before and after treatment years more accurately? (I am asking because I read the following on your website: "As some samples are fielded in the first 12 months (BHPS and General Population-Northern Ireland samples), some in months 13-24 (IEMB sample) and some across all 24 months (General Population-Great Britain and EMB samples), just using data from the same wave to compare the two consecutive years will result in comparing different samples. Similarly, just using data from year 1 or year 2 of a wave to conduct cross-sectional analyses of that year will result in analysing samples that are not-representative. So, to correctly do these types of analyses, data from two waves need to be combined. For example, for 2019, use data from year 2 of Wave 10 and year 1 of Wave 11.")</p>
<p>2) To split up any given wave into two separate waves etc, which variable would you recommend? I have seen many variables in the dataset indicating the month of interview, year, etc but there are also others relating to the sample, but I am unsure which variable would be the most accurate? Moreover, I am confused as some waves suggest they may extend across three calendar years but when I look at the year of interview variable, it only reflects year 1 and year 2, there is no mention of year 3.</p>
<p>3) Which weights would you recommend using in this case?</p>
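<p>To illustrate what I mean by splitting a wave in question 2), here is a minimal Python sketch that flags each interview as falling before or on/after the policy date. The function name, the policy date and the idea of using separate year, month and day components are all hypothetical placeholders; the only thing taken from the data is that negative codes denote missing values.</p>

```python
# Illustrative sketch only. The function name and the example policy date are
# hypothetical; negative codes are treated as missing date components.
from datetime import date


def interviewed_on_or_after(int_year, int_month, int_day, policy_date):
    """Return True/False, or None when any date component is missing."""
    if min(int_year, int_month, int_day) < 1:
        return None  # negative (or zero) codes are not valid date parts
    return date(int_year, int_month, int_day) >= policy_date


if __name__ == "__main__":
    policy = date(2016, 4, 1)  # hypothetical policy introduction date
    print(interviewed_on_or_after(2016, 5, 10, policy))
    print(interviewed_on_or_after(2016, 3, 31, policy))
    print(interviewed_on_or_after(-9, 3, 1, policy))
```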
<p>Many thanks in advance for any help you can provide.</p>
<p>Best,<br />Marina</p> Support #2064 (Feedback): calendar year dataset - longitudinal analysishttps://iserredex.essex.ac.uk/support/issues/20642024-03-04T14:44:42ZMarina Kousta
<p>I am conducting an analysis for which I need to use the provided calendar year datasets. I have the following questions:<br />1) You state on the website that the calendar year datasets are not intended for longitudinal analysis. Why is that, and is there a way to overcome this? (I ask because I want to conduct a longitudinal analysis.)<br />2) Do you also recommend avoiding longitudinal analysis when we manually construct the calendar year datasets ourselves (by merging the waves)?<br />3) If I go ahead with either 1) or 2), would you recommend against using the provided longitudinal weights?</p>
<p>Many thanks in advance for your help.</p>
<p>Best wishes,<br />Marina</p> Support #2063 (Feedback): List of Validated Measures or Scales used in the Study? https://iserredex.essex.ac.uk/support/issues/20632024-03-04T14:30:12ZLuke DeCoste
<p>Hi, I'm wondering if there is a list of validated measures that are used in the study? Specifically I'm wondering how to identify questions that should be grouped together to produce a specific construct.</p>
<p>e.g. One variable is the 7-item short version of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS, see Tennant et al., 2007).</p>
<p>Are there other scales that have been assembled intentionally that we can somehow identify? e.g. I'm using a number of variables related to sleep (hours of actual sleep (hrs); ghq: loss of sleep; quality of sleep overall; cannot get to sleep within 30 mins; wake up in the night). Is there research behind the use of these variables that can be found? I have similar questions regarding variables related to sleep, exercise, etc.</p>
<p>I'm wondering where I find a list of scales used in the study.</p>
<p>Thanks.</p>