Understanding Society User Support: Issues | https://iserredex.essex.ac.uk/support/ | 2020-01-11T08:29:14Z
Understanding Society User Support - Support #1298 (Resolved): Matching youth data to parent data... | https://iserredex.essex.ac.uk/support/issues/1298 | 2020-01-11T08:29:14Z | Paul Downward
<p>Dear colleague,<br />I wonder if you can help me with the above. I have used the USS before to merge and match waves and, following your online course, to match adults within a household. I am now experimenting with matching individuals from the youth file to, say, their mothers. Whilst I can match the files, I end up with very small matched samples and wonder if I am doing something silly.</p>
<p>To illustrate with some reduced files (I have been saving and replacing the files as I go to check each step, as I have learned the syntax while working on specific projects):</p>
<p>If I create a 'mum' file with a few variables:<br />use "C:\ukhls_w8\h_indresp.dta", clear<br />keep if h_sex==2<br />save "C:\ukhls_w8\mum data.dta", replace<br />use "C:\ukhls_w8\mum data.dta", clear<br />keep pidp h_sex h_hidp h_scsf1 h_mnspid h_pno h_childpno h_intdaty_dv h_dvage<br />drop if h_mnspid==-8<br />rename (h_sex h_hidp h_scsf1 h_mnspid h_pno h_childpno h_intdaty_dv h_dvage) (msex mhidp mscsf1 mnspid mpno mchildpno mintdaty_dv dvage)<br />save "C:\ukhls_w8\mum data.dta", replace</p>
<p>I then created a youth file with a couple of variables:</p>
<p>use "C:\ukhls_w8\h_youth.dta", clear<br />keep pidp h_mnspid h_ypsrhlth h_hidp h_dvage<br />drop if h_mnspid==-8<br />rename (h_mnspid h_ypsrhlth h_hidp h_dvage) (mnspid yypsrhlth yhidp ydvage)<br />save "C:\ukhls_w8\youth data.dta", replace</p>
<p>I then tried an m:1 merge:<br />use "C:\ukhls_w8\youth data.dta", clear<br />merge m:1 mnspid using "C:\ukhls_w8\mum data.dta" <br />save "C:\ukhls_w8\Total W8.dta", replace</p>
<p>I get the message<br />variable mnspid does not uniquely identify observations in the using data</p>
<p>So, I checked the duplicates in the mum file and I get</p>
<p>duplicates report, mnspid</p>
<p>Duplicates in terms of all variables</p>
<pre><code>--------------------------------------
   copies | observations       surplus
----------+---------------------------
        1 |         2738             0
--------------------------------------</code></pre>
<p>and in the youth file I get</p>
<p>Duplicates in terms of all variables</p>
<pre><code>--------------------------------------
   copies | observations       surplus
----------+---------------------------
        1 |         3174             0
--------------------------------------</code></pre>
<p>So there doesn't seem to be an issue with duplicates but, if I repeat the steps above and this time remove the duplicates by force</p>
<p>e.g. in the mum file<br />. duplicates drop mnspid, force</p>
<p>Duplicates in terms of mnspid</p>
<p>(413 observations deleted)</p>
<p>and I leave any duplicates in the youth file as I assume it makes sense that there are duplicates of mnspid as a parent can be shared.</p>
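<p>Note that the reports shown are headed "Duplicates in terms of all variables", so they count rows that are identical on every variable rather than repeats of mnspid alone. That may be why they find no duplicates while merge still complains. A minimal sketch of a key-specific check, assuming the 'mum' file saved above; the variable name dup_mnspid is illustrative:</p>
<pre><code>use "C:\ukhls_w8\mum data.dta", clear
* Count repeats of the merge key only (variable name given directly, no comma)
duplicates report mnspid
* Flag rows that share an mnspid value with at least one other row
duplicates tag mnspid, generate(dup_mnspid)
* Inspect the flagged rows grouped by the shared key
sort mnspid pidp
list pidp mnspid mpno if dup_mnspid != 0, sepby(mnspid)</code></pre>
<p>Seeing which women share an mnspid value can help show what the variable actually identifies in this file before any rows are dropped by force.</p>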
<p>If I now follow the m:1 merge above I get </p>
<pre><code>    Result                           # of obs.
    -----------------------------------------
    not matched                         4,458
        from master                     2,598  (_merge==1)
        from using                      1,860  (_merge==2)

    matched                               576  (_merge==3)
    -----------------------------------------</code></pre>
<p>This seems to be a very small subset of cases. Am I doing the right thing here? Any help would be greatly appreciated. My plan was to do this for each wave and then append the waves.</p>
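<p>One possible reading, offered as an assumption to check against the documentation rather than a confirmed diagnosis: in the youth file, h_mnspid holds the pidp of the young person's natural mother, whereas in the 'mum' file h_mnspid holds the pidp of each woman's own mother. If so, merging on mnspid from both files links youths to women who share the same mother, not to their mothers. A sketch that instead keys the mother file on her own pidp, renamed to mnspid. The file name "mum by pidp.dta", the m-prefixed names, and the treatment of all values from -9 to -1 as missing codes are assumptions:</p>
<pre><code>* Mother file: one row per adult woman, keyed on her OWN pidp
use "C:\ukhls_w8\h_indresp.dta", clear
keep if h_sex == 2
keep pidp h_hidp h_scsf1 h_dvage
* Rename pidp so it lines up with the youth's pointer to the mother
rename (pidp h_hidp h_scsf1 h_dvage) (mnspid mhidp mscsf1 mdvage)
* Stop with an error if the key is not unique in the using file
isid mnspid
save "C:\ukhls_w8\mum by pidp.dta", replace

* Youth file: several siblings can point to the same mother (hence m:1)
use "C:\ukhls_w8\h_youth.dta", clear
keep pidp h_mnspid h_ypsrhlth h_hidp h_dvage
* Assumption: negative values are missing codes (no mother identified)
drop if inrange(h_mnspid, -9, -1)
rename (h_mnspid h_ypsrhlth h_hidp h_dvage) (mnspid yypsrhlth yhidp ydvage)
merge m:1 mnspid using "C:\ukhls_w8\mum by pidp.dta", keep(master match)
tab _merge</code></pre>
<p>Under this sketch, youths whose mother did not give an adult interview at wave 8 would remain unmatched (_merge==1).</p>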
<p>Thank you.</p>
<p>Paul.</p>
Understanding Society User Support - Support #1114 (Resolved): Tracking health conditions | https://iserredex.essex.ac.uk/support/issues/1114 | 2018-12-04T18:56:25Z | Paul Downward
<p>Hi,</p>
<p>I am currently working with the harmonised BHPS/US data and looking to chart various health conditions. For example, in the BHPS waves the variable hlprbh identifies whether the respondent has diabetes, and the share reporting it grows from just under 2% in wave 1 to about 5% in wave 18. As I move into the US data, the condition is measured in Wave a as hcond14 and the rate is just above 5%, which is fine.</p>
<p>However, from wave b the incidence drops to just over 1% using hcond14n, as hcond is not asked. I think from the documentation that hcond14n identifies newly diagnosed people. So, to get the total incidence, can I add these individuals to those identified in wave a? Moreover, when I consider wave c onwards, it seems that hcond14 and hcond14n are both asked: hcond14 appears to be for people never asked before (which I suppose are new entrants to the survey) and hcond14n for previously interviewed people who are newly diagnosed. In these cases, can I add both of these 'increments' in incidence to those from wave a who have already indicated the presence of the condition to get the total incidence?</p>
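<p>The cumulative logic described above can be sketched as a person-level "ever reported" flag. This is only an illustration, not confirmed guidance. It assumes a long-format file with one row per person and wave, a numeric wave variable, wave prefixes already stripped, and that a value of 1 means the condition was mentioned. The file name and the names wave, anydiab and everdiab are hypothetical:</p>
<pre><code>* Hypothetical long file containing pidp, wave, hcond14 and hcond14n
use "harmonised_long.dta", clear
* 1 if diabetes is reported at this wave through either question
gen byte anydiab = (hcond14 == 1) | (hcond14n == 1)
* Running total within person over waves: 1 from the first report onwards
bysort pidp (wave): gen byte everdiab = (sum(anydiab) != 0)
* Share of interviewed respondents who have ever reported the condition, by wave
tabstat everdiab, by(wave) statistics(mean n)</code></pre>
<p>A flag of this kind only carries a report forward for people observed at the wave where they first reported it, so attrition and new entrants would still affect the wave-by-wave shares.</p>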
<p>Thank you for any assistance.</p>
<p>Paul.</p>
Understanding Society User Support - Support #944 (Resolved): Lowest level reliability with spati... | https://iserredex.essex.ac.uk/support/issues/944 | 2018-03-23T09:39:01Z | Paul Downward
<p>I am currently supervising a PhD student who is very interested in exploring the environmental influences on individual well-being and, consequently, in analysing how spatial features might influence it. We would naturally apply to use the Special Licence access data, but could I ask an initial question: what is the lowest spatial level at which the study is designed to give reliable inferences? That is, as we drill down, are there spatial gaps in the sampling?</p>