I've been looking at summary statistics for fimnlabgrs_dv and have noticed that individuals with no top-coding (fimnlabgrs_tc) can have higher incomes than those with top-coding. How is this the case? Is fimnlabgrs_dv split into components which are top-coded separately before being aggregated?
Updated by Alita Nandi about 1 year ago
- Status changed from New to In Progress
- Assignee set to Leilah Plant-Tchenguiz
- % Done changed from 0 to 50
Could you please tell us which wave you are looking at? I did not find this issue when I looked at Wave 1 data.
The mean of a_fimnlabgrs_dv for those whose incomes were top-coded (a_fimnlabgrs_tc=1) was higher than for those whose incomes were not top-coded (a_fimnlabgrs_tc=0):
use pidp a_fimnlabgrs_dv a_fimnlabgrs_tc using "a_indresp", clear
mvdecode _all, mv(-9/-1)
tabstat a_fimnlabgrs_dv, by(a_fimnlabgrs_tc)
Understanding Society User Support Team
Updated by Leilah Plant-Tchenguiz about 1 year ago
I used the following Stata code to summarise by wave and topcoding: bysort wave fimnlabgrs_tc: su fimnlabgrs_dv
I then saw that for wave 1, the maximum labour income (fimnlabgrs_dv) of the non-top-coded (fimnlabgrs_tc=0) was higher than the minimum of the top-coded (fimnlabgrs_tc=1).
Updated by Alita Nandi 12 months ago
- % Done changed from 50 to 80
Our income team has pointed out that fimnlabgrs_dv is the sum of several income components, and it is these components that are top-coded, not the total. So you can get the pattern you found if someone reports several income sources that are high but not all of them are top-coded. The top-coded flag is 1 if at least one component is top-coded.
For example, suppose the rule is to top-code all income components at £100, and person a and person b report:
Person a: income component1 = £102
Person b: income component1 = £99 and income component2 = £50
Person a gets top-coded on their single reported income source and has total income £100 and top-coded flag=1
Person b doesn't get top-coded on either source and has total income £149 and top-coded flag=0
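The example above can be reproduced with a few lines of Stata. This is only a toy sketch: the threshold of 100 and the names comp1, comp2, tc_flag and total are made up for illustration and are not the actual thresholds or variables used to derive fimnlabgrs_dv.

```stata
* Toy illustration only: the cap of 100 and all variable names here are
* hypothetical, not the real Understanding Society derivation.
clear
input str1 person comp1 comp2
"a" 102 0
"b" 99 50
end

local cap = 100

* Flag anyone with at least one component above the cap,
* then top-code each component separately
gen byte tc_flag = (comp1 > `cap') | (comp2 > `cap')
replace comp1 = `cap' if comp1 > `cap'
replace comp2 = `cap' if comp2 > `cap'

* The total is the sum of the already top-coded components
gen total = comp1 + comp2

list person total tc_flag, noobs
```

Person a ends up with total 100 and tc_flag 1, while person b has total 149 and tc_flag 0, so the non-top-coded case has the higher total.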
Hope this helps.
I think the main thing to say is that they are working with a variable which is the sum of variables that are top-coded. So you can get the pattern they observe if someone reports several income sources which are high but not all top-coded.
Suppose we top-code all income sources at 100, and person a and person b report the following:
Person a: 102
Person b: 99 and 50
Person a gets top-coded on their single reported income source and has total income 100
Person b doesn't get top-coded on either source and has total income 149