Support #2255
Open
Request for Feedback on Weighting Strategy for Longitudinal Event History Analysis
Description
Dear Understanding Society Support Team,
I am writing to seek your advice on the weighting strategy I am using for a longitudinal event history analysis based on Understanding Society data. I have consulted the available documentation and discussed the approach with other researchers, but given the specific structure of my data and research design, I would appreciate your expert opinion.
My data setup:
I have constructed a long-format monthly panel dataset, where each respondent appears in multiple rows. I follow individuals from their entry into the sample until one of the following:
- they experience the event (first birth),
- they exit the reproductive age window, or
- they drop out of the panel (non-intermittent response only).
As a result, different respondents exit the analysis at different waves. I observed that many weights are zero, which I assume is because those respondents are not original sample members (OSMs).
My weighting strategy:
I use the longitudinal individual weights (_indscus_lw) from each wave (b, c, ..., n).
For each wave, I compute the mean weight across all individuals with a non-missing value. I use this mean to rescale the weights:
prefix_longitudinalweight = prefix_indscus_lw / mean(prefix_indscus_lw).
This ensures that the rescaled weights have a mean of 1. Since my dataset is in monthly long format, each individual appears multiple times, once for each month they are observed, and their original weight is repeated across those rows. However, I think that because I compute a mean, this repetition does not affect the validity of the rescaling, as each individual's weight contributes proportionally to the average.
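To make the rescaling step concrete, here is a minimal Python sketch. It assumes the weights are held one entry per person, with `None` marking a missing weight. The person identifiers, the weight values and the `months` counts are invented for illustration and are not taken from the actual data.

```python
from statistics import mean

def rescale_to_mean_one(person_weights):
    """Divide each weight by the mean over persons with a non-missing weight."""
    observed = [w for w in person_weights.values() if w is not None]
    m = mean(observed)
    return {pid: (None if w is None else w / m) for pid, w in person_weights.items()}

# Hypothetical wave-b weights, one entry per person (None = missing).
wave_b = {101: 2.0, 102: 4.0, 103: None}
rescaled_b = rescale_to_mean_one(wave_b)

# Hypothetical number of monthly rows each person contributes to the long file.
months = {101: 3, 102: 1}

# Mean over persons versus mean over monthly rows of the rescaled weight.
person_mean = mean(w for w in rescaled_b.values() if w is not None)
row_mean = sum(rescaled_b[p] * n for p, n in months.items()) / sum(months.values())
```

In this toy example the mean over persons is 1, while the mean over monthly rows is 5/6, because person 101 contributes three rows and person 102 only one. Whether the mean should be taken over persons or over rows in the real data is part of what I would like to confirm.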
I calculate the total rescaled weight for each wave by summing the rescaled weights:
prefix_totalweight = sum(prefix_longitudinalweight).
I then generate a constant variable per wave containing this total for all individuals.
I compute an average total weight across all waves: average_longitudinal.
I calculate a scaling factor for each wave:
prefix_scale = average_longitudinal / prefix_totalweight.
I apply the scaling factor to the rescaled weight:
prefix_weight_rescaled = prefix_scale * prefix_longitudinalweight.
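The cross-wave scaling steps above can be sketched as follows. This is an illustration under assumptions, not my actual code. It assumes the rescaled weights are stored one entry per person within each wave and that totals are summed over persons rather than over monthly rows. The wave labels and values are invented.

```python
def equalise_wave_totals(rescaled_by_wave):
    """Scale each wave's weights so every wave total equals the average total."""
    # prefix_totalweight: sum of rescaled weights within each wave
    totals = {wave: sum(w.values()) for wave, w in rescaled_by_wave.items()}
    # average_longitudinal: average of the wave totals
    average_total = sum(totals.values()) / len(totals)
    scaled = {}
    for wave, weights in rescaled_by_wave.items():
        scale = average_total / totals[wave]  # prefix_scale
        scaled[wave] = {pid: scale * w for pid, w in weights.items()}
    return scaled

# Hypothetical rescaled weights for two waves.
rescaled = {
    "b": {101: 0.5, 102: 1.5, 103: 1.0},  # total 3.0
    "c": {101: 0.8, 102: 1.2},            # total 2.0
}
final = equalise_wave_totals(rescaled)
wave_totals = {wave: sum(w.values()) for wave, w in final.items()}
```

After this step both toy waves sum to 2.5, the average of the original totals. If the totals are taken over the same persons used for the mean-1 rescaling, each wave's total is simply its number of persons with a non-missing weight, so the scaling factor reduces to the average count divided by the wave's count.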
Finally, for each respondent, I assign their weight based on the last wave in which they are observed (prior to their event or censoring).
For respondents who are only observed in wave 1, I use the weight from wave 2.
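The assignment rule can be sketched as below. It assumes wave 1 carries the prefix `a` and wave 2 the prefix `b`. The function name `assign_weight`, the identifiers and the weight values are hypothetical.

```python
def assign_weight(last_wave, weights_by_wave, pid):
    """Return the weight from the last observed wave; wave a falls back to wave b."""
    # Assumption: longitudinal weights start at wave b, so a person last
    # observed in wave a (wave 1) takes the wave b (wave 2) weight.
    wave = "b" if last_wave == "a" else last_wave
    return weights_by_wave.get(wave, {}).get(pid)

# Hypothetical final weights by wave and last observed wave per person.
weights = {"b": {101: 0.9, 102: 1.1}, "c": {101: 1.2}}
last_observed = {101: "c", 102: "a", 103: "c"}
assigned = {pid: assign_weight(w, weights, pid) for pid, w in last_observed.items()}
```

Here person 101 takes the wave c weight, person 102 (last observed in wave a) takes the wave b weight, and person 103 has no weight in wave c, so the result is `None` and that case has to be handled separately.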
I would be grateful if you could let me know whether this approach is methodologically sound, particularly in the context of a monthly, long-format Event History analysis with varying exit points across individuals.
Thank you in advance for your time and support. It is truly appreciated.
Best wishes,
Irene Frageri