Abstract: Length-biased sampling (LBS) arises when items are sampled with probability proportional to their values on a random variable of interest. For example, older units may be more likely to be sampled simply because they have been in service longer. The effect of this sampling bias on the mean is well known when the length-biased random variable, say Y, is observable.
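As a brief illustration of the well-known effect on the mean: if units are selected with probability exactly proportional to Y (an assumption made here for the sketch), the length-biased mean is E[Y^2]/E[Y] = E[Y] + Var(Y)/E[Y], which exceeds E[Y] whenever Y varies. The Python sketch below checks this on a finite population and by simulation; the function names and the exponential population are hypothetical choices for illustration, not part of the work described in this abstract.

```python
import random


def length_biased_mean(values):
    """Mean of one draw from `values` when each value y is picked with weight y."""
    total = sum(values)
    return sum(y * y for y in values) / total


def mean_plus_variance_ratio(values):
    """Equivalent form: E[Y] + Var(Y) / E[Y], using the population variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((y - mean) ** 2 for y in values) / n
    return mean + var / mean


def simulate(n=100_000, seed=0):
    """Draw an exponential(1) population, then resample it with weights equal to Y.

    Returns (plain mean, mean of the length-biased resample).
    The exponential population is an illustrative assumption.
    """
    rng = random.Random(seed)
    population = [rng.expovariate(1.0) for _ in range(n)]
    biased = rng.choices(population, weights=population, k=n)
    return sum(population) / n, sum(biased) / n


if __name__ == "__main__":
    ys = [1.0, 2.0, 3.0]
    print(length_biased_mean(ys), mean_plus_variance_ratio(ys))
    print(simulate())
```

For an exponential population with mean 1, the variance is also 1, so the length-biased mean should be close to 2, about twice the plain mean.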
A more difficult situation arises when Y is not observed, but another random variable, Z, is observed and is correlated with Y. This scenario arises in evaluating screening programs: screening identifies cases during the preclinical phase, whose duration is unobserved but is correlated with the clinical duration. Cases with longer preclinical durations are more likely to be screen-detected than those with shorter ones and may also have a better prognosis, irrespective of screening. Survival is further biased by the effects of lead time and overdiagnosis. We demonstrate the implications of these biases, propose a survival time model that incorporates them, and offer a robust approach to jointly estimating the components of survival, including extended benefit time, in screening programs. The approach is illustrated using data from six actual randomized cancer screening trials.
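The way length bias in an unobserved Y carries over to a correlated, observed Z can be sketched under a simplifying assumption: if selection probability is exactly proportional to Y, the mean of Z among selected units is E[YZ]/E[Y] = E[Z] + Cov(Y, Z)/E[Y]. The Python sketch below verifies that identity on paired values; the function names and the pairing of y (preclinical duration) with z (clinical duration) are hypothetical, and this is not the survival model proposed in the talk, which also accounts for lead time and overdiagnosis.

```python
def induced_mean(y, z):
    """Mean of z among units selected with weight y (y itself is never reported)."""
    return sum(a * b for a, b in zip(y, z)) / sum(y)


def mean_plus_covariance_ratio(y, z):
    """Equivalent form: E[Z] + Cov(Y, Z) / E[Y], using population moments."""
    n = len(y)
    mean_y = sum(y) / n
    mean_z = sum(z) / n
    cov = sum((a - mean_y) * (b - mean_z) for a, b in zip(y, z)) / n
    return mean_z + cov / mean_y


if __name__ == "__main__":
    # Hypothetical paired durations: y = preclinical, z = clinical.
    y = [1.0, 2.0, 3.0]
    z = [2.0, 4.0, 9.0]
    print(induced_mean(y, z), mean_plus_covariance_ratio(y, z))
```

When Y and Z are positively correlated, the selected units show a larger mean of Z than the full population, even though the selection never looked at Z directly.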
This work is conducted in collaboration with Dr. Philip C. Prorok, National Cancer Institute.
