Models and estimators of net survival

Juste Aristide Goungounga
Célia Touraine
Nathalie Grafféo
Roch Giorgi

To estimate net survival, two settings are defined according to the availability of cause-of-death information. When this information is considered available for each patient, net survival is estimated in the “cause-specific setting”. When it is unavailable, or when one prefers not to rely on it, net survival is estimated in the “population-based setting”. In this setting, cause-of-death information is obtained indirectly by matching the observed data with the other-cause hazard drawn from general population life tables: the tables provide the daily hazard rate $\lambda_{P,i}$ of the general-population individual matched to each patient $i$. The main assumption here is that the share of cancer deaths in overall mortality is negligible, so that the other-cause hazard of the studied sample equals that of the general population of interest. In both settings, the common assumption is that the excess hazard and the other-cause hazard are independent.
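The life-table matching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the toy table holds yearly (not daily) other-cause mortality rates keyed by sex, attained age, and calendar year, and the table values, the `LIFE_TABLE` name, and the `expected_survival` helper are all hypothetical.

```python
import math

# Hypothetical toy life table: yearly other-cause mortality rates
# keyed by (sex, attained age, calendar year). Values are made up.
LIFE_TABLE = {
    ("F", 70, 2000): 0.020,
    ("F", 71, 2001): 0.022,
    ("F", 72, 2002): 0.025,
}


def expected_survival(sex, age_at_dx, year_at_dx, t_years, table):
    """Expected survival S_P,i(t) of the general-population individual
    matched to a patient on sex, age, and calendar year.

    Assumes the hazard is constant within each one-year cell and that
    diagnosis falls exactly on a birthday and on January 1st.
    """
    cum_hazard = 0.0
    whole_years = int(t_years)
    for k in range(whole_years + 1):
        # Full year for completed years, remaining fraction for the last one.
        duration = 1.0 if k < whole_years else t_years - whole_years
        if duration == 0:
            continue
        rate = table[(sex, age_at_dx + k, year_at_dx + k)]
        cum_hazard += rate * duration
    return math.exp(-cum_hazard)


# Expected survival 1.5 years after diagnosis for a woman diagnosed at 70 in 2000.
s_p = expected_survival("F", 70, 2000, 1.5, LIFE_TABLE)
```

The inverse of this quantity is what the weighted estimators use as an individual weight.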

Furthermore, in the two settings, net survival may be estimated using non-parametric estimators or parametric models; the latter allow estimating and testing effects of covariates on the excess mortality.

Some well-known estimators presented in the next sub-section (Kaplan-Meier, Nelson-Aalen, Cox) were first developed for all-cause mortality. For simplicity, we present their adaptation directly in the cause-specific setting, where the main change concerns the event indicator.

Here, we briefly present the properties of 1) one cause-specific method, 2) one mixed population-based and cause-specific method, and 3) one population-based method. The first is the popular Kaplan-Meier (KM) estimator in the cause-specific setting, with right-censoring of the times to death from non-cancer causes. The second is an adaptation of the Nelson-Aalen estimator that accounts for informative censoring by using other-cause mortality information from population life tables. The third is the Pohar-Perme (PP) estimator, a reliable estimator of net survival in population-based studies.

Estimating net survival in the cause-specific setting amounts to treating deaths from cancer as events and right-censoring deaths from other causes as well as patients still alive. The KM estimator of net survival is then:

$$\hat{S}_E(t) = \prod_{s \le t} \left(1 - \frac{dN_E(s)}{Y(s)}\right)$$

In this equation, $n$ is the number of patients, $N_E(t) = \sum_{i=1}^{n} N_{E,i}(t)$ is the number of cancer-related deaths up to time $t$, obtained by summing the individual counting processes $N_{E,i}(t)$, and $Y(t) = \sum_{i=1}^{n} \mathbb{1}\{t_i \ge t\}$, with $t_i$ the observed follow-up time of patient $i$, is the at-risk process just before time $t$ (i.e., the number of patients alive and not censored: those who have not experienced the event by time $t$ and are thus still “at risk” of experiencing it).
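As an illustration of the cause-specific KM estimator, the sketch below computes the product-limit estimate with cancer deaths as the only events. The function name `km_net_survival` and the string labels for the cause of exit are hypothetical choices made for this example.

```python
def km_net_survival(times, causes):
    """Cause-specific Kaplan-Meier estimate of net survival.

    times  : follow-up times t_i
    causes : "cancer", "other", or "censored" for each patient;
             only "cancer" is an event, everything else is right-censored.
    Returns a list of (time, survival) pairs at the cancer death times.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cancer = 0   # dN_E(t): cancer deaths at time t
        n_leaving = 0  # everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1] == "cancer":
                d_cancer += 1
            n_leaving += 1
            i += 1
        if d_cancer > 0:
            surv *= 1.0 - d_cancer / n_at_risk  # factor 1 - dN_E(t) / Y(t)
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve


curve = km_net_survival(
    [1, 2, 3, 4, 5],
    ["cancer", "other", "cancer", "censored", "cancer"],
)
```

Deaths from other causes shrink the risk set without contributing a factor to the product, which is exactly how right-censoring acts in this setting.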

In the cause-specific setting, Pohar-Perme et al. proposed an adaptation of the Nelson-Aalen estimator [12]. When the assumption of independence between the censoring process and the cancer death process is violated, mainly because of age, the censoring becomes informative. Applying the inverse probability of censoring weighting approach to the Nelson-Aalen estimator [12], Pohar-Perme et al. derived an asymptotically unbiased estimator of net survival:

$$\hat{S}_E(t) = \exp\left(-\int_0^t \frac{dN_E^w(u)}{Y^w(u)}\right)$$

In this equation, $N_E^w(t) = \sum_{i=1}^{n} N_{E,i}^w(t)$ and $Y^w(t) = \sum_{i=1}^{n} Y_i^w(t)$ are, respectively, the weighted aggregated counting process and the weighted at-risk process. More precisely, $dN_{E,i}^w(t) = dN_{E,i}(t) / S_{P,i}(t_i^-)$ and $Y_i^w(t) = Y_i(t) / S_{P,i}(t^-)$: each component is weighted by the inverse of the individual expected survival derived from population life tables, evaluated at times $t_i^-$ and $t^-$, respectively. Hereafter, we call this estimator, with these weights, the weighted Nelson-Aalen (wNA) estimator.
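The weighting scheme of the wNA estimator can be sketched as below. To keep the example self-contained, it assumes that each patient has a constant population hazard `pop_rate[i]`, so that the expected survival is $\exp(-\text{rate} \cdot t)$; in practice this quantity would come from life tables. The function name and argument layout are hypothetical.

```python
import math


def weighted_nelson_aalen(times, cancer_death, pop_rate):
    """Weighted Nelson-Aalen (wNA) estimate of net survival.

    times        : follow-up times t_i
    cancer_death : True if patient i died from cancer at t_i
    pop_rate     : constant other-cause hazard of patient i (assumption),
                   so that 1 / S_P,i(t) = exp(pop_rate[i] * t)
    Returns a list of (time, net survival) pairs at the cancer death times.
    """
    event_times = sorted({t for t, d in zip(times, cancer_death) if d})
    cum_hazard = 0.0
    curve = []
    for t in event_times:
        # Weighted cancer deaths at t: each death counts 1 / S_P,i(t_i).
        dn_w = sum(
            math.exp(r * ti)
            for ti, d, r in zip(times, cancer_death, pop_rate)
            if d and ti == t
        )
        # Weighted at-risk process: each patient with t_i >= t counts 1 / S_P,i(t).
        y_w = sum(math.exp(r * t) for ti, r in zip(times, pop_rate) if ti >= t)
        cum_hazard += dn_w / y_w
        curve.append((t, math.exp(-cum_hazard)))
    return curve


wna = weighted_nelson_aalen([1.0, 2.0, 3.0], [True, False, True], [0.01, 0.05, 0.02])
```

When all population hazards are zero, every weight equals 1 and the estimate reduces to the unweighted Nelson-Aalen estimator.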

The PP estimator [12] is a reliable non-parametric estimator of net survival developed to avoid some assumptions of excess hazard modeling. Its cumulative excess hazard is the difference between the Nelson-Aalen estimate and the cumulative population hazard of the patients still at risk, where the at-risk process and the counting process are weighted to give greater weight to subjects with a high risk of other-cause mortality. The PP estimator of net survival is:

$$\hat{S}_E(t) = \exp\left(-\int_0^t \frac{dN^w(u)}{Y^w(u)} + \int_0^t \frac{\sum_{i=1}^{n} Y_i^w(u)\,\lambda_{P,i}(u)}{Y^w(u)}\,du\right)$$

where $N^w(t) = \sum_{i=1}^{n} N_i^w(t)$ is the sum of the individual all-cause counting processes $N_i^w(t)$, with $dN_i^w(t) = dN_i(t) / S_{P,i}(t^-)$, which, like $Y_i^w(t)$, is weighted by the inverse of the individual expected survival, and $Y^w(t) = \sum_{i=1}^{n} Y_i^w(t)$ is the weighted at-risk process. The expected survival and the general population other-cause mortality $\lambda_P$ are derived from population life tables.
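A minimal sketch of the PP estimator follows, again assuming a constant population hazard per patient (a simplification; real life tables vary with age and calendar year). Under that assumption the population term has a closed form between two consecutive follow-up times, because the integrand is the derivative of $\log \sum_i \exp(\text{rate}_i \cdot u)$ over the risk set. The function name is hypothetical.

```python
import math


def pohar_perme(times, died, pop_rate):
    """Pohar-Perme estimate of net survival.

    times    : follow-up times t_i
    died     : True if patient i died (from any cause) at t_i
    pop_rate : constant other-cause hazard of patient i (assumption),
               so that 1 / S_P,i(t) = exp(pop_rate[i] * t)
    Returns a list of (time, net survival) pairs at every distinct time.
    """
    cum_excess = 0.0
    prev = 0.0
    curve = []
    for t in sorted(set(times)):
        # Patients at risk on (prev, t]: follow-up not finished before t.
        risk = [r for ti, r in zip(times, pop_rate) if ti >= t]
        # Weighted population hazard integrated over (prev, t], closed form.
        pop_part = math.log(sum(math.exp(r * t) for r in risk)) - math.log(
            sum(math.exp(r * prev) for r in risk)
        )
        # Weighted all-cause deaths at t and weighted at-risk process.
        dn_w = sum(
            math.exp(r * t) for ti, d, r in zip(times, died, pop_rate) if d and ti == t
        )
        y_w = sum(math.exp(r * t) for r in risk)
        cum_excess += dn_w / y_w - pop_part
        curve.append((t, math.exp(-cum_excess)))
        prev = t
    return curve


pp = pohar_perme([1.0, 2.0, 3.0], [True, False, True], [0.01, 0.05, 0.02])
```

Because the population term is subtracted, the estimated net survival is not constrained to stay below 1 or to decrease monotonically in small samples.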

Among these non-parametric estimators, only the KM estimator is purely cause-specific because, in the cause-specific setting considered here, it uses only cancer-specific death information. Although it is used in the cause-specific setting, the wNA estimator also uses the population other-cause mortality to correct the estimation of net survival. The PP estimator is used only in population-based settings; it needs the other-cause mortality and the vital status to provide an estimate of net survival.

In the cause-specific setting, the semiparametric Cox proportional hazards model expresses the excess hazard at time $t$ as $\lambda_E(t, X) = \lambda_{E,0}(t)\exp(\beta^T X)$, where $\lambda_{E,0}(t)$ is the baseline excess hazard at time $t$ and $\beta$ is the proportional linear effect of the covariates $X$ on the baseline excess hazard, estimated separately from the baseline through the semiparametric approach.

One solution to derive the baseline cumulative excess hazard function from the Cox model was given by Breslow [20]. Using the Breslow estimator with cancer death as the status indicator $\delta_i$, the baseline cumulative excess hazard function $\Lambda_{E,0}(t)$ may be estimated by summing over the times $t_i$ at which the events take place:

$$\hat{\Lambda}_{E,0}(t) = \sum_{i:\, t_i \le t} \frac{\delta_i}{\sum_{j \in R(t_i)} \exp(\hat{\beta}^T X_j)}$$

In this equation, $R(t)$ denotes the risk set at time $t$, i.e., all individuals still at risk of death from cancer at time $t$, $\hat{\beta}$ is the estimated effect of the covariates, and $\delta_i$ is the indicator of death due to the cancer of interest. The unadjusted cumulative excess hazard is obtained from this formula by setting $\hat{\beta} = 0$ [21].

The corresponding net survival from Breslow’s estimator is therefore:

$$\hat{S}_E(t) = \exp\left(-\hat{\Lambda}_{E,0}(t)\right)$$

where $\hat{\Lambda}_{E,0}$ is the baseline cumulative excess hazard function, estimated separately from $\hat{\beta}$.
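The Breslow-based net survival can be illustrated with the sketch below. It takes the coefficients as given (it does not fit the Cox model), and leaving `beta` unset corresponds to the unadjusted case with all coefficients equal to 0. The function name and signature are hypothetical.

```python
import math


def breslow_net_survival(times, cancer_death, covariates=None, beta=None):
    """Breslow estimate of the baseline cumulative excess hazard and the
    corresponding net survival, for given (already estimated) coefficients.

    times        : follow-up times t_i
    cancer_death : status indicator delta_i (True for death from cancer)
    covariates   : list of covariate vectors X_i, or None
    beta         : list of coefficients, or None for the unadjusted case
    Returns a list of (time, cumulative hazard, net survival) triples.
    """
    n = len(times)
    if beta is None:
        risk_score = [1.0] * n  # exp(0) for every patient
    else:
        risk_score = [
            math.exp(sum(b * x for b, x in zip(beta, xi))) for xi in covariates
        ]
    event_times = sorted({t for t, d in zip(times, cancer_death) if d})
    cum_hazard = 0.0
    curve = []
    for t in event_times:
        # Number of cancer deaths at t.
        d_t = sum(1 for ti, d in zip(times, cancer_death) if d and ti == t)
        # Sum of exp(beta' X_j) over the risk set R(t).
        denom = sum(s for ti, s in zip(times, risk_score) if ti >= t)
        cum_hazard += d_t / denom
        curve.append((t, cum_hazard, math.exp(-cum_hazard)))
    return curve


unadjusted = breslow_net_survival([1.0, 2.0, 3.0], [True, False, True])
```

With all coefficients set to 0, each increment is the number of cancer deaths divided by the size of the risk set, so the cumulative hazard coincides with the Nelson-Aalen estimate.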

The new model is an extension of the flexible parametric excess hazard model proposed by Giorgi et al. [15]. It is based on the seminal excess hazard model of Estève et al. [14], in which the observed hazard of a patient $i$ at time $t_i$ is:

$$\lambda_O(t_i, X_i) = \lambda_{E,0}(t_i)\exp(\beta^T X_i) + \lambda_{P,i}$$

where $\lambda_{P,i}$ is the other-cause hazard of individual $i$ taken from population life tables, the baseline excess hazard $\lambda_{E,0}$ is modelled by a piecewise constant function, and $\beta$ represents the effects of the vector of covariates $X$, which includes the demographic variables $Z$ (such as age at diagnosis, year of diagnosis, and sex of individual $i$).

In the model of Giorgi et al. [15], the baseline excess hazard and the time-dependent effects of covariates are both modelled using B-spline functions. More precisely, to allow a high degree of flexibility, Giorgi et al. used quadratic B-splines (order 3) with two interior knots.

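For simplicity, only the case of proportional hazard effects of prognostic covariates is considered. The simplified version of Giorgi’s model is then:

$$\lambda_O(t_i, X_i) = \lambda_{E,0}(t_i)\exp(\beta^T X_i) + \lambda_{P,i}$$

where the baseline excess hazard $\lambda_{E,0}(t_i)$ is built from the B-spline combination $\sum_j v_j B_{j,3}(t_i)$, $v_j$ are the spline coefficients, $B_{j,3}(t_i)$ is the value at time $t_i$ of the $j$th B-spline of order 3 and degree 2, $X$ is the vector of covariates with proportional hazard effects $\beta$, and $\lambda_{P,i}$ is the other-cause hazard of individual $i$ at age $a_i + t_i$ in year $y_i + t_i$.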
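To make the spline baseline concrete, the sketch below evaluates quadratic B-splines (order 3, degree 2) with the Cox-de Boor recursion. With two interior knots and boundary knots repeated three times, this gives five basis functions. The knot positions are hypothetical, and combining the basis on the log-hazard scale in `baseline_excess_hazard` is an assumption of this example, made so that the hazard stays positive; it is not confirmed by the text above.

```python
import math


def bspline_basis(t, knots, order=3):
    """Values at t of all B-spline basis functions of the given order,
    computed with the Cox-de Boor recursion on a non-decreasing knot vector."""
    n_intervals = len(knots) - 1
    # Order 1: indicator functions of the half-open knot intervals.
    basis = [1.0 if knots[j] <= t < knots[j + 1] else 0.0 for j in range(n_intervals)]
    if t == knots[-1]:
        # Close the last non-empty interval on the right.
        for j in range(n_intervals - 1, -1, -1):
            if knots[j] < knots[j + 1]:
                basis[j] = 1.0
                break
    for k in range(2, order + 1):
        new_basis = []
        for j in range(len(knots) - k):
            left = 0.0
            width_left = knots[j + k - 1] - knots[j]
            if width_left > 0:
                left = (t - knots[j]) / width_left * basis[j]
            right = 0.0
            width_right = knots[j + k] - knots[j + 1]
            if width_right > 0:
                right = (knots[j + k] - t) / width_right * basis[j + 1]
            new_basis.append(left + right)
        basis = new_basis
    return basis


def baseline_excess_hazard(t, coefs, knots):
    """Baseline excess hazard from spline coefficients v_j.
    Assumption of this sketch: the combination is taken on the log scale."""
    basis = bspline_basis(t, knots, order=3)
    return math.exp(sum(v * b for v, b in zip(coefs, basis)))


# Follow-up on [0, 10] with two interior knots at 1 and 5 (hypothetical values).
KNOTS = [0.0, 0.0, 0.0, 1.0, 5.0, 10.0, 10.0, 10.0]
hazard_at_3 = baseline_excess_hazard(3.0, [-1.0, -1.5, -2.0, -2.5, -3.0], KNOTS)
```

The basis functions are non-negative and sum to 1 at every point of the follow-up interval, so equal coefficients yield a constant baseline.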

In agreement with Cheuvart and Ryan [19], we considered that the other-cause mortality of a participant in a clinical trial may be corrected by multiplying the population hazard obtained from the life table by a scale parameter $\alpha$. This parameter is the average effect of selection on the other-cause mortality of the trial participants; in the general population with the same demographic characteristics, there is no such effect ($\alpha = 1$). The new flexible model, which we call the “rescaled B-spline” (RBS) model, can be written as follows:

$$\lambda_O(t_i, X_i) = \lambda_E(t_i, X_i) + \alpha\,\lambda_{P,i}$$

where $\lambda_E(t_i, X_i)$ is the B-spline-based excess hazard of the simplified version of Giorgi’s model.

To estimate the parameters of the RBS model, we used the maximum likelihood procedure. The log-likelihood of the RBS model can be written:

$$\ell = \sum_{i=1}^{n} \left\{ -\Lambda_E(t_i, X_i) - \alpha\,\Lambda_{P,i} + \delta_i \log\left[\lambda_E(t_i, X_i) + \alpha\,\lambda_{P,i}\right] \right\}$$

where, for individual $i$, $\delta_i$ is the indicator of death from any cause ($\delta_i = 1$ if the death is observed), $\lambda_E$ and $\Lambda_E$ are the excess hazard and the cumulative excess hazard, $\alpha$ is the scale parameter of the instantaneous other-cause mortality $\lambda_{P,i}$, and $\Lambda_{P,i}$ is the cumulated value of $\lambda_{P,i}$ over the whole follow-up duration. Unlike in the classical additive excess hazard model, the rescaled cumulative population hazard $\alpha\Lambda_P$ is not a constant, so the scale parameter $\alpha$ must be taken into account in the estimation process. For mathematical convenience, and because all patients may die from a cause other than the cancer under study, it is assumed that $\alpha > 0$ and that $\alpha$ is constant over time. The log-likelihood was maximized using the optim function in R, with the method of Byrd et al. for non-linear optimization problems with box constraints [22]. The estimates of net survival were derived from the cumulative excess hazard, calculated by integration of the corresponding estimate of the excess hazard function. The confidence intervals of the net survival estimates were obtained with a Monte-Carlo method [17]. The R code that implements these estimation procedures is available on request from the authors.
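The structure of this log-likelihood can be sketched as follows. This is not the authors' R implementation: for brevity, the example swaps the B-spline baseline for a constant baseline excess hazard and uses a single covariate, so that the cumulative excess hazard is simply the hazard multiplied by the follow-up time. All names are hypothetical.

```python
import math


def rescaled_log_likelihood(log_lambda0, beta, alpha, times, died, x,
                            pop_hazard, pop_cum_hazard):
    """Log-likelihood of an additive excess hazard model with a rescaled
    population hazard: observed hazard = excess hazard + alpha * lambda_P.

    Simplification of this sketch: constant baseline excess hazard
    exp(log_lambda0) instead of a B-spline baseline, one covariate x.

    died           : indicator of death from any cause (delta_i)
    pop_hazard     : lambda_P,i at the end of follow-up
    pop_cum_hazard : Lambda_P,i cumulated over the follow-up
    """
    if alpha <= 0:
        raise ValueError("alpha must be strictly positive")
    loglik = 0.0
    for t, d, xi, lam_p, cum_p in zip(times, died, x, pop_hazard, pop_cum_hazard):
        lam_e = math.exp(log_lambda0 + beta * xi)  # excess hazard
        cum_e = lam_e * t                          # its integral over [0, t]
        # Survival part: alpha * Lambda_P depends on alpha and cannot be dropped.
        loglik += -cum_e - alpha * cum_p
        if d:
            loglik += math.log(lam_e + alpha * lam_p)
    return loglik


ll = rescaled_log_likelihood(
    log_lambda0=-2.0, beta=0.5, alpha=1.2,
    times=[2.0, 5.0, 3.5], died=[True, False, True], x=[0.0, 1.0, 1.0],
    pop_hazard=[0.02, 0.03, 0.025], pop_cum_hazard=[0.04, 0.13, 0.08],
)
```

In Python, one way to maximize such a function under the constraint on the scale parameter would be `scipy.optimize.minimize` applied to the negative log-likelihood with `method="L-BFGS-B"` and a lower bound on `alpha`, which uses the same family of box-constrained algorithm as the one cited above.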
