Advances in Survival Analysis: 23 (Handbook of Statistics)

Choice and Interpretation of Statistical Tests Used When Competing Risks Are Present

Thus, we estimated the probability of recovery. Note that the effects of treatment and gender do not differ between the two transitions (Table 3). Because of potential selection bias, these estimates must be interpreted with caution. Table 3: Estimated effects for two types of transition in multi-state models for recurrent respiratory infections in small children in Brazil.

Given the relative lack of agreement regarding appropriate methods for analysing recurrences using survival analysis, we described the relevant methodological issues and illustrated how to fit and interpret results for different approaches. Analysis based only on the first event time cannot be used to examine the effect of the risk factors on the number of recurrences over time.

However, they consider the total number of events per a fixed period of time, ignoring the time between repeated occurrences. In addition, it is not possible to identify whether the effect of exposures changes the rate of occurrence across the time period. Even though we focus on methods for analysis of ordered failure times, many studies present sources of correlated unordered failure times. For example, times to an event of interest collected on family members are unordered and correlated because they share genetic and environmental factors; similarly, times to the same event type in two organs are pairwise correlated.

The methods described here are also useful for the analysis of such data, with some adjustments. Several approaches have been proposed to account for the intra-subject correlation that arises in multiple-event settings in survival analysis. The biological process of the disease is fundamental when choosing the model for the time to recurrent events. For instance, it is possible that after experiencing the first infection, the risk of the next infection may increase.

If it is reasonable to assume that the risk of recurrent events remains constant regardless of the number of previous events, then the AG model is recommended.

However, omission of an important covariate could induce dependence. In such a case, the standard errors would be underestimated, inflating the type I error. A possible remedy would be to fit an AG model with a time-dependent covariate for the number of events. Advantages of an AG model include the ability to accommodate time-varying covariates and discontinuous intervals of risk.
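The counting-process layout that the AG model works with can be sketched as follows. This is a minimal illustration, not the authors' data preparation code: the function name `counting_process_rows`, the tuple layout, and the example times are all hypothetical. Each subject's follow-up is split into (start, stop] intervals, and the number of prior events is carried along as the time-dependent covariate suggested above.

```python
from typing import List, Tuple

# (start, stop, event, prior_events); this layout is an assumption for illustration
Row = Tuple[float, float, int, int]


def counting_process_rows(event_times: List[float], end_of_followup: float) -> List[Row]:
    """Split one subject's follow-up into (start, stop] at-risk intervals."""
    rows: List[Row] = []
    start = 0.0
    for n_prior, t in enumerate(sorted(event_times)):
        # Interval ending in an event; n_prior counts the events before it.
        rows.append((start, t, 1, n_prior))
        start = t
    if end_of_followup > start:
        # Final interval ends in censoring rather than an event.
        rows.append((start, end_of_followup, 0, len(event_times)))
    return rows


# Hypothetical subject: events at months 3 and 10, followed until month 18.
rows = counting_process_rows([3.0, 10.0], end_of_followup=18.0)
for row in rows:
    print(row)
```

Under the AG assumption all intervals share one baseline hazard, so the `prior_events` column is the only place where event history enters the model.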

If it is reasonable to assume that the occurrence of the first event increases the likelihood of a recurrence, then PWP is recommended. A limitation for the use of PWP models is that the risk sets for the later events get quite small, making the estimates unstable. Therefore, we usually have to truncate the data. These models are also useful in many applications where there are multiple types of events and it is of interest to simultaneously describe marginal aspects of them.
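The shrinking risk sets can be made concrete with a small sketch. The helper `pwp_rows`, the four subjects, and the truncation at three events are hypothetical choices for illustration only. Each row carries a stratum (the event number the subject is at risk for), the total-time interval, and the gap time since the previous event.

```python
from collections import Counter


def pwp_rows(event_times, end_of_followup, max_events):
    """Return (stratum, start, stop, gap, event) rows, truncated at max_events strata."""
    rows = []
    start = 0.0
    for k, t in enumerate(sorted(event_times)[:max_events], start=1):
        rows.append((k, start, t, t - start, 1))
        start = t
    n_used = len(rows)
    if n_used < max_events and end_of_followup > start:
        # Censored interval in the next stratum.
        rows.append((n_used + 1, start, end_of_followup, end_of_followup - start, 0))
    return rows


# Hypothetical subjects: (event times, end of follow-up)
subjects = {
    "A": ([3.0, 10.0, 15.0], 20.0),
    "B": ([5.0], 12.0),
    "C": ([], 8.0),
    "D": ([2.0, 9.0], 14.0),
}

risk_set_sizes = Counter()
for times, end in subjects.values():
    for stratum, *_ in pwp_rows(times, end, max_events=3):
        risk_set_sizes[stratum] += 1

print(dict(risk_set_sizes))
```

A subject enters stratum k only after experiencing event k-1, so the count per stratum can only decrease with k. That is why later strata are often dropped before fitting.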

Incorporation of time-varying covariates can also lead to different interpretations depending on the adopted approach. The frailty models are indicated when a subject-specific random effect can explain the unmeasured heterogeneity that cannot be explained by covariates alone, which leads to a person-specific interpretation of the estimates in a similar way as that for mixed models for analysis of longitudinal data.
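For orientation, a common way to write a shared frailty model is shown below. The notation is an assumption introduced here (subject i, event k, frailty \omega_i, frailty variance \theta) and is not taken from this text; the gamma distribution is one frequent choice, not necessarily the one used by the authors.

```latex
\lambda_{ik}(t \mid \omega_i) = \omega_i \, \lambda_0(t) \, \exp\{\beta^{\top} Z_{ik}(t)\},
\qquad \omega_i \sim \mathrm{Gamma}\ \text{with}\ E(\omega_i) = 1,\ \mathrm{Var}(\omega_i) = \theta .
```

Because \omega_i multiplies every hazard for subject i, the coefficient \beta is interpreted conditionally on the subject's own frailty, which is the person-specific interpretation described above.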

When random effects are large, a smaller number of events seems to be adequate; otherwise, a larger number of events would be necessary. Because software for fitting MSM is lacking, this approach has rarely been applied to the analysis of recurrent event data to date. We discussed known approaches for the analysis of recurrent event data under the independent censoring assumption.

Methods dealing with dependent censoring have been proposed, 30 , 31 but they have not been incorporated into major software. We attempted to illustrate methodological issues through analyses of recurrence events in a cancer study and in a study related to an infectious disease, describing interpretation of results obtained from different approaches.

All models allow estimation of overall effects and most of them are easily approached using standard statistical software. Fit of frailty models and MSM, however, is less accessible. In this paper we fitted all models for both applications in order to illustrate their use, software implementation and interpretation of estimates in scenarios with different data structures. We truncated our datasets to have the same number of events for all approaches to illustrate the methods and to allow a more direct comparison between the models.

Nevertheless, we were also able to use the full data for analysis with the AG model, the marginal rates model and the frailty model. We carried out analyses with both full and truncated data using the aforementioned approaches. The results from the analysis with full data were not included in the manuscript for simplicity. In summary, the choice of approach for the analysis of recurrent event data will be determined by many factors. Usually, stratified models, such as the PWP total time or gap time models, or multi-state models, are used when there are few recurrent events per subject and the risk of recurrence varies between recurrences.

Many statistical challenges arise when performing analyses of repeated time-to-event data and the researcher should be careful to address them adequately. We recommend the following basic steps for analysing recurrent time-to-event data: In this paper, we briefly described the main characteristics of models for analysing recurrent time-to-event data and presented information on how to prepare the data (Appendix 1, available as Supplementary data at IJE online) and how to specify the commands in three statistical software packages (Appendix 2, available as Supplementary data at IJE online).

A recurrent events model can help to gain insights into the disease process. Hence, it is very important to consider the use of as much data as possible and to conduct analysis that can enhance a comprehensive understanding of the role of the risk factors in the disease process.

Supplementary data are available at IJE online. Published online Dec 9. Abstract: In many biomedical studies, the event of interest can occur more than once in a participant. Keywords: Recurrent events, time-to-event data, survival modelling.

Introduction

Many diseases and clinical outcomes may recur in the same patient.

Methods

Review of the general theory

Two important features of recurrent event data are that the events are ordered and that the subject can only be at risk for one such event at a time. Figure: Schematic plot for recurrent time-to-event data for five hypothetical subjects.

Models

The Andersen and Gill model

The counting process model of Andersen and Gill (AG) generalizes the Cox model, which is formulated in terms of increments in the number of events along the time line.

Prentice, Williams and Peterson models

The Prentice, Williams and Peterson (PWP) model analyses ordered multiple events by stratification, based on the prior number of events during the follow-up period.

The frailty model

The random effects approach, also called the frailty model, introduces a random covariate into the model that induces dependence among the recurrent event times.

Multi-state models

The simplest multi-state model (MSM) is defined for two states:

Existing software

We provide syntax for fitting each model using SAS, Stata and R software (23-25), highlighting major differences, particularly in the required data structure and the available results (Appendix 1, 2 and 3, available as Supplementary data at IJE online).

Bladder cancer

We consider data from a study with 85 bladder cancer patients designed to evaluate the effect of two treatment arms (thiotepa or placebo) on tumour recurrence.

Acute lower respiratory tract infections

Data from a double-blinded randomized clinical trial with children followed for 1 year to evaluate the impact of high doses of vitamin A on diarrhoea and acute lower respiratory tract infections (ALRI) were used.

Results

Bladder cancer

There were 47 first bladder cancer recurrences, and 83 subsequent recurrences. HR, hazard ratio; CI, confidence interval; mo, months.

Discussion

Given the relative lack of agreement regarding appropriate methods for analysing recurrences using survival analysis, we described the relevant methodological issues and illustrated how to fit and interpret results for different approaches. Several approaches have been proposed in the literature to account for intra-subject correlation that arises from recurrent events in survival analysis.

The five reviewed models for analysis of recurrent time-to-event data differ in assumptions and in interpretation of the results. Choice of the appropriate approach for analysis of recurrent event data is determined by many factors, including number of events, relationship between consecutive events, effects that may or may not vary across recurrences, biological process, dependence structure and research question.

Many statistical challenges arise when analysing recurrent time-to-event data and the researcher should be careful to address them adequately.

Repeated occurrence of basal cell carcinoma of the skin and multifailure survival analysis: Am J Epidemiol ; Regression models and life-tables with Discussion. J Royal Stat Soc B ; Ann Stat ; On the regression analysis of multivariate failure time data. Regression analysis of multivariate incomplete failure time data by modelling marginal distributions.

J Am Stat Assoc ; Extending The Cox Model. Pepe MS, Cai J. Some graphical displays and marginal regression analysis for recurrent failure times and time dependent covariates. Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc B ; Repeated hospitalizations and self-rated health among the elderly: Comparison of regression models for the analysis of fall risk factors in older veterans.

Ann Epidemiol ; Multivariate time-to-event models for studies of recurrent childhood diseases. Int J Epidemiol ; Survival analysis for recurrent event data: Stat Med ; The Statistical Analysis of Recurrent Events. A SAS macro for estimating transition probabilities in semiparametric models for recurrent events. Computer Methods Programmes Biomed ; Appraisal of several methods to model time to multiple events per subject: Zeng D, Lin DY. Efficient estimation of semiparametric transformation models for counting processes. Effect of vitamin A supplementation on diarrhoea and acute lower-respiratory-tract infections in young children in Brazil.

The analysis of recurrent events for multiple subjects. J R Stat Soc C ; Cai J, Schaubel D. Analysis of recurrent event data. Balakrishnan N, Rao CR. Advances in Survival Analysis. Regression splines in the time-dependent coefficient rates model for recurrent event data. Andersen PK, Keiding N. Multi-state models for event history analysis. Stat Methods Med Res ; Multi-state models for the analysis of time-to-event data. R Development Core Team. A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Gharibvand L, Liu L.

Analysis of survival data with clustered events. Analysis of multiple failure-time data with Stata.

More recently, Tian and Cai [ 60 ] and Xue et al. In addition to the models discussed above, another attractive semiparametric regression model is the additive hazards model given by. It specifies that the effects of the covariates are additive rather than multiplicative as in model 4.
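The displayed equation for the additive hazards model did not survive in this copy. As a reminder of its standard textbook form, and assuming that "model 4" refers to the proportional hazards model (which the word "multiplicative" suggests), the two specifications can be contrasted as follows. The symbols \lambda_0, \beta and Z are notation assumed here, not recovered from the source.

```latex
\text{multiplicative (proportional hazards):}\quad
\lambda(t \mid Z) = \lambda_0(t)\, \exp(\beta^{\top} Z)
\\[4pt]
\text{additive hazards:}\quad
\lambda(t \mid Z) = \lambda_0(t) + \beta^{\top} Z
```

In the additive form, \beta measures a risk difference per unit of Z at every time t, whereas in the multiplicative form \exp(\beta) is a hazard ratio.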

Martinussen and Scheike [ 63 ] studied the same problem and provided an approach that can be more efficient than that given in Lin et al. However, the latter involves estimation of the baseline hazard function and can be much more complicated. For inference about model 7 based on case II interval-censored data, both Huang and Wellner [ 50 ] and Zeng et al. Furthermore, Chen and Sun [ 65 ] and Zhu et al. The four semiparametric models described above are all specific models in terms of the functional form of the effects of covariates.

Sometimes one may prefer a model that gives more flexibility. One such model is the linear transformation model that specifies the relationship between the failure time T and the covariate Z as. Several other models or generalizations of the models discussed above are also available for regression analysis of interval-censored failure time data. For example, one may apply the partial linear model given by [ 70 ].

Shiboski [ 71 ] presented some generalized additive models and Zhang and Davidian [ 72 ] recently gave a group of smooth semiparametric regression models. In addition to the semiparametric models discussed above, one particular family of parametric models, piecewise exponential models, is worth mentioning.

Obviously, the model defined above can be seen as an approximation to model 4 or 7. Although the parametric piecewise exponential model may be less flexible than the semiparametric models, it is simple, and one can readily apply the maximum likelihood approach for parameter estimation. Of course, to use the piecewise exponential model, one needs to specify the partition, which is not always straightforward. As an alternative, by assuming the piecewise exponential proportional hazards model with the partition points at This gives the same p -value as above for testing no treatment effect.
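The maximum likelihood idea for the piecewise exponential model can be sketched in a few lines. This is a minimal illustration without covariates; the function names, the partition point and the hazard rates are hypothetical. An observation known only to lie in (left, right] contributes S(left) - S(right) to the likelihood, with right equal to infinity for a right-censored observation.

```python
import math


def cumulative_hazard(t, cuts, rates):
    """Cumulative hazard at t; cuts are increasing partition points, len(rates) == len(cuts) + 1."""
    total = 0.0
    lower = 0.0
    for upper, rate in zip(list(cuts) + [math.inf], rates):
        if t <= lower:
            break
        # Time spent in the piece (lower, upper] before t, times the constant hazard there.
        total += rate * (min(t, upper) - lower)
        lower = upper
    return total


def survival(t, cuts, rates):
    return math.exp(-cumulative_hazard(t, cuts, rates))


def log_likelihood(intervals, cuts, rates):
    """Interval-censored log-likelihood; each interval is (left, right]."""
    ll = 0.0
    for left, right in intervals:
        s_right = 0.0 if math.isinf(right) else survival(right, cuts, rates)
        ll += math.log(survival(left, cuts, rates) - s_right)
    return ll


# Hypothetical partition at month 12 with a hazard that doubles afterwards.
cuts, rates = [12.0], [0.05, 0.10]
data = [(0.0, 6.0), (6.0, 18.0), (18.0, math.inf)]
print(cumulative_hazard(18.0, cuts, rates))
print(log_likelihood(data, cuts, rates))
```

In practice the rates would be passed to a numerical optimizer. The choice of `cuts` is the partition that, as noted above, the analyst has to specify.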

The results here give similar conclusions to those obtained in the previous sections and indicate that the patients in the RCT group had the 2. We remark that in reality, in addition to the analysis described above, one may also want to perform some model diagnosis or checking [ 3 ] , or to fit other models to the data. In the previous sections, we only discussed several basic issues in the analysis of interval-censored failure time data and there exist a number of other issues that were not touched. In this section, we will briefly discuss some of them including multivariate interval-censored data, doubly censored data, competing risks analysis of interval-censored data, and informatively interval-censored data as well as interval-censored data with truncation and parametric procedures.

Multivariate interval-censored data arise if a survival study involves several related survival variables of interest and each of them suffers interval censoring. It is apparent that in this case, one needs different inference procedures than those discussed above and one key and important feature of these different procedures is that they need to take into account the correlation among the survival variables.

In addition to the basic issues discussed before, a new and unique issue for multivariate data is to make inference about the association between the survival variables. For this and in general, one of the tools that are commonly used for the analysis of multivariate interval-censored data but not univariate data is the copula model, which provides a very flexible way to model the joint survival function. For the analysis of multivariate interval-censored data, in addition to the methods described in Sun [ 3 ] , some of the more recently developed approaches include those given in Chen et al.
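As a small illustration of how a copula builds a joint survival function from two marginal ones, the sketch below uses the Clayton family. The text does not name a particular copula, so Clayton is an assumption made here for concreteness, and the function name and numbers are hypothetical.

```python
def clayton_joint_survival(s1: float, s2: float, theta: float) -> float:
    """Joint survival P(T1 > t1, T2 > t2) from marginal survival values s1, s2 in (0, 1]."""
    if theta <= 0:
        raise ValueError("theta must be positive for the Clayton copula")
    return (s1 ** -theta + s2 ** -theta - 1.0) ** (-1.0 / theta)


theta = 2.0
joint = clayton_joint_survival(0.7, 0.6, theta)
independent = 0.7 * 0.6

# For the Clayton family, Kendall's tau equals theta / (theta + 2).
kendall_tau = theta / (theta + 2.0)

print(joint > independent)
print(kendall_tau)
```

The marginal survival functions can come from any of the univariate models discussed earlier. The single parameter `theta` then carries the association between the two survival variables.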

The first considered regression analysis of general multivariate interval-censored data by using model 5 , while the last two presented efficient estimation procedures for fitting bivariate current status data to models 4 and 5 , respectively. Also Cook et al. One of the earliest works on the analysis of doubly censored failure time data is the seminal paper of De Gruttola and Lagakos [ 4 ]. In that paper, they proposed a self-consistency algorithm for estimating the distribution of the survival variable of interest.
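To give a feel for what a self-consistency iteration looks like, the sketch below implements the simpler Turnbull-type iteration for ordinary interval-censored data. It is a stand-in chosen for brevity, not the De Gruttola and Lagakos algorithm for doubly censored data. The function name, the support points and the three intervals are hypothetical.

```python
def self_consistent_masses(intervals, support, n_iter=500):
    """Iterate p_j <- (1/n) * sum_i [alpha_ij * p_j / sum_k alpha_ik * p_k]."""
    n = len(intervals)
    # alpha[i][j] = 1 if support point j lies in the observed interval (left_i, right_i].
    alpha = [[1.0 if left < s <= right else 0.0 for s in support]
             for left, right in intervals]
    p = [1.0 / len(support)] * len(support)
    for _ in range(n_iter):
        new_p = [0.0] * len(support)
        for row in alpha:
            denom = sum(a * pj for a, pj in zip(row, p))
            for j, a in enumerate(row):
                # Each subject spreads mass 1/n over the points compatible with its interval.
                new_p[j] += a * p[j] / denom / n
        p = new_p
    return p


# Hypothetical data: three subjects, event known only to lie in (left, right].
intervals = [(0.0, 2.0), (1.0, 3.0), (2.0, 4.0)]
support = [2.0, 3.0, 4.0]
masses = self_consistent_masses(intervals, support)
print([round(m, 4) for m in masses])
```

Every observed interval must contain at least one support point, otherwise the denominator is zero. The masses sum to one at every iteration, and a fixed point of the update is a self-consistent estimate of the distribution.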

Following their work, many authors considered various issues related to the analysis of doubly censored data and, in addition to the review article Sun [ 78 ] , Sun [ 3 ] devoted one chapter to the analysis of doubly censored data. Competing risks analysis is needed when the failure of an individual may be one of several distinct failure types. For example, death of a cancer patient may be classified as disease-related or non-disease-related. In the case of current status data, Groeneboom et al.

As mentioned above, all methods discussed so far require the assumption 1 or 2. That is, the interval censoring involved is noninformative about the survival variable or event of interest. Several inference procedures have been developed in the literature for situations where the censoring may be informative [ 3 ]. For this, a common way is to jointly model the survival variable and the variables representing interval censoring by, for example, using the latent variable approach [ 83 , 84 ].

It is well known that truncation may occur in survival studies and, in particular, may occur together with interval censoring [ 19 ]. Although several procedures have been developed for the one-sample estimation problem, there is not much literature on the topic. The same is true for the use of parametric models and inference procedures for the analysis of interval-censored data.

One major reason for this is that in most situations, there does not exist much prior information about the variable under study and thus one may prefer nonparametric or semiparametric approaches rather than parametric approaches. It is definitely important to implement the available inference procedures numerically for practitioners. Unfortunately and surprisingly, there is no commercially available statistical software yet that provides an extensive coverage for interval-censored data.

This is perhaps due to the complexity of both the algorithms and the theory behind them. In R, the package Icens contains some routines that can perform statistical analysis when interval-censored data are present. A recent tutorial paper [ 86 ] has some details and examples of how to use R for interval-censored data. Methodologically, there are still many open questions in the analysis of interval-censored data.

Examples include, but are not limited to, model checking techniques and joint modeling of longitudinal and interval-censored data. Some of the methods discussed in the previous sections also need proper theoretical justification. The major difficulty is the lack of basic tools as simple and elegant as the partial likelihood and the martingale theory for right-censored data. The works by Groeneboom and Wellner [ 9 ] and Huang and Wellner [ 50 ] are perhaps the most comprehensive studies of interval censoring; they rely mainly on complicated empirical process and optimization theory and are difficult to generalize.

The authors wish to thank Professor Per Kragh Andersen and a reviewer for their helpful comments. Stat Methods Med Res. Author manuscript; available in PMC Jun Zhigang Zhang and Jianguo Sun. The publisher's final edited version of this article is available at Stat Methods Med Res.

Then the problem becomes testing the null hypothesis H 0:

A proportional hazards model for interval-censored failure time data. The statistical analysis of failure time data. The statistical analysis of interval-censored failure time data.

Analysis of doubly-censored survival data, with application to AIDS. Generalizations of current status data with applications. Sun J, Kalbfleisch JD. The analysis of current status data on point processes.

Journal of the American Statistical Association. Age-specific incidence and prevalence: A statistical perspective with discussion Journal of the Royal Statistical Society: Regression analysis of tumor prevalence data. Groeneboom P, Wellner JA. DMV Seminar, Band Information bounds and nonparametric maximum likelihood estimation.

Statistical analysis of doubly interval-censored failure time data. Balakrishnan N, Rao CR, editors. A semiparametric model for regression analysis of interval-censored failure time data. The Canadian Journal of Statistics. On nonidentifiability and noninformative censoring for current status data.

Lawless JF, Babineau D. Models for interval censoring and simulation-based inference for lifetime distributions. Gentleman R, Geyer CJ. Maximum likelihood for interval censored data: An EM algorithm for estimating survival functions with interval-censored data. Scandinavian Journal of Statistics. The empirical distribution with arbitrarily grouped censored and truncated data.

Journal of the Royal Statistical Society: Maximum likelihood from incomplete data via the EM algorithm. The iterative convex minorant algorithm for nonparametric estimation. Journal of Computational and Graphical Statistics.

Order restricted statistical inference. Wellner JA, Zhan Y. A hybrid algorithm for computation of the nonparametric maximum likelihood estimator from censored data. A note on the non-parametric maximum likelihood estimator of the distribution function. Statistical inference under order restrictions.

Constrained estimation and likelihood intervals for censored data.

Sen B, Banerjee M. A pseudolikelihood method for analyzing interval censored data. On consistency of self-consistent estimator of survival functions with interval-censored data. Geskus R, Groeneboom P. Asymptotically optimal estimation of smooth functionals for interval censoring, case 2.

The Annals of Statistics. Huang J, Wellner JA. Kooperberg C, Stone CJ. Logspline density estimation for censored data. Hazard function estimation using B-splines. Multiple imputation for simple estimation of the hazard function based on interval censored data. Local EM estimation of the hazard function for interval-censored data. Zhao Q, Sun J. Generalized log-rank test for mixed interval-censored failure time data.

Logrank-type tests for comparing survival curves with interval-censored data. A two-sample test for stochastic ordering with interval-censored data. Nonparametric survival comparison for interval-censored continuous Data. A k -sample test with interval censored data. Generalized log rank tests for interval-censored failure time data. Peto R, Peto J. Asymptotically efficient rank invariant test procedures. Generalized log-rank tests for partly interval-censored failure time data.

A nonparametric test for comparing two samples where all observations are either left- or right-censored. A nonparametric test for current status data with unequal censoring. Regression models and life-tables with discussion Journal of the Royal Statistical Society: Efficient estimation for the proportional hazards model with interval censoring. Rank-based inference in the proportional hazards model for interval censored data. A Markov chain Monte Carlo EM algorithm for analyzing interval censored data under the Cox proportional hazards model.

A multiple imputation approach to Cox regression with interval-censored data. Interval censored survival data: Lin D, Fleming T, editors. Proceedings of the first Seattle symposium in biostatistics: Kim Y, Jhun M. Cure rate model with interval censored data.

Maximum likelihood estimation for proportional odds regression model with current status data. Analysis of censored data. Rossini A, Tsiatis AA. A semiparametric proportional odds regression model for the analysis of current status data. Huang J, Rossini AJ. Sieve estimation for the proportional odds failure-time regression model with interval censoring. Proportional odds regression and sieve maximum likelihood estimation. Using conditional logistic regression to fit proportional odds models to interval censored data. Regression with interval-censored data.

Computationally simple accelerated failure time regression for interval censored data. Li L, Pu Z. Rank estimation of log-linear regression with interval-censored data. Tian L, Cai T. On the accelerated failure time model for current status and interval censored data. Additive hazards regression with current status data. Martinussen T, Scheike TH. Efficient estimation in additive hazards regression with current status data.

Semiparametric additive risks model for interval-censored data. Chen L, Sun J. A multiple imputation approach to the analysis of current status data with the additive hazards model. A transformation approach for the analysis of interval-censored failure time data. Sun J, Sun L. Semiparametric linear transformation models for current status data. Younes N, Lachin J. Linked-based models for survival data with interval and continuous time censoring. Regression analysis of interval censored failure time data with linear transformation models. Sieve maximum likelihood estimation for semiparametric regression models with current status data.

Generalized additive models for current status data. Zhang M, Davidian M. The proportional odds model for multivariate interval-censored failure time data. Efficient estimation for the proportional hazards model with bivariate current status data. Efficient estimation for the proportional odds model with bivariate current status data.