
Finds the optimal number of cut-points (0 to `max_cuts`) for a continuous predictor in a Cox model by comparing AIC, AICc, or BIC across the candidate models. Supports a systematic search (`max_cuts <= 2`) and a genetic algorithm (`rgenoud`).

Usage

find_cutpoint_number(
  data,
  predictor,
  outcome_time,
  outcome_event,
  method = "systematic",
  criterion = "BIC",
  covariates = NULL,
  max_cuts = 2,
  nmin = 0.1,
  seed = NULL,
  maxiter = 100,
  ...
)

# S3 method for class 'find_cutpoint_number_result'
print(x, ...)

# S3 method for class 'find_cutpoint_number_result'
summary(
  object,
  show_comparison_table = TRUE,
  show_best_model_details = TRUE,
  show_group_counts = TRUE,
  show_medians = TRUE,
  plot.it = FALSE,
  ...
)

# S3 method for class 'find_cutpoint_number_result'
plot(x, y, ...)

Arguments

data

Input data frame.

predictor

Continuous predictor variable name (character).

outcome_time

Time-to-event variable name (character).

outcome_event

Event indicator variable name (character); the variable is coded 0/1.

method

Search method: `"systematic"` (for `max_cuts <= 2`) or `"genetic"`.

criterion

`"AIC"`, `"AICc"` or `"BIC"`.

covariates

Character vector of covariate names (optional).

max_cuts

Maximum number of cut-points to test (non-negative integer).

nmin

Minimum group size, given either as a count or as a proportion of the sample size.

seed

Integer or `NULL`; random seed for `rgenoud`.

maxiter

Integer; generations for `rgenoud` (default 100).

...

Additional arguments passed to `rgenoud`.

x

An object from `find_cutpoint_number()`.

object

An object from `find_cutpoint_number()`.

show_comparison_table

Logical. Show model comparison table?

show_best_model_details

Logical. Show details for best model?

show_group_counts

Logical. Show group counts for best model?

show_medians

Logical. Show median survival for best model?

plot.it

Logical. Display model selection plot?

y

Unused.

Value

An S3 object of class `find_cutpoint_number_result` with components `results`, `parameters`, `userdata`, `optimal_num_cuts`, and `optimal_cuts`.

Details

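`method = "systematic"` runs a grid search over candidate cut-points and skips any split that leaves a group smaller than `nmin`. `method = "genetic"` uses `rgenoud` for global optimization. The systematic search becomes slow for `max_cuts > 2`; use `method = "genetic"` in that case.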
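The systematic search for a single cut-point can be illustrated with a minimal sketch. It is not the package's internal code: `info_criterion()` and `search_one_cut()` are hypothetical helpers, the `lung` data come from the `survival` package, and the sketch assumes complete data, the number of events as the sample size in BIC and AICc, and rounding a proportional `nmin` up to the next whole count.

```r
library(survival)

# Hypothetical helper: information criterion from a log-likelihood,
# k estimated parameters, and sample size n (assumed here: number of events).
info_criterion <- function(loglik, k, n, criterion = "BIC") {
  aic <- -2 * loglik + 2 * k
  switch(criterion,
    AIC  = aic,
    AICc = aic + 2 * k * (k + 1) / (n - k - 1),
    BIC  = -2 * loglik + k * log(n),
    stop("Unknown criterion: ", criterion)
  )
}

# Hypothetical helper: exhaustive search for one cut-point. Every observed
# value is a candidate; splits with a group smaller than nmin are skipped.
# Returns NULL if no candidate split satisfies nmin.
search_one_cut <- function(data, predictor, time, event,
                           criterion = "BIC", nmin = 0.1) {
  x <- data[[predictor]]
  surv <- Surv(data[[time]], data[[event]])
  n_events <- sum(data[[event]])
  min_size <- if (nmin < 1) ceiling(nmin * nrow(data)) else nmin

  best <- NULL
  for (cp in sort(unique(x))) {
    high <- x > cp
    if (min(sum(high), sum(!high)) < min_size) next
    fit <- coxph(surv ~ high)
    if (is.null(best)) {
      # fit$loglik[1] is the log partial likelihood of the model with no cut-point
      best <- list(num_cuts = 0, cut = NA_real_,
                   score = info_criterion(fit$loglik[1], 0, n_events, criterion))
    }
    score <- info_criterion(fit$loglik[2], 1, n_events, criterion)
    if (score < best$score) best <- list(num_cuts = 1, cut = cp, score = score)
  }
  best
}

# Usage on the lung data, recoding status (1 = censored, 2 = dead) to 0/1
d <- survival::lung
d$event <- as.integer(d$status == 2)
best <- search_one_cut(d, "age", "time", "event")
```

Adding a second cut-point turns the loop into a search over pairs of candidates, so the cost grows quickly with each extra cut-point. This is why a global optimizer such as `rgenoud` is preferred for `max_cuts > 2`.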


References

Akaike, H. (1974). A new look at the statistical model identification. *IEEE Transactions on Automatic Control*, **19**(6), 716–723. doi:10.1109/TAC.1974.1100705

Chang, C., Hsieh, M.-K., Chang, W.-Y., Chiang, A. J., & Chen, J. (2017). Determining the optimal number and location of cutoff points with application to data of cervical cancer. *PLOS ONE*, **12**(4), e0176231. doi:10.1371/journal.pone.0176231

Chen, Y., Huang, J., He, X., Gao, Y., Mahara, G., Lin, Z., & Zhang, J. (2019). A novel approach to determine two optimal cut-points of a continuous predictor with a U-shaped relationship to hazard ratio in survival data: Simulation and application. *BMC Medical Research Methodology*, **19**(1), 96. doi:10.1186/s12874-019-0738-4

Hurvich, C. M., & Tsai, C.-L. (1989). Regression and time series model selection in small samples. *Biometrika*, **76**(2), 297–307. doi:10.1093/biomet/76.2.297

Schwarz, G. (1978). Estimating the dimension of a model. *The Annals of Statistics*, **6**(2), 461–464. doi:10.1214/aos/1176344136

Examples

data(crc_virome)
res <- find_cutpoint_number(
  data = head(crc_virome, 50),
  predictor = "Alphapapillomavirus",
  outcome_time = "time_months",
  outcome_event = "status",
  method = "systematic",
  max_cuts = 1
)
#>  nmin 0.1 is a proportion. Min. group size set to 5.
#>  Finding optimal cut number: method = systematic
#>  Testing for 1 cut-point(s)...
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> Warning: Loglik converged before variable  1 ; coefficient may be infinite. 
#> 
#> ── Optimal Cut-point Number Analysis ───────────────────────────────────────────
#> Method: systematic
#> Criterion: BIC
#>  num_cuts   BIC Delta_BIC BIC_Weight    Evidence cuts
#>         0 58.47         4      11.9%    Moderate   NA
#>         1 54.47         0      88.1% Substantial 4.59
#>  Conclusion: 1 cut-point(s) is best based on BIC.
#> Optimal cuts at: 4.59
#> Hint: Use `summary()` for details, `plot()` to visualize.
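
The `Delta_BIC` and `BIC_Weight` columns in the output above are consistent with the usual information-criterion weights, where each model's weight is proportional to `exp(-Delta / 2)`. The sketch below recomputes them from the printed BIC values. That the package uses exactly this formula is an assumption, although it reproduces the printed percentages.

```r
# BIC values copied from the comparison table above, named by number of cuts
bic <- c(`0` = 58.47, `1` = 54.47)

# Difference from the best (smallest) BIC
delta <- bic - min(bic)

# Relative weights: exp(-delta / 2), normalised to sum to 1
weight <- exp(-delta / 2) / sum(exp(-delta / 2))

round(100 * weight, 1)
#>    0    1 
#> 11.9 88.1 
```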