cutpointr: Improved Estimation and Validation of Optimal Cutpoints in R
Dublin Core
Title
cutpointr: Improved Estimation and Validation of Optimal Cutpoints in R
Subject
optimal cutpoint, ROC curve, bootstrap, R.
Description
“Optimal cutpoints” for binary classification tasks are often established by testing
which cutpoint yields the best discrimination, for example the Youden index, in a specific
sample. This results in “optimal” cutpoints that are highly variable and systematically
overestimate the out-of-sample performance. To address these concerns, the cutpointr
package offers robust methods for estimating optimal cutpoints and the out-of-sample
performance. The robust methods include bootstrapping and smoothing based on kernel
estimation, generalized additive models, smoothing splines, and local regression. These
methods can be applied to a wide range of binary-classification and cost-based metrics.
cutpointr also provides mechanisms to utilize user-defined metrics and estimation methods. The package has capabilities for parallelization of the bootstrapping, including reproducible random number generation. Furthermore, it is pipe-friendly, for example for
compatibility with functions from the tidyverse. Various functions for plotting receiver operating characteristic curves, precision-recall graphs, bootstrap results, and other representations of the data are included. The package contains example data from a study on
psychological characteristics and suicide attempts suitable for applying binary classification algorithms.
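The idea in the description (choose the cutpoint that maximizes the Youden index in a sample, then use bootstrapping to see how variable that cutpoint is and how it performs out of sample) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not cutpointr's R implementation: the function names `youden`, `optimal_cutpoint`, and `bootstrap_validate`, the "score >= cutpoint means positive" convention, and the synthetic data are all hypothetical choices made for this example.

```python
import random


def youden(scores, labels, cutpoint):
    """Youden index J = sensitivity + specificity - 1.

    Assumption for this sketch: a score >= cutpoint is predicted positive (label 1).
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutpoint)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutpoint)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutpoint)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutpoint)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens + spec - 1.0


def optimal_cutpoint(scores, labels):
    """Return the observed score value with the highest in-sample Youden index."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: youden(scores, labels, c))


def bootstrap_validate(scores, labels, runs=200, seed=1):
    """Re-estimate the cutpoint on bootstrap resamples and score it out of bag.

    Returns the list of bootstrap cutpoints and the list of out-of-bag Youden
    values. Resamples whose in-bag or out-of-bag part lacks one of the two
    classes are skipped.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(scores)
    cutpoints, oob_youden = [], []
    for _ in range(runs):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        bag_labels = [labels[i] for i in idx]
        oob_labels = [labels[i] for i in oob]
        if len(set(bag_labels)) < 2 or len(set(oob_labels)) < 2:
            continue
        c = optimal_cutpoint([scores[i] for i in idx], bag_labels)
        cutpoints.append(c)
        oob_youden.append(youden([scores[i] for i in oob], oob_labels, c))
    return cutpoints, oob_youden


if __name__ == "__main__":
    # Synthetic data (not the package's example data): two overlapping groups.
    data_rng = random.Random(42)
    labels = [0] * 100 + [1] * 100
    scores = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
    scores += [data_rng.gauss(1.5, 1.0) for _ in range(100)]

    c_hat = optimal_cutpoint(scores, labels)
    cuts, oob = bootstrap_validate(scores, labels)
    print("in-sample cutpoint:", c_hat)
    print("in-sample Youden:", youden(scores, labels, c_hat))
    print("mean out-of-bag Youden:", sum(oob) / len(oob))
    print("bootstrap cutpoint range:", min(cuts), max(cuts))
```

Comparing the in-sample Youden index with the mean out-of-bag value gives a rough sense of the optimism the description refers to, and the spread of the bootstrap cutpoints gives a rough sense of their variability.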
Creator
Christian Thiele
Source
https://www.jstatsoft.org/article/view/v098i11
Publisher
University of Applied Sciences Bielefeld
Date
May 2021
Contributor
Fajar bagus W
Format
PDF
Language
English
Type
Text
Citation
Christian Thiele, “cutpointr: Improved Estimation and Validation of Optimal Cutpoints in R,” Repository Horizon University Indonesia, accessed March 12, 2025, https://repository.horizon.ac.id/items/show/8197.