Measures the quality of prediction intervals by combining interval width with a penalty for observations that fall outside the interval. Lower scores are better: they indicate narrower intervals that still cover the observed values.
Details
$$ W_i = \begin{cases} (u_i - l_i) + \frac{2}{\alpha}(l_i - y_i), & \text{if } y_i < l_i \\ (u_i - l_i), & \text{if } l_i \le y_i \le u_i \\ (u_i - l_i) + \frac{2}{\alpha}(y_i - u_i), & \text{if } y_i > u_i \end{cases} $$
where \(l_i\) and \(u_i\) are the lower and upper bounds of the prediction interval, \(y_i\) is the observed value, and \(\alpha = 1 - \text{level}/100\) is the significance level, with the level given in percent. The Winkler score is then the mean of \(W_i\) over all observations.
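As an illustration of the formula above, the following base R sketch computes the score by hand. The helper name `winkler_score` and the example data are hypothetical and not part of mlr3forecast; the sketch assumes that `level` is given in percent and that the interval bounds are already available as numeric vectors.

```r
# Hypothetical helper (not the mlr3forecast implementation): mean Winkler
# score for prediction intervals at a nominal level given in percent.
winkler_score = function(y, lower, upper, level = 95) {
  stopifnot(length(y) == length(lower), length(y) == length(upper))
  alpha = 1 - level / 100
  width = upper - lower
  # pmax() sets each penalty term to zero when y lies inside the interval
  below = pmax(lower - y, 0)
  above = pmax(y - upper, 0)
  mean(width + (2 / alpha) * (below + above))
}

# Made-up example data: one covered observation, one below, one above
y     = c(10, 12, 20)
lower = c( 8, 13, 15)
upper = c(14, 17, 18)

# With level = 80, alpha = 0.2 and the penalty factor 2 / alpha is 10.
# Per-observation scores are 6, 14 and 23, so the mean is 43 / 3 (about 14.33).
winkler_score(y, lower, upper, level = 80)
```

Writing the two penalties with `pmax()` covers all three cases of the piecewise definition in one vectorised expression, since at most one of the two terms is nonzero for any observation.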
Dictionary
This mlr3::Measure can be instantiated via the dictionary mlr3::mlr_measures or with the associated sugar function mlr3::msr().
Meta Information
Task type: “regr”
Range: \([0, \infty)\)
Minimize: TRUE
Average: macro
Required Prediction: “quantiles”
Required Packages: mlr3, mlr3forecast
References
Winkler, L R (1972). “A Decision-Theoretic Approach to Interval Estimation.” Journal of the American Statistical Association, 67(337), 187–191.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-eval
Package mlr3measures for the scoring functions.
as.data.table(mlr_measures) for a table of available Measures in the running session (depending on the loaded packages).
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
Other Measure:
mlr_measures_fcst.acf1,
mlr_measures_fcst.coverage,
mlr_measures_fcst.mase,
mlr_measures_fcst.mda,
mlr_measures_fcst.mdpv,
mlr_measures_fcst.mdv,
mlr_measures_fcst.mpe,
mlr_measures_fcst.rmsse
Super classes
mlr3::Measure -> mlr3::MeasureRegr -> MeasureWinkler