
This function generates a ggplot that compares the predicted and empirical cumulative distributions of Probability Integral Transform (PIT) values at a local level. It is useful for diagnosing calibration in different regions of the dataset, since miscalibration patterns may differ across the covariate space. The plot layers can be customized to suit specific needs; for advanced customization, refer to the ggplot2 User Guide.
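The idea behind the comparison can be sketched in base R. This is only an illustration under stated assumptions, not the implementation used by gg_CD_local(): it assumes a Gaussian predictive distribution with a single global standard deviation, and it defines the subgroups as four quantile bins of the covariate (the bins and the names `part`, `grid`, `cd_local` and `max_dev` are hypothetical choices made for this sketch).

```r
# Illustrative sketch only; not the code used inside gg_CD_local().
set.seed(42)
n <- 2000
x <- runif(n, 1, 10)
y <- rnorm(n, mean = 10 + 5 * x^2, sd = 30 * x)  # heteroscedastic response

# Misspecified model: linear mean, one global standard deviation
fit    <- lm(y ~ x)
y_hat  <- fitted(fit)
sd_hat <- sqrt(mean((y - y_hat)^2))

# PIT value: the assumed predictive CDF evaluated at the observed response
pit <- pnorm(y, mean = y_hat, sd = sd_hat)

# Hypothetical subgroups: four quantile bins of x
part <- cut(x,
            breaks = quantile(x, probs = seq(0, 1, 0.25)),
            include.lowest = TRUE,
            labels = paste0("part_", 1:4))

# Empirical cumulative distribution of the PIT-values per subgroup,
# evaluated on a grid of predicted cumulative probabilities
grid     <- (1:19) / 20
cd_local <- sapply(split(pit, part), function(p) ecdf(p)(grid))

# Under good local calibration each column would stay close to `grid`
max_dev <- apply(abs(cd_local - grid), 2, max)
print(round(max_dev, 3))
```

Each column of `cd_local` plays the role of one curve in the plot, and the diagonal corresponds to `grid` itself. Columns that drift far from `grid` point to subgroups where the assumed predictive distribution fits poorly.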

Usage

gg_CD_local(
  pit_local,
  psz = 0.01,
  abline = "black",
  pal = "Set2",
  facet = FALSE,
  ...
)

Arguments

pit_local

A data frame of local PIT-values, typically obtained from PIT_local().

psz

Double indicating the size of the points on the plot. Default is 0.01.

abline

Color of the diagonal line. Default is "black".

pal

Palette name from RColorBrewer for coloring the plot. Default is "Set2".

facet

Logical value indicating if a separate visualization for each subgroup is preferred. Default is FALSE.

...

Additional parameters to customize the ggplot.

Value

A ggplot object displaying the cumulative distributions of PIT-values, which can be customized as needed.

Details

This function works with the output of the PIT_local() function, which provides the PIT-values for each subgroup of the covariate space in the appropriate format.

Examples


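n <- 10000     # total number of observations
split <- 0.8   # fraction of observations used for training

# True conditional mean of the response
mu <- function(x1){
  10 + 5*x1^2
}

# True conditional standard deviation (grows with x1)
sigma_v <- function(x1){
  30*x1
}

# Simulate heteroscedastic data
x <- runif(n, 1, 10)
y <- rnorm(n, mu(x), sigma_v(x))

# Training set
x_train <- x[1:(n*split)]
y_train <- y[1:(n*split)]

# Calibration set
x_cal <- x[(n*split+1):n]
y_cal <- y[(n*split+1):n]

# Fit a linear model to the training data
model <- lm(y_train ~ x_train)

# Predictions on the calibration set
y_hat <- predict(model, newdata=data.frame(x_train=x_cal))

# Mean squared error on the calibration set
MSE_cal <- mean((y_hat - y_cal)^2)

# Local PIT-values for subgroups of the covariate space
pit_local <- PIT_local(xcal = x_cal, ycal=y_cal, yhat=y_hat, mse=MSE_cal)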

gg_CD_local(pit_local)
#> Error in purrr::map(unique(pit_local$part), ~{    loc <- dplyr::filter(pit_local, part == .)    do.call(rbind, purrr::map(1:nrow(loc), ~{        c(part = loc$part[1], pit_emp = mean(loc[, 2] <= qnorm(p = dplyr::pull(loc[.,             4]), mean = dplyr::pull(loc[, 3]), sd = sqrt(MSE_cal))),             pit_pred = dplyr::pull(loc[., 4]))    }))}):  In index: 1.
#> Caused by error in `purrr::map()`:
#>  In index: 1.
#> Caused by error in `.f()`:
#> ! object 'MSE_cal' not found
gg_CD_local(pit_local, facet=TRUE)
#> Error in purrr::map(unique(pit_local$part), ~{    loc <- dplyr::filter(pit_local, part == .)    do.call(rbind, purrr::map(1:nrow(loc), ~{        c(part = loc$part[1], pit_emp = mean(loc[, 2] <= qnorm(p = dplyr::pull(loc[.,             4]), mean = dplyr::pull(loc[, 3]), sd = sqrt(MSE_cal))),             pit_pred = dplyr::pull(loc[., 4]))    }))}):  In index: 1.
#> Caused by error in `purrr::map()`:
#>  In index: 1.
#> Caused by error in `.f()`:
#> ! object 'MSE_cal' not found