Imputation function to be used with the mice package

make_penalized_blots(
  data,
  quantiles = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99),
  ...
)

mice.impute.qs(
  y,
  ry,
  x,
  wy = NULL,
  quantiles = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99),
  baseline_quantile = 0.5,
  algorithm = "two_pass",
  tails = "gaussian",
  parallel = F,
  calc_se = F,
  weights = NULL,
  control = qs_control(calc_r2 = F, calc_avg_me = F),
  std_err_control = se_control(),
  ...
)

Arguments

data

data to be imputed by mice

quantiles

vector of quantiles to be estimated

...

other arguments to be passed to quantreg_spacing

y

vector to be imputed

ry

indicator for complete cases

x

independent variables

wy

cases to be imputed
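
The ry and wy arguments follow the usual convention for mice.impute.* functions. A minimal base R sketch, assuming wy falls back to !ry when left NULL (mice's built-in methods behave this way; this page does not state it for mice.impute.qs):

```r
# Toy outcome with two missing entries
y <- c(1.2, NA, 3.4, NA, 5.6)

# ry: TRUE where y is observed; these rows are used to fit the model
ry <- !is.na(y)

# wy: TRUE where an imputation should be generated.
# Assumption: when wy is NULL it defaults to the complement of ry.
wy <- NULL
if (is.null(wy)) wy <- !ry

which(wy)   # rows that would receive imputed values
```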

baseline_quantile

baseline quantile to measure spacings from (defaults to 0.5)
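
To make "spacings" concrete, the sketch below uses only base R on simulated data. It expresses a set of empirical quantiles as the baseline quantile plus the gaps between adjacent quantiles, then rebuilds them. It illustrates the parameterization only and is not the package's estimation code; the variable names are made up for the example.

```r
set.seed(1)
y <- rnorm(1000)

quantiles <- c(0.1, 0.25, 0.5, 0.75, 0.9)
baseline_quantile <- 0.5

q <- unname(quantile(y, probs = quantiles))
b <- which(quantiles == baseline_quantile)   # position of the baseline

# Spacings: non-negative gaps between adjacent quantiles
spacings <- diff(q)

# Rebuild every quantile from the baseline value,
# adding cumulative spacings above it and subtracting them below it
above <- q[b] + cumsum(spacings[b:length(spacings)])
below <- q[b] - rev(cumsum(rev(spacings[seq_len(b - 1)])))
rebuilt <- c(below, q[b], above)

all.equal(rebuilt, q)
```

Because each spacing is non-negative, quantiles built this way cannot cross.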

algorithm

Which algorithm to use for fitting the underlying regressions: one of "sfn", "br", "lasso", "post_lasso", or the name of a function that estimates quantiles. The default in the signature above is "two_pass".

tails

what distribution to use when fitting the tails, either "gaussian" or "exponential"

parallel

whether to run bootstrap in parallel

calc_se

boolean, whether or not to calculate standard errors. Defaults to FALSE.

weights

optional vector of weights for weighted quantile regression

control

control parameters to pass to the control arguments of quantreg_spacing, the lower-level function called by qs. This is set via the function qs_control, which returns a named list, with elements including:

  • trunc: whether to truncate residual values below the argument "small"

  • small: threshold for "small" values, used to guarantee numerical stability. If not specified, it is set dynamically based on the standard deviation of the outcome variable.

  • lambda: the penalization factor passed to penalized regression algorithms, which weights the penalty. If not set, it is chosen by 10-fold cross-validation.

  • output_quantiles: whether to save fitted quantiles as part of the function output

  • calc_avg_me: whether to return average marginal effects as part of the fitted object

std_err_control

control parameters to pass to the control arguments of quantreg_spacing, the lower-level function called by standard_errors. Constructed via the se_control function. Possible arguments include:

  • se_method: Method to use for standard errors: one of "weighted_bootstrap", "subsample", "bootstrap", or "custom". With "custom", you must also specify sampling_method, subsample_percent, and draw_weights. With "subsample", subsample_percent defaults to 0.2 but can be changed. See Details.

  • num_bs: Number of bootstrap iterations to use, defaults to 100.

  • subsample_percent: A number between 0 and 1 giving the fraction of the data to subsample for standard error calculations

  • draw_weights: Whether to use random exponential weights for bootstrap, either TRUE or FALSE

  • sampling_method: One of "leaveRows", "subsampleRows", or "bootstrapRows". leaveRows doesn't resample rows at all. subsampleRows samples without replacement a given percentage of the data (specified via subsample_percent), and bootstrapRows samples with replacement.
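
The three sampling_method options and draw_weights can be pictured with base R alone. This is a sketch of the resampling schemes as described in the list above, not the package's internals: draw_rows is a hypothetical helper, and using rexp with rate 1 for the random exponential weights is an assumption.

```r
# Hypothetical helper: returns the row indices used in one replication
draw_rows <- function(n, sampling_method, subsample_percent = 0.2) {
  switch(sampling_method,
    # keep every row exactly once
    leaveRows     = seq_len(n),
    # a fraction of the rows, drawn without replacement
    subsampleRows = sample.int(n, size = floor(subsample_percent * n)),
    # n rows drawn with replacement
    bootstrapRows = sample.int(n, size = n, replace = TRUE),
    stop("unknown sampling_method: ", sampling_method)
  )
}

set.seed(42)
n <- 50
idx_leave <- draw_rows(n, "leaveRows")
idx_sub   <- draw_rows(n, "subsampleRows", subsample_percent = 0.2)
idx_boot  <- draw_rows(n, "bootstrapRows")

# draw_weights = TRUE: one random exponential weight per retained row
# (assumed here to be Exp(1), which has mean 1)
w <- rexp(length(idx_leave), rate = 1)

lengths(list(leave = idx_leave, sub = idx_sub, boot = idx_boot))
```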

Examples

if (FALSE) {
library(mice)
set.seed(1)  # make the simulated data and missingness pattern reproducible
x <- rnorm(10000)
x[sample(seq_along(x), 100)] <- NA
x <- matrix(x, ncol = 10)

# get optimal lambdas from CV search based on complete data
bl <- make_penalized_blots(x)

# pass those to the lasso and get imputations
imputations <- mice::mice(x, m = 10,
                          defaultMethod = c("qs", "logreg", "polyreg", "polr"),
                          blots = bl, algorithm = "lasso")
}
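
In mice, blots is a named list with one entry per imputed block (here, per column), and each entry is itself a list of extra arguments handed to that block's imputation function. The sketch below builds a list of that shape by hand in base R. The lambda element name and its values are purely illustrative assumptions; the actual contents returned by make_penalized_blots are not documented on this page.

```r
# Column names as mice would see them after converting an unnamed
# 10-column matrix to a data frame
cols <- paste0("V", 1:10)

# Hypothetical per-column penalties (placeholders, not CV results)
lambdas <- seq(0.01, 0.1, length.out = length(cols))

# One list of extra arguments per column, named by column
bl_manual <- setNames(
  lapply(lambdas, function(l) list(lambda = l)),
  cols
)

str(bl_manual[1:2])
```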