This function estimates the same model multiple times using different size subsets of a set of choice data. The number of models to run is set by the nbreaks argument, which breaks up the data into groups of increasing sample sizes. All models are estimated models using the logitr package.

estimateModels(
  nbreaks = 10,
  nQPerResp = 1,
  data,
  outcome,
  obsID,
  pars,
  price = NULL,
  randPars = NULL,
  randPrice = NULL,
  modelSpace = "pref",
  weights = NULL,
  panelID = NULL,
  clusterID = NULL,
  robust = FALSE,
  startParBounds = c(-1, 1),
  startVals = NULL,
  numMultiStarts = 1,
  useAnalyticGrad = TRUE,
  scaleInputs = TRUE,
  standardDraws = NULL,
  numDraws = 50,
  vcov = FALSE,
  predict = FALSE,
  options = list(print_level = 0, xtol_rel = 1e-06, xtol_abs = 1e-06, ftol_rel = 1e-06,
    ftol_abs = 1e-06, maxeval = 1000, algorithm = "NLOPT_LD_LBFGS")
)

Arguments

nbreaks

The number of different sample size groups.

nQPerResp

Number of questions per respondent. Defaults to 1 if not specified.

data

The data, formatted as a data.frame object.

outcome

The name of the column that identifies the outcome variable, which should be coded with a 1 for TRUE and 0 for FALSE.

obsID

The name of the column that identifies each observation.

pars

The names of the parameters to be estimated in the model. Must be the same as the column names in the data argument. For WTP space models, do not include price in pars.

price

The name of the column that identifies the price variable. Required for WTP space models. Defaults to NULL.

randPars

A named vector whose names are the random parameters and values the distribution: 'n' for normal or 'ln' for log-normal. Defaults to NULL.

randPrice

The random distribution for the price parameter: 'n' for normal or 'ln' for log-normal. Only used for WTP space MXL models. Defaults to NULL.

modelSpace

Set to 'wtp' for WTP space models. Defaults to "pref".

weights

The name of the column that identifies the weights to be used in model estimation. Defaults to NULL.

panelID

The name of the column that identifies the individual (for panel data where multiple observations are recorded for each individual). Defaults to NULL.

clusterID

The name of the column that identifies the cluster groups to be used in model estimation. Defaults to NULL.

robust

Determines whether or not a robust covariance matrix is estimated. Defaults to FALSE. Specification of a clusterID or weights will override the user setting and set this to `TRUE' (a warning will be displayed in this case). Replicates the functionality of Stata's cmcmmixlogit.

startParBounds

sets the lower and upper bounds for the starting parameters for each optimization run, which are generated by runif(n, lower, upper). Defaults to c(-1, 1).

startVals

is vector of values to be used as starting values for the optimization. Only used for the first run if numMultiStarts > 1. Defaults to NULL.

numMultiStarts

is the number of times to run the optimization loop, each time starting from a different random starting point for each parameter between startParBounds. Recommended for non-convex models, such as WTP space models and mixed logit models. Defaults to 1.

useAnalyticGrad

Set to FALSE to use numerically approximated gradients instead of analytic gradients during estimation. For now, using the analytic gradient is faster for MNL models but slower for MXL models. Defaults to TRUE.

scaleInputs

By default each variable in data is scaled to be between 0 and 1 before running the optimization routine because it usually helps with stability, especially if some of the variables have very large or very small values (e.g. > 10^3 or < 10^-3). Set to FALSE to turn this feature off. Defaults to TRUE.

standardDraws

By default, a new set of standard normal draws are generated during each call to logitr (the same draws are used during each multistart iteration). The user can override those draws by providing a matrix of standard normal draws if desired. Defaults to NULL.

numDraws

The number of Halton draws to use for MXL models for the maximum simulated likelihood. Defaults to 50.

vcov

Set to TRUE to evaluate and include the variance-covariance matrix and coefficient standard errors in the returned object. Defaults to FALSE.

predict

If FALSE, predicted probabilities, fitted values, and residuals are not included in the returned object. Defaults to TRUE.

options

A list of options for controlling the nloptr() optimization. Run nloptr::nloptr.print.options() for details.

Value

Returns a nested data frame with each estimated model object in the model column.

Examples

library(conjointTools)

# Define the attributes and levels
levels <- list(
  price     = seq(1, 4, 0.5), # $ per pound
  type      = c('Fuji', 'Gala', 'Honeycrisp', 'Pink Lady', 'Red Delicious'),
  freshness = c('Excellent', 'Average', 'Poor')
)

# Make a full-factorial design of experiment and recode the levels
doe <- makeDoe(levels)
doe <- recodeDoe(doe, levels)

# Make the survey
survey <- makeSurvey(
    doe       = doe,  # Design of experiment
    nResp     = 2000, # Total number of respondents (upper bound)
    nAltsPerQ = 3,    # Number of alternatives per question
    nQPerResp = 6     # Number of questions per respondent
)

# Simulate random choices for the survey
data <- simulateChoices(
    survey = survey,
    obsID  = "obsID"
)

# Estimate models with different sample sizes
models <- estimateModels(
    nbreaks = 10,
    data    = data,
    pars    = c("price", "type", "freshness"),
    outcome = "choice",
    obsID   = "obsID"
)