Use this function to estimate multinomial (MNL) and mixed logit (MXL)
models with "Preference" space or "Willingness-to-pay" (WTP) space utility
parameterizations. The function includes an option to run a multistart
optimization loop with random starting points in each iteration, which is
useful for non-convex problems like MXL models or models with WTP space
utility parameterizations. The main optimization loop uses the nloptr()
function to minimize the negative log-likelihood function.
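The multistart idea described above can be sketched in base R. This is only an illustration under stated assumptions: the objective `neg_ll` is a hypothetical toy function standing in for a negative log-likelihood, and base R's `optim()` is used in place of `nloptr()`. It is not the internal logitr implementation.

```r
# Toy non-convex objective standing in for a negative log-likelihood.
# (Hypothetical; each coordinate has one local and one global minimum.)
neg_ll <- function(x) sum(x^4 - 3 * x^2 + x)

set.seed(123)
num_multistarts <- 5       # number of optimization runs
bounds <- c(-1, 1)         # bounds for the random starting values
n_pars <- 2                # number of parameters

# Run the optimizer once per random starting point
runs <- lapply(seq_len(num_multistarts), function(i) {
  start <- runif(n_pars, bounds[1], bounds[2])
  optim(start, neg_ll, method = "L-BFGS-B")
})

# Keep the run with the lowest objective value
values <- vapply(runs, function(r) r$value, numeric(1))
best <- runs[[which.min(values)]]

best$par
best$value
```

Because different starting points can converge to different local minima, comparing the objective values across runs is what guards against reporting a poor local solution.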
Usage
logitr(
  data,
  outcome,
  obsID,
  pars,
  scalePar = NULL,
  randPars = NULL,
  randScale = NULL,
  modelSpace = NULL,
  weights = NULL,
  panelID = NULL,
  clusterID = NULL,
  robust = FALSE,
  correlation = FALSE,
  startValBounds = c(-1, 1),
  startVals = NULL,
  numMultiStarts = 1,
  useAnalyticGrad = TRUE,
  scaleInputs = TRUE,
  standardDraws = NULL,
  drawType = "halton",
  numDraws = 50,
  numCores = NULL,
  vcov = FALSE,
  predict = TRUE,
  options = list(print_level = 0, xtol_rel = 1e-06, xtol_abs = 1e-06, ftol_rel = 1e-06,
    ftol_abs = 1e-06, maxeval = 1000, algorithm = "NLOPT_LD_LBFGS"),
  price,
  randPrice,
  choice,
  parNames,
  choiceName,
  obsIDName,
  priceName,
  weightsName,
  clusterName,
  cluster
)
Arguments
- data
The data, formatted as a data.frame object.
- outcome
The name of the column that identifies the outcome variable, which should be coded with a 1 for TRUE and 0 for FALSE.
- obsID
The name of the column that identifies each observation.
- pars
The names of the parameters to be estimated in the model. Must be the same as the column names in the data argument. For WTP space models, do not include the scalePar variable in pars.
- scalePar
The name of the column that identifies the scale variable, which is typically "price" for WTP space models, but could be any continuous variable, such as "time". Defaults to NULL.
- randPars
A named vector whose names are the random parameters and whose values are the distributions: 'n' for normal, 'ln' for log-normal, or 'cn' for zero-censored normal. Defaults to NULL.
- randScale
The random distribution for the scale parameter: 'n' for normal, 'ln' for log-normal, or 'cn' for zero-censored normal. Only used for WTP space MXL models. Defaults to NULL.
- modelSpace
This argument is no longer needed as of v0.7.0. The model space is now determined based on the scalePar argument: if NULL (the default), the model will be in the preference space, otherwise it will be in the WTP space. Defaults to NULL.
- weights
The name of the column that identifies the weights to be used in model estimation. Defaults to NULL.
- panelID
The name of the column that identifies the individual (for panel data where multiple observations are recorded for each individual). Defaults to NULL.
- clusterID
The name of the column that identifies the cluster groups to be used in model estimation. Defaults to NULL.
- robust
Determines whether or not a robust covariance matrix is estimated. Defaults to FALSE. Specifying a clusterID or weights will override the user setting and set this to TRUE (a warning will be displayed in this case). Replicates the functionality of Stata's cmmixlogit.
- correlation
Set to TRUE to account for correlation across random parameters (correlated heterogeneity). Defaults to FALSE.
- startValBounds
Sets the lower and upper bounds for the starting parameter values for each optimization run, which are generated by runif(n, lower, upper). Defaults to c(-1, 1).
- startVals
A vector of values to be used as starting values for the optimization. Only used for the first run if numMultiStarts > 1. Defaults to NULL.
- numMultiStarts
The number of times to run the optimization loop, each time starting from a different random starting point for each parameter within startValBounds. Recommended for non-convex models, such as WTP space models and mixed logit models. Defaults to 1.
- useAnalyticGrad
Set to FALSE to use numerically approximated gradients instead of analytic gradients during estimation. For now, using the analytic gradient is faster for MNL models but slower for MXL models. Defaults to TRUE.
- scaleInputs
By default, each variable in data is scaled to be between 0 and 1 before running the optimization routine because it usually helps with stability, especially if some of the variables have very large or very small values (e.g. > 10^3 or < 10^-3). Set to FALSE to turn this feature off. Defaults to TRUE.
- standardDraws
By default, a new set of standard normal draws is generated during each call to logitr (the same draws are used during each multistart iteration). The user can override those draws by providing a matrix of standard normal draws if desired. Defaults to NULL.
- drawType
Specify the draw type as a character: "halton" (the default) or "sobol" (recommended for models with more than 5 random parameters).
- numDraws
The number of draws (Halton or Sobol, depending on drawType) to use in the maximum simulated likelihood for MXL models. Defaults to 50.
- numCores
The number of cores to use for parallel processing of the multistart. Set to 1 to serially run the multistart. Defaults to NULL, in which case the number of cores is set to parallel::detectCores() - 1. Max cores allowed is capped at parallel::detectCores().
- vcov
Set to TRUE to evaluate and include the variance-covariance matrix and coefficient standard errors in the returned object. Defaults to FALSE.
- predict
If FALSE, predicted probabilities, fitted values, and residuals are not included in the returned object. Defaults to TRUE.
- options
A list of options for controlling the nloptr() optimization. Run nloptr::nloptr.print.options() for details.
- price
No longer used as of v0.7.0 - if provided, this is passed to the scalePar argument and a warning is displayed.
- randPrice
No longer used as of v0.7.0 - if provided, this is passed to the randScale argument and a warning is displayed.
- choice
No longer used as of v0.4.0 - if provided, this is passed to the outcome argument and a warning is displayed.
- parNames
No longer used as of v0.2.3 - if provided, this is passed to the pars argument and a warning is displayed.
- choiceName
No longer used as of v0.2.3 - if provided, this is passed to the outcome argument and a warning is displayed.
- obsIDName
No longer used as of v0.2.3 - if provided, this is passed to the obsID argument and a warning is displayed.
- priceName
No longer used as of v0.2.3 - if provided, this is passed to the scalePar argument and a warning is displayed.
- weightsName
No longer used as of v0.2.3 - if provided, this is passed to the weights argument and a warning is displayed.
- clusterName
No longer used as of v0.2.3 - if provided, this is passed to the clusterID argument and a warning is displayed.
- cluster
No longer used as of v0.2.3 - if provided, this is passed to the clusterID argument and a warning is displayed.
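The interaction between startVals, startValBounds, and numMultiStarts described above can be pictured with a small base R sketch. All variable names here (`start_val_bounds`, `start_vals`, `starts`) are hypothetical, and the code only mirrors the documented behavior (user-supplied values for the first run, `runif()` draws for the rest); it is not taken from the logitr source.

```r
set.seed(42)

start_val_bounds <- c(-1, 1)          # lower and upper bounds for random starts
num_multistarts  <- 3                 # number of optimization runs
n_pars           <- 4                 # number of model parameters
start_vals <- c(0.1, 0.2, 0.3, 0.4)   # user-supplied starting values (hypothetical)

# Build one vector of starting values per multistart run
starts <- lapply(seq_len(num_multistarts), function(i) {
  if (i == 1 && !is.null(start_vals)) {
    start_vals   # user-supplied values are only used for the first run
  } else {
    runif(n_pars, start_val_bounds[1], start_val_bounds[2])
  }
})

starts
```

Widening the bounds spreads the random starting points over a larger region, which can help when the optimizer keeps converging to the same local solution.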
Value
The function returns a list object containing the following objects.
| Value | Description |
|---|---|
| coefficients | The model coefficients at convergence. |
| logLik | The log-likelihood value at convergence. |
| nullLogLik | The null log-likelihood value (if all coefficients are 0). |
| gradient | The gradient of the log-likelihood at convergence. |
| hessian | The hessian of the log-likelihood at convergence. |
| probabilities | Predicted probabilities. Not returned if predict = FALSE. |
| fitted.values | Fitted values. Not returned if predict = FALSE. |
| residuals | Residuals. Not returned if predict = FALSE. |
| startVals | The starting values used. |
| multistartNumber | The multistart run number for this model. |
| multistartSummary | A summary of the log-likelihood values for each multistart run (if more than one multistart was used). |
| time | The user, system, and elapsed time to run the optimization. |
| iterations | The number of iterations until convergence. |
| message | A more informative message with the status of the optimization result. |
| status | An integer value with the status of the optimization (positive values are successes). Use statusCodes() for a detailed description. |
| call | The matched call to logitr(). |
| inputs | A list of the original inputs to logitr(). |
| data | A list of the original data provided to logitr() broken up into components used during model estimation. |
| numObs | The number of observations. |
| numParams | The number of model parameters. |
| freq | The frequency counts of each alternative. |
| modelType | The model type, 'mnl' for multinomial logit or 'mxl' for mixed logit. |
| weightsUsed | TRUE or FALSE for whether weights were used in the model. |
| numClusters | The number of clusters. |
| parSetup | A summary of the distributional assumptions on each model parameter ("f"="fixed", "n"="normal distribution", "ln"="log-normal distribution"). |
| parIDs | A list identifying the indices of each parameter in coefficients by a variety of types. |
| scaleFactors | A vector of the scaling factors used to scale each coefficient during estimation. |
| standardDraws | The draws used during maximum simulated likelihood (for MXL models). |
| options | A list of options for controlling the nloptr() optimization. Run nloptr::nloptr.print.options() for details. |
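As a brief illustration of the elements listed above, the sketch below fits the same preference-space MNL model used in the Examples section and reads a few elements from the returned list by name. The object name `model` is arbitrary, and the sketch assumes the logitr package and its bundled yogurt data are available.

```r
library(logitr)

# Fit the preference-space MNL model from the Examples section
model <- logitr(
  data    = yogurt,
  outcome = "choice",
  obsID   = "obsID",
  pars    = c("price", "feat", "brand")
)

# Elements are accessed by name, as with any list
model$coefficients   # model coefficients at convergence
model$logLik         # log-likelihood at convergence
model$nullLogLik     # log-likelihood with all coefficients at 0
model$status         # integer status; positive values are successes
model$message        # human-readable status message

# Look up what each status code means
statusCodes()
```

Checking status and message before interpreting the coefficients is a cheap way to catch runs that stopped at maxeval rather than at a tolerance.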
Details
The options argument is used to control the detailed behavior of the
optimization and must be passed as a list, e.g. options = list(...).
Below is a list of the default options, but other options can be included.
Run nloptr::nloptr.print.options() for more details.
| Argument | Description | Default |
|---|---|---|
| xtol_rel | The relative x tolerance for the nloptr optimization loop. | 1.0e-6 |
| xtol_abs | The absolute x tolerance for the nloptr optimization loop. | 1.0e-6 |
| ftol_rel | The relative f tolerance for the nloptr optimization loop. | 1.0e-6 |
| ftol_abs | The absolute f tolerance for the nloptr optimization loop. | 1.0e-6 |
| maxeval | The maximum number of function evaluations for the nloptr optimization loop. | 1000 |
| algorithm | The optimization algorithm that nloptr uses. | "NLOPT_LD_LBFGS" |
| print_level | The print level of the nloptr optimization loop. | 0 |
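A minimal sketch of passing a custom options list follows. It spells out every default explicitly and changes only the tolerances and maxeval, so it does not rely on any assumption about how omitted entries are filled in. The specific values chosen (1e-8 and 2000) are arbitrary examples, not recommendations.

```r
library(logitr)

# Start from the documented defaults and tighten a few settings
opts <- list(
  print_level = 0,
  xtol_rel    = 1e-8,   # tighter than the 1e-6 default
  xtol_abs    = 1e-8,
  ftol_rel    = 1e-8,
  ftol_abs    = 1e-8,
  maxeval     = 2000,   # allow more function evaluations than the default 1000
  algorithm   = "NLOPT_LD_LBFGS"
)

model <- logitr(
  data    = yogurt,
  outcome = "choice",
  obsID   = "obsID",
  pars    = c("price", "feat", "brand"),
  options = opts
)

model$iterations   # number of iterations until convergence
model$message      # status of the optimization result
```

Tighter tolerances generally cost more iterations, so comparing iterations and logLik against a default run shows whether the extra precision changes the result.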
Examples
# For more detailed examples, visit
# https://jhelvy.github.io/logitr/articles/
library(logitr)
# Estimate a MNL model in the Preference space
mnl_pref <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  pars = c("price", "feat", "brand")
)
#> Running model...
#> Done!
# Estimate a MNL model in the WTP space, using a 5-run multistart
mnl_wtp <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  pars = c("feat", "brand"),
  scalePar = "price",
  numMultiStarts = 5
)
#> Running multistart...
#> Random starting point iterations: 5
#> Number of cores: 3
#> Done!
# Estimate a MXL model in the Preference space with "feat"
# following a normal distribution
# Panel structure is accounted for in this example using "panelID"
mxl_pref <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  panelID = "id",
  pars = c("price", "feat", "brand"),
  randPars = c(feat = "n")
)
#> Running model...
#> Done!
