Once a model has been estimated, it can be used to predict choices for a set of alternatives. This vignette demonstrates examples of how to do so using the `predictChoices()` function along with the results of an estimated model.

To predict choices, you first need to define a set of alternatives for which you want to make predictions. Each row should be an alternative, and each column an attribute. I will predict choices on the full `yogurt` data set, which was used to estimate each of the models in this example.

This example uses the yogurt data set from Jain et al. (1994). The data set contains 2,412 choice observations from a series of yogurt purchases by a panel of 100 households in Springfield, Missouri, over a roughly two-year period. The data were collected by optical scanners and contain information about the price, brand, and a “feature” variable, which identifies whether a newspaper advertisement was shown to the customer. There are four brands of yogurt: Yoplait, Dannon, Weight Watchers, and Hiland, with market shares of 34%, 40%, 23% and 3%, respectively.

```
head(yogurt)
#>   id obsID alt choice price feat   brand dannon hiland weight yoplait
#> 1  1     1   1      0   8.1    0  dannon      1      0      0       0
#> 2  1     1   2      0   6.1    0  hiland      0      1      0       0
#> 3  1     1   3      1   7.9    0  weight      0      0      1       0
#> 4  1     1   4      0  10.8    0 yoplait      0      0      0       1
#> 5  1     2   1      1   9.8    0  dannon      1      0      0       0
#> 6  1     2   2      0   6.4    0  hiland      0      1      0       0
```

In the example below, I estimate a preference space MNL model called `mnl_pref`. I can then use the `predictChoices()` function with the `mnl_pref` model to predict the choices for each set of alternatives in the `yogurt` data set:

```
# Estimate the model
mnl_pref <- logitr(
  data   = yogurt,
  choice = 'choice',
  obsID  = 'obsID',
  pars   = c('price', 'feat', 'brand')
)

# Predict choices
choices_mnl_pref <- predictChoices(
  model = mnl_pref,
  alts  = yogurt,
  altID = "alt",
  obsID = "obsID"
)
```

```
# Preview actual and predicted choices
head(choices_mnl_pref[c('obsID', 'choice', 'choice_predict')])
#>     obsID choice choice_predict
#> 1.1     1      0              0
#> 1.2     1      0              0
#> 1.3     1      1              1
#> 1.4     1      0              0
#> 2.5     2      1              1
#> 2.6     2      0              0
```

The resulting `choices_mnl_pref` data frame contains the same `alts` data frame with an additional column, `choice_predict`, which contains the predicted choices. You can quickly compute the prediction accuracy by dividing the number of correctly predicted choices by the total number of choices.
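For example, with the `choices_mnl_pref` data frame created above, this can be done in base R as follows (a quick sketch; any equivalent tabulation works):

```
# Keep only the rows that were actually chosen (choice == 1), then compute
# the share of those rows that the model also predicted as chosen
chosen   <- subset(choices_mnl_pref, choice == 1)
accuracy <- sum(chosen$choice_predict == chosen$choice) / nrow(chosen)
accuracy
```

For the `mnl_pref` model this works out to roughly 39%, matching the model comparison at the end of this vignette.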

You can also use WTP space models to predict choices. For example, here are the results from an equivalent model but in the WTP space:

```
# Estimate the model
mnl_wtp <- logitr(
  data   = yogurt,
  choice = 'choice',
  obsID  = 'obsID',
  pars   = c('feat', 'brand'),
  price  = 'price',
  modelSpace = 'wtp',
  numMultiStarts = 10
)

# Make predictions
choices_mnl_wtp <- predictChoices(
  model = mnl_wtp,
  alts  = yogurt,
  altID = "alt",
  obsID = "obsID"
)
```

```
#> NOTE: Using results from run 8 of 10 multistart runs
#> (the run with the largest log-likelihood value)
```

You can also use mixed logit models to predict choices. Heterogeneity is modeled by simulating draws from the estimated population-level distributions of the model parameters. Here is an example using a preference space mixed logit model:

```
# Estimate the model
mxl_pref <- logitr(
  data   = yogurt,
  choice = 'choice',
  obsID  = 'obsID',
  pars   = c('price', 'feat', 'brand'),
  randPars = c(feat = 'n', brand = 'n'),
  numMultiStarts = 5
)

# Make predictions
choices_mxl_pref <- predictChoices(
  model = mxl_pref,
  alts  = yogurt,
  altID = "alt",
  obsID = "obsID"
)
```

Likewise, mixed logit WTP space models can also be used to predict choices:

```
# Estimate the model
mxl_wtp <- logitr(
  data   = yogurt,
  choice = 'choice',
  obsID  = 'obsID',
  pars   = c('feat', 'brand'),
  price  = 'price',
  randPars = c(feat = 'n', brand = 'n'),
  modelSpace = 'wtp',
  numMultiStarts = 5
)

# Make predictions
choices_mxl_wtp <- predictChoices(
  model = mxl_wtp,
  alts  = yogurt,
  altID = "alt",
  obsID = "obsID"
)
```

Finally, we can compare the prediction accuracy across all four models:

```
library(dplyr)

# Combine predictions from all models into one data frame
choices <- rbind(
  choices_mnl_pref, choices_mnl_wtp, choices_mxl_pref, choices_mxl_wtp)
choices$model <- c(
  rep("mnl_pref", nrow(choices_mnl_pref)),
  rep("mnl_wtp",  nrow(choices_mnl_wtp)),
  rep("mxl_pref", nrow(choices_mxl_pref)),
  rep("mxl_wtp",  nrow(choices_mxl_wtp)))

# Compute prediction accuracy by model
choices %>%
  filter(choice == 1) %>%
  mutate(predict_correct = (choice_predict == choice)) %>%
  group_by(model) %>%
  summarise(p_correct = sum(predict_correct) / n())
#> # A tibble: 4 × 2
#>   model    p_correct
#>   <chr>        <dbl>
#> 1 mnl_pref     0.390
#> 2 mnl_wtp      0.362
#> 3 mxl_pref     0.390
#> 4 mxl_wtp      0.379
```

The models all perform about the same, correctly predicting roughly 38% of choices. This is substantially better than random guessing, which with four alternatives per choice observation would be correct only 25% of the time.
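The 25% baseline follows directly from the data structure: each choice observation in `yogurt` contains four alternatives, so a uniformly random guess picks the chosen one with probability 1/4. A quick simulation sketch (the seed value is arbitrary):

```
library(logitr)  # provides the yogurt data set
set.seed(1234)   # arbitrary seed for reproducibility

# For each observation, randomly pick one alternative and check whether
# it was the one actually chosen
random_correct <- sapply(split(yogurt$choice, yogurt$obsID), function(ch) {
  ch[sample(length(ch), 1)] == 1
})
mean(random_correct)  # should be close to 0.25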