layout: true --- class: middle, inverse ## .center[Designing Conjoint Surveys with {cbcTools}] .leftcol40[ <center> <img src="https://jhelvy.github.io/2023-qux-conf-conjoint/images/logo.png" width=300> </center> ] .rightcol60[ ###
John Paul Helveston ###
The George Washington University | Dept. of Engineering Management and Systems Engineering ###
June 15, 2023 ] --- class: center ### Designing a Choice-Based Conjoint Survey is Hard -- <center> <img src="images/tradeoffs1.png" width=85%> </center> --- class: center ### Designing a Choice-Based Conjoint Survey is Hard <center> <img src="images/tradeoffs2.png" width=85%> </center> --- class: center ### Designing a Choice-Based Conjoint Survey is Hard <center> <img src="images/tradeoffs3.png" width=85%> </center> --- # Many R packages for design of experiment - [{cbcTools}](https://github.com/jhelvy/cbcTools) - [{ExpertChoice}](https://github.com/JedStephens/ExpertChoice) - [{support.CEs}](https://www.jstatsoft.org/article/view/v050c02) - [{idefix}](https://www.jstatsoft.org/article/view/v096i03) - [{choiceDes}](https://cran.r-project.org/web/packages/choiceDes/index.html) --- # Many R packages for design of experiment - [{cbcTools}](https://github.com/jhelvy/cbcTools) .red[<-- Does a lot more than just DOE!] - [{ExpertChoice}](https://github.com/JedStephens/ExpertChoice) - [{support.CEs}](https://www.jstatsoft.org/article/view/v050c02) - [{idefix}](https://www.jstatsoft.org/article/view/v096i03) - [{choiceDes}](https://cran.r-project.org/web/packages/choiceDes/index.html) --- class: middle, center, inverse ## .center[A systematic workflow for designing a CBC experiment] <center> <img src="images/logo.png" width=30%> </center> --- background-image: url("images/process.png") ## .center[A systematic workflow for designing a CBC experiment] --- background-image: url("images/process_labels.png") ## .center[A systematic workflow for designing a CBC experiment] --- class: center, middle background-image: url("images/process_labels.png") .border[ <center> <img src="images/cbc_screenshot.png" width=80%> </center> ] --- background-image: url("images/process_levels.png") --- class: center background-color: #fff ## Example CBC question about apples <center> <img src="images/example.png" width=800> </center> --- background-color: #fff ## .center[Define the attributes and levels] <br> .leftcol30[ <center> <img src="images/fuji.jpg" width=250> </center> ] .rightcol70[ <br> - **Price ($/lb)**: 1.00, 1.50, 2.00, 2.50, 3.00, 3.50, 4.00 - **Type**: Fuji, Gala, Honeycrisp - **Freshness**: Excellent, Average, Poor ] --- background-image: url("images/process_profiles.png") --- # .center[Generate all possible profiles] <br> ```r profiles <- cbc_profiles( price = seq(1, 4, 0.5), # $ per pound type = c('Fuji', 'Gala', 'Honeycrisp'), freshness = c('Poor', 'Average', 'Excellent') ) ``` -- .leftcol[ ```r head(profiles) ``` ``` #> profileID price type freshness #> 1 1 1.0 Fuji Poor #> 2 2 1.5 Fuji Poor #> 3 3 2.0 Fuji Poor #> 4 4 2.5 Fuji Poor #> 5 5 3.0 Fuji Poor #> 6 6 3.5 Fuji Poor ``` ] .rightcol[ ```r tail(profiles) ``` ``` #> profileID price type freshness #> 58 58 1.5 Honeycrisp Excellent #> 59 59 2.0 Honeycrisp Excellent #> 60 60 2.5 Honeycrisp Excellent #> 61 61 3.0 Honeycrisp Excellent #> 62 62 3.5 Honeycrisp Excellent #> 63 63 4.0 Honeycrisp Excellent ``` ] --- ## .center[Generate a restricted set of profiles?] -- > CAUTION: including restrictions in your designs can substantially reduce the statistical power of your design, so use them cautiously (and avoid them if possible). -- ```r restricted_profiles <- cbc_restrict( profiles, type == "Gala" & price %in% c(1.5, 2.5, 3.5), type == "Honeycrisp" & price < 2, type == "Fuji" & freshness == "Poor" ) dim(restricted_profiles) ``` ``` #> [1] 41 4 ``` --- background-image: url("images/process_design.png") --- # .center[Generate a survey design] ```r design <- cbc_design( profiles = profiles, n_resp = 300, # Number of respondents n_alts = 3, # Number of alternatives per question n_q = 6 # Number of questions per respondent ) ``` -- ```r head(design) ``` ``` #> profileID respID qID altID obsID price type freshness #> 1 53 1 1 1 1 2.5 Gala Excellent #> 2 45 1 1 2 1 2.0 Fuji Excellent #> 3 33 1 1 3 1 3.0 Gala Average #> 4 19 1 2 1 2 3.0 Honeycrisp Poor #> 5 14 1 2 2 2 4.0 Gala Poor #> 6 28 1 2 3 2 4.0 Fuji Average ``` --- # .center[Include a "no choice" option] ```r design <- cbc_design( profiles = profiles, n_resp = 300, # Number of respondents n_alts = 3, # Number of alternatives per question n_q = 6, # Number of questions per respondent * no_choice = TRUE ) ``` -- ```r head(design) ``` ``` #> profileID respID qID altID obsID price type_Fuji type_Gala type_Honeycrisp freshness_Poor freshness_Average freshness_Excellent no_choice #> 1 6 1 1 1 1 3.5 1 0 0 1 0 0 0 #> 2 1 1 1 2 1 1.0 1 0 0 1 0 0 0 #> 3 27 1 1 3 1 3.5 1 0 0 0 1 0 0 #> 4 0 1 1 4 1 0.0 0 0 0 0 0 0 1 #> 5 48 1 2 1 2 3.5 1 0 0 0 0 1 0 #> 6 1 1 2 2 2 1.0 1 0 0 1 0 0 0 ``` --- ## .center[Make a labeled design] .center[.font100[(aka "alternative-specific design")]] ```r design <- cbc_design( profiles = profiles, n_resp = 300, # Number of respondents n_alts = 3, # Number of alternatives per question n_q = 6, # Number of questions per respondent * label = "type" ) ``` -- ```r head(design) ``` ``` #> profileID respID qID altID obsID price type freshness #> 1 22 1 1 1 1 1.0 Fuji Average #> 2 55 1 1 2 1 3.5 Gala Excellent #> 3 63 1 1 3 1 4.0 Honeycrisp Excellent #> 4 28 1 2 1 2 4.0 Fuji Average #> 5 54 1 2 2 2 3.0 Gala Excellent #> 6 57 1 2 3 2 1.0 Honeycrisp Excellent ``` --- # .center[Make a Bayesian D-efficient design] ### .center[(Uses the [`idefix`](https://www.jstatsoft.org/article/view/v096i03) package to generate a design)] ```r design <- cbc_design( profiles = profiles, n_resp = 300, # Number of respondents n_alts = 3, # Number of alternatives per question n_q = 6, # Number of questions per respondent * priors = list( * price = -0.1, # Numeric, modeled as continuous * type = c(0.1, 0.2), # Reference level: "Fuji" * freshness = c(0.1, 0.2) # Reference level: "Poor" * ) ) ``` Priors are defining the following model: $$ u_j = -0.1p_j + 0.1t^{Gala}_j + 0.2t^{Honeycrisp}_j + 0.1f^{Ave}_j + 0.2f^{Excellent}_j + \varepsilon_j $$ --- # .center[Import a design: .blue[Sawtooth]
] ```r library(readr) design <- read_csv('design.csv') head(design) ``` ``` #> respID qID altID obsID price type freshness #> 1 1 1 1 1 1.0 Fuji Average #> 2 1 1 2 1 3.5 Gala Excellent #> 3 1 1 3 1 4.0 Honeycrisp Excellent #> 4 1 2 1 2 4.0 Fuji Average #> 5 1 2 2 2 3.0 Gala Excellent #> 6 1 2 3 2 1.0 Honeycrisp Excellent ``` --- background-image: url("images/process_inspect.png") --- # .center[Check design **balance**] ```r cbc_balance(design) ``` -- .leftcol[ ``` Individual attribute level counts price: 1 1.5 2 2.5 3 3.5 4 784 755 759 741 776 827 758 type: Fuji Gala Honeycrisp 1800 1800 1800 freshness: Poor Average Excellent 1845 1767 1788 ``` ] -- .rightcol[ ``` Pairwise attribute level counts price x type: Fuji Gala Honeycrisp NA 1800 1800 1800 1 784 260 256 268 1.5 755 248 254 253 2 759 259 240 260 2.5 741 239 254 248 3 776 263 286 227 3.5 827 264 258 305 4 758 267 252 239 ``` ] --- # .center[Check design **overlap**] ```r cbc_overlap(design) ``` -- .leftcol[ ``` Counts of attribute overlap: (# of questions with N unique levels) price: 1 2 3 31 630 1139 type: 1 2 3 156 1248 396 freshness: 1 2 3 175 1189 436 ``` ] --- background-image: url("images/process_choices.png") --- # .center[Simulate random choices] ```r data <- cbc_choices( design = design, obsID = "obsID" ) ``` -- ```r head(data) ``` ``` #> profileID respID qID altID obsID price type freshness choice #> 1 22 1 1 1 1 1.0 Fuji Average 0 #> 2 55 1 1 2 1 3.5 Gala Excellent 0 #> 3 63 1 1 3 1 4.0 Honeycrisp Excellent 1 #> 4 28 1 2 1 2 4.0 Fuji Average 1 #> 5 54 1 2 2 2 3.0 Gala Excellent 0 #> 6 57 1 2 3 2 1.0 Honeycrisp Excellent 0 ``` --- ## .center[Simulate choices according to a prior] .center[(Fixed coefficients)] .leftcol[ ```r data <- cbc_choices( design = design, obsID = "obsID", * priors = list( * price = -0.1, * type = c(0.1, 0.2), * freshness = c(0.1, -0.2) * ) ) ``` ] .rightcol[ Attribute | Level | Utility ----------|----------- **Price** | Continuous | -0.1 **Type** | Fuji | 0 | Gala | 0.1 | Honeycrisp | 0.2 **Freshness** | Average | 0 | Excellent | 0.1 | Poor | -0.2 ] --- ## .center[Simulate choices according to a prior] .center[(Random coefficients...currently supports Normal & Log-normal)] .leftcol[ ```r data <- cbc_choices( design = design, obsID = "obsID", priors = list( price = -0.1, * type = randN( * mu = c(0.1, 0.2), * sigma = c(0.5, 1) * ), freshness = c(0.1, -0.2) ) ) ``` ] .rightcol[ Attribute | Level | Utility ----------|----------- **Price** | Continuous | -0.1 **Type** | Fuji | 0 | Gala | N(0.1, 0.5) | Honeycrisp | N(0.2, 1) **Freshness** | Average | 0 | Excellent | 0.1 | Poor | -0.2 ] --- ## .center[Simulate choices according to a prior] .center[(Models with interactions)] .leftcol[ ```r data <- cbc_choices( design = design, obsID = "obsID", priors = list( price = -0.1, type = c(0.1, 0.2), freshness = c(0.1, -0.2), * "price*type" = c(0.1, 0.5) ) ) ``` ] .rightcol[ Attribute | Level | Utility ----------|----------- **Price** | Continuous | -0.1 **Type** | Fuji | 0 | Gala | 0.1 | Honeycrisp | 0.2 **Freshness** | Average | 0 | Excellent | 0.1 | Poor | -0.2 **Price x Type** | Fuji | 0 | Gala | 0.1 | Honeycrisp | 0.5 ] --- background-image: url("images/process_power.png") --- # .center[Conduct a power analysis] ```r power <- cbc_power( nbreaks = 10, n_q = 6, data = data, obsID = "obsID", outcome = "choice", pars = c("price", "type", "freshness") ) ``` -- .leftcol[ ```r head(power) ``` ``` #> sampleSize coef est se #> 1 30 price -0.18969925 0.09519838 #> 2 30 typeGala -0.03074145 0.19389450 #> 3 30 typeHoneycrisp 0.19956629 0.18165533 #> 4 30 freshnessAverage 0.46713033 0.23479744 #> 5 30 freshnessExcellent 0.47712173 0.22886577 #> 6 60 price -0.13185249 0.06576991 ``` ] .rightcol[ ```r tail(power) ``` ``` #> sampleSize coef est se #> 45 270 freshnessExcellent -0.14791440 0.07498391 #> 46 300 price -0.11983143 0.02896963 #> 47 300 typeGala 0.08577075 0.05984688 #> 48 300 typeHoneycrisp 0.22142284 0.05810356 #> 49 300 freshnessAverage 0.17092085 0.07083180 #> 50 300 freshnessExcellent -0.11784026 0.07093520 ``` ] --- # .center[Conduct a power analysis] ```r plot(power) ``` <img src="figs/unnamed-chunk-24-1.png" width="576" /> --- # .center[Conduct a power analysis] .leftcol[ ```r power_int <- cbc_power( nbreaks = 10, n_q = 6, data = data, pars = c( "price", "type", "freshness", * "price*type" ), outcome = "choice", obsID = "obsID" ) ``` ] .rightcol[ ```r plot(power_int) ``` <img src="figs/unnamed-chunk-26-1.png" width="576" /> ] --- background-image: url("images/process_labels.png") <center> <img src="images/logo.png" width=20%> </center> --- background-image: url("images/sawtooth-cbcTools.png") --- background-color: #fff background-image: url("images/cbcTools-sawtooth.png") background-position: center background-size: contain --- class: inverse
−
+
10
:
00
# Your turn - Be sure to have downloaded and unzipped the [practice code](https://jhelvy.github.io/2023-qux-conf-conjoint/practice/2023-qux-conf-conjoint.zip). - Open the `2023-qux-conf-conjoint.Rproj` file to open RStudio. - In RStudio, open the `designing-surveys.R` file. - Experiment with different design options, then examine the power: - What if you modify the quesitons per respondent? - What if you use a labeled design? - What if you include a "no choice" option? - What if you use a Bayesian D-efficient design? --- class: inverse <br> ## .center[Back to workshop website:<br><br> https://jhelvy.github.io/2023-qux-conf-conjoint/] .footer-large[ .right[ @JohnHelveston
<br> @jhelvy
<br> @jhelvy
<br> jhelvy.com
<br> jph@gwu.edu
]]