Artiruno notebook

Created 5 Jul 2021 • Last modified 7 Jan 2024

Notes on decision aids

I'm particularly interested in software and other means of making decisions in situations that aren't already fully quantitatively specified.

Multiattribute utility theory (MAUT)

Generally, multiattribute utility methods compute a utility for each item as the weighted sum of attribute utilities.

Simple multiattribute rating technique (SMART)

Edwards (1977) - Each item's utility is a weighted sum of its attribute utilities. The weights are chosen by giving the least important attribute a weight of 10 and choosing larger integers for the second least important, the third least important, etc. Preference functions for the attributes assume linear preference from the lowest plausible value to the highest plausible value (when there aren't real units to use, the attribute is estimated on an abstract 0-to-100 scale).
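As a sketch, the SMART recipe above might look like this in Python. The attributes, weights, and value ranges here are invented for illustration, not taken from Edwards:

```python
# Hypothetical SMART scoring. Weights: least important attribute gets 10,
# more important attributes get larger integers.

def linear_utility(value, lo, hi):
    """Linear preference from the lowest plausible value (utility 0)
    to the highest plausible value (utility 100)."""
    return 100 * (value - lo) / (hi - lo)

weights = {"price": 30, "comfort": 20, "color": 10}

# Plausible value ranges per attribute. For price, lower is better, so the
# "lowest plausible" end (utility 0) is the highest price.
ranges = {"price": (30_000, 10_000), "comfort": (0, 100), "color": (0, 100)}

def smart_score(item):
    total_weight = sum(weights.values())
    return sum(
        weights[a] * linear_utility(item[a], *ranges[a])
        for a in weights) / total_weight

car = {"price": 20_000, "comfort": 80, "color": 50}
print(smart_score(car))  # → 60.0
```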

Multi-attribute range evaluation (MARE)

Hodgett, Martin, Montague, and Talford (2014) - A variation on a simple weighted-sum method. The weighted-sum method normalizes attributes (by dividing each by its maximum), multiplies by attribute weights, and sums them to get a per-item score. In MARE, you specify up to three values per attribute (the minimum possible value, the most likely value, and the maximum possible value), and then compute the range of possible weighted sums for each item.
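A Python sketch of the MARE idea, with made-up attributes, weights, and numbers; I've assumed the normalizing maximum for each attribute across all items is already known:

```python
# Per attribute, the decision-maker gives up to three values (minimum,
# most likely, maximum). Attributes are normalized by their maximum,
# weighted, and summed, yielding a range of possible scores per item.

weights = {"quality": 0.6, "speed": 0.4}

# (minimum, most likely, maximum) per attribute for one item
item = {"quality": (60, 70, 90), "speed": (40, 50, 55)}

# Per-attribute maximum across all items (assumed known).
attr_max = {"quality": 100, "speed": 60}

def mare_range(item):
    return tuple(
        sum(weights[a] * item[a][i] / attr_max[a] for a in weights)
        for i in range(3))

print(mare_range(item))  # (minimum, most likely, maximum) weighted sums
```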

Swing weighting

A way to decide on attribute weights for a multi-attribute utility method. You find the best and worst values of each of the K attributes among the available items, and then construct K + 1 hypothetical items: one with the worst value on each attribute, and one for each attribute k that has the best value on k and the worst value on all other attributes. You rank these items and then rate each 0 to 100. These ratings, normalized, are used as the weights of the corresponding attributes.
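The final normalization step can be sketched as follows; the attribute names and swing ratings are invented:

```python
# Swing-weighting normalization: each attribute's 0-to-100 rating (of the
# hypothetical item that's best on that attribute and worst on the rest)
# is divided by the sum of ratings to give the attribute's weight.

ratings = {"price": 100, "reliability": 60, "style": 20}

total = sum(ratings.values())
weights = {a: r / total for a, r in ratings.items()}
print(weights)  # weights sum to 1
```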

Analytic Hierarchy Process (AHP)

The decision-maker compares each pair of alternatives on each attribute on a 1-to-9 scale. To weight the attributes, they compare each pair of attributes on a 1-to-9 scale. Then, they can compute the priority for each item.

It's called the AHP because attributes can be optionally combined into categories of attributes, which can themselves be combined into higher-order categories, and so on.
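Saaty's AHP derives the priorities from the principal eigenvector of the pairwise-comparison matrix; the row geometric mean used below is a common approximation. The matrix is a made-up example:

```python
import math

# Attribute i vs. attribute j on the 1-to-9 scale, with reciprocals
# below the diagonal (comparing price, comfort, and style).
comparisons = [
    [1,     3,   5],
    [1/3,   1,   2],
    [1/5, 1/2,   1],
]

def ahp_priorities(m):
    """Approximate AHP priorities via normalized row geometric means."""
    gm = [math.prod(row) ** (1 / len(row)) for row in m]
    total = sum(gm)
    return [g / total for g in gm]

print(ahp_priorities(comparisons))  # priorities sum to 1
```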

Outranking methods

PROMETHEE

The criteria on which all the options are evaluated are either ordinal (like a rating scale) or continuous (like a price in dollars).

For each continuous criterion, the decision-maker chooses the largest difference on the attribute to which he would be indifferent (e.g., $10 of price) and the smallest difference that saturates preference on that criterion (e.g., $1,000 of price). Between these two thresholds, preference increases linearly.

For ordinal criteria, one traditionally uses the 0–1 "usual" preference function, in which any difference at all is regarded as saturating for that criterion (so a difference of 2 steps is no more important than a difference of 1 step).

Each criterion gets a weight. Visual PROMETHEE encourages you to see how decisions change as you change the weights, rather than using a single fixed set of weights.

The PROMETHEE I method produces two metrics for each option O: one that averages the preference for O over all other options, and another that averages the preference for all other options over O. PROMETHEE II combines these by taking the difference.
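A sketch of a linear preference function and the two PROMETHEE I flows, plus PROMETHEE II's net flow, for a single price criterion, using the example thresholds from above ($10 indifference, $1,000 saturation); the option prices are invented, and with only one criterion there's no weighting step:

```python
def linear_pref(d, q=10, p=1000):
    """Preference for a difference d in the criterion's favored direction:
    0 up to the indifference threshold q, 1 past the saturation threshold p,
    linear in between."""
    if d <= q:
        return 0.0
    if d >= p:
        return 1.0
    return (d - q) / (p - q)

# Lower price is better, so the preference of o over x uses prices[x] - prices[o].
prices = {"a": 9_000, "b": 9_500, "c": 10_200}

def flows(options):
    n = len(options)
    phi_plus = {o: sum(linear_pref(options[x] - options[o])
                       for x in options if x != o) / (n - 1)
                for o in options}
    phi_minus = {o: sum(linear_pref(options[o] - options[x])
                        for x in options if x != o) / (n - 1)
                 for o in options}
    return phi_plus, phi_minus

plus, minus = flows(prices)
# PROMETHEE II's net flow is the difference of the two PROMETHEE I flows.
net = {o: plus[o] - minus[o] for o in prices}
print(net)  # net flows sum to 0; higher is better
```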

Verbal decision analysis (VDA)

I've yet to find any VDA paper that specifies an algorithm for which comparisons to ask the decision-maker to make.

ZAPROS

ZAPROS III (Larichev, 2001): All attributes are ordinal. The decision-maker is asked questions like "Would you rather change from an item that's best on attribute A to an item with the second-best value on A, or from an item best on B to an item second-best on B?". Then the decision-maker is confronted with any inconsistent judgments they've made and asked to correct them. This done, a partial order can be constructed on all possible items. No weights or other numeric judgments ever need to be provided.
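ZAPROS never elicits numbers, but even before any comparisons are elicited, a partial order on items exists by simple dominance on the ordinal criteria; the elicited comparisons then extend this partial order. A minimal sketch of the dominance part (the items and rank vectors are invented; rank 1 is the best level on each criterion):

```python
def dominates(x, y):
    """x is at least as good as y on every criterion and differs somewhere."""
    return all(a <= b for a, b in zip(x, y)) and x != y

# Each item is a tuple of per-criterion ranks (1 = best).
items = {"A": (1, 2, 1), "B": (2, 2, 3), "C": (2, 1, 2)}

order = [(i, j) for i in items for j in items
         if dominates(items[i], items[j])]
print(order)  # → [('A', 'B'), ('C', 'B')]; A and C stay incomparable
```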

ZAPROS-LM (Moshkovich, Mechitov, & Olson, 2016) is a variation that asks fewer questions. I think.

Even with ZAPROS III-i (Tamanini & Pinheiro, 2011), it looks like the full preference scale is constructed, even if not necessary to choose the best alternatives.

UniComBOS

Ashikhmin and Furems (2005)

The procedure continues until a best alternative according to U-dominance is selected.

Start with pairwise comparisons of one-criterion units. The subject can choose: A is better, B is better, they're equally good, or "don't know". If that suffices, stop; otherwise, continue to two-criterion units. Also stop if the decision-maker is inconsistent in their choices between several presentations of the same comparison.

There's another kind of inconsistency analysis, too (section 5), based on transitivity.

Other VDA software

Possible extensions to Artiruno core

  • Web interface
    • A button to prematurely abort
    • Allow setting allowed_pairs_callback
    • Show a graph of preferences
  • For either interactive mode
    • Transcripts of choices and inferences
    • Saving of choices partway through the procedure, and later replaying them
    • Proper error messages instead of assertion failures
    • More validation of the interactive input
  • A nicer interactive interface
  • Allow the agent to reply "don't know", a la UniComBOS. (UniComBOS takes this to mean that a preference for the given situation should never be inferrable from other preferences, but I think it makes more sense to understand it as totally uninformative; i.e., "skip this question and do what you can with everything else or other questions".)
  • More sophisticated alt levels
    • Allow an alternative to have a missing value (None) on a criterion
    • Allow an alt to have multiple values for a criterion
    • Allow an alt to have a fuzzy set of values for a criterion
  • Print explanations of the final ranking or choice, a la ZAPROS
  • Consistency checks
    • Check whether the agent's preferences are consistent using an all-worst-attributes reference instead of an all-best-attributes reference, a la ZAPROS
    • Check transitively deduced preferences multiple times, a la ZAPROS
  • Support group decision-making

Basic idea for an Internet study

Collect basic demographic data (after the main part of the study).

Ask subjects about a weighty decision they have to make soon, which they're not already sure about, and whose outcome they'll be able to assess shortly afterwards (let's say, 1 month from now).

Have them briefly describe the decision to be made and the options in prose.

Conditions:

  • Artiruno condition: Then they construct the options and do VDA.
  • Comparison: Nothing else. If this study works, then maybe there could be a follow-up to investigate the effect by comparing the Artiruno condition to a condition where subjects construct options in the same style, but don't actually use VDA.

Follow up a month later.

  • Remind them of what they wrote about the decision problem (but not their options or Artiruno's suggestion).
  • Have them briefly describe in prose what decision they made, what the outcome was, and how happy they are with the outcome.
  • Have them rate the outcome on numeric scales.
  • Have them rate the decision-making process on numeric scales.

Pilot

In the pilot, just try having people set up the decision problem, so you can check whether people make reasonable choices of criteria, levels, and decision problems.

Piloting on MTurk was a failure, apparently because of the poor English skills of the subjects I got. Let's try some pre-screened subjects from Reddit instead.

I ran four screened subjects from Reddit and am reasonably happy with the results. The changes to the instructions that these findings suggest are:

  • Mention that you should only list alternatives if they're options you really have, not options you would ideally have.
  • Provide an example of a yes-or-no decision.

Plans for the real study

To start with, aim for 40 subjects, 20 per condition, who return; this means you should probably try to recruit 80 subjects and offer more money for session 2 than session 1. You probably won't be able to get this many from /r/samplesize, but you can check other places where studies are posted, or maybe even hire a service that connects social scientists to respondents.

Set expiration times to a week and a day since your invitation.

  • Session 1 (internally, visits 1 and 2)
    • Warn about performance. ("can be particularly slow on phones and tablets")
    • Allow for session 2 in the consent form. Be clear that the subject is expected to be in both sessions, but the second will be quite short.
    • [Offer how much money?]
    • Ask for a decision description and an expected decision resolution date, as in the pilot.
    • Randomly assign the subject to the VDA condition or the control condition. (If the subject number, among subjects who've gotten this far in this version of the study, is even, counting the first as 1, then use the opposite condition of the previous subject.) Write the study state to disk here, so the subject can't refresh and enter different values depending on his assigned condition.
    • If the subject is in the control condition, skip to the end. Otherwise, continue.
    • Solicit criteria and alternatives, as in the pilot.
    • Conduct VDA.
      • Display a reminder of the subject's chosen criteria and alternatives on this page. Explain that the subject doesn't have to make the choice suggested by Artiruno, but it could be a good idea.
      • Show results as text. (The graph would probably not be very helpful in this situation, with find_best = 1 and non-quantitatively minded users.)
      • Allow the subject to restart the procedure, or to return to the previous page and edit criteria and alternatives; trying to record everything is probably futile by the nature of JavaScript.
      • Don't provide an abort button. (It wouldn't be useful because in the case of find_best = 1 and not showing a graph, Artiruno can provide no useful information before it's done.)
      • Record the questions asked, the subject's choices, the subject's response times, and Artiruno's conclusion. If the subject restarted VDA, use the values from the final round.
    • Ask for any comments.
  • Session 2 (internally, visit 3)
    • Send an invitation 1 month after the subject completed session 1. Say that session 2 assumes the subject made the choice and got to see at least a little of the outcome; if that won't be true for a while, the subject should reply saying when would be a good time to do session 2.
    • [Offer how much money?]
    • Re-display the decision description and expected resolution date.
    • Ask if they've made the choice and gotten to see some of the outcome. (In theory, this should always be "yes", because of the instructions earlier. It's a sanity check.)
    • Have the subject briefly describe in prose:
      • which choice they made
      • what the outcome was
      • how happy they are with the choice and outcome
    • Use 1-to-5 rating scales for the below.
    • How pleased are they with the outcome of their choice?
    • How well-chosen was their choice, given what they knew at the time they made it?
    • How difficult did it feel to make the decision?
    • In the VDA condition:
      • Show the criteria, alternatives, and results text from VDA.
      • How consistent did they feel their choice was with Artiruno's suggestion?
      • How difficult was the procedure (including writing up the criteria and alternatives) to do?
      • How helpful did the procedure feel for the decision-making process?
    • Debrief
    • Another comments box

More piloting

On Prolific.

TV 7

A lot of people who otherwise seem to mostly take well to the task are messing up basic things in the VDA problem setup, particularly putting criterion levels in the reverse order:

(.sum (getl (get ratings "pilot") (sorted (&
  (set (ssi subjects (<= $tv 7)))
  (set (. (get ratings "pilot") index))))))
I value
reversed_criterion 10
opaque 5
desc_mismatch 2

I should probably take some time to add more checks (e.g., an extra screen where subjects have to confirm that they put criterion levels in the right order) and rerun from scratch rather than have this much compromise of the VDA sample.

TV 8

I've made a lot of changes, including putting the alts input before the criteria input, checking for unchanged placeholder alt or criterion names, checking for dominance and criterion-level order, and adding a puzzle at the start of the scenario phase so that dropout that would otherwise occur after assignment to the VDA condition happens earlier instead. (I removed the display of hypothetical best and worst items because I figured it would be overkill now.) Still, of 4 new subjects assigned to the VDA condition who completed the task (120, 121, 127, 132), 2 got the levels backwards for at least one criterion; one of them, subject 132, seemed otherwise quite thoughtful. Ridiculous.

TVs 9, 10

Let's try labeling criterion levels with "(best)" and "(worst)" while subjects are writing them. (TV 10 adds these labels to the example problems, too.) I've removed the manual level-order checks. Since level order is officially A Problem, let's try VDA mode only, to see whether things are okay before returning to random assignment. I've also added a question about education to the demographics questionnaire.

The 8 subjects who I've gotten to finish VDA ((ss subjects (.isin $tv [9 10]))) all gave reasonable-looking level orders, so I think things are good now.

Screening and scenario

In TV 11, I restored the random assignment of conditions, and I slightly increased the pay and the time estimate for the scenario phase. In TV 12, I fixed a typo.

I wanted to run subjects until I had at least 40 in each condition who completed the scenario phase, so I ceased recruiting new subjects once that criterion was satisfied after my last batch. I picked this number because I wanted at least 20 subjects to have completed all phases, and I estimated a 50% follow-up rate.

(comments-dt (ssi subjects (.isin $tv [11 12])))
sn visit date comments
158 0 2023-02-23 My partner and I are currently making this decision.
158 3 2023-04-05 Thank you for the study. I hope you retrieve some useless results!
164 0 2023-02-25 Hope i am qualified. Thank you
172 2 2023-02-25 I think it's an excellent decision support system!
172 3 2023-04-05 Impressive work!
177 3 2023-04-05 I just wanted to say that although I did not follow the suggestion offered to me, I did find the process of doing this really helpful. It allowed me to give good consideration to the outcomes and possibilities in front of me at the time.
184 0 2023-02-28 N/A
190 3 2023-04-13 It is a good decision making system.
199 3 2023-04-07 Very helpful. Thankyou.
202 3 2023-04-06 interesting task!
212 0 2023-03-02 thanks
217 0 2023-03-02 Thank you for inviting me to take part in this study, it sounds very interesting.
217 2 2023-03-02 Looks like I'm buying a nice new Porsche!!
217 3 2023-04-07 Thank you for inviting me to take part, and wish me luck with my new silly chariot!
220 0 2023-03-04 submit responses
222 3 2023-04-25 This was an interesting study, thank you for inviting me.
227 2 2023-03-04 An interesting study and tool, thank you!
231 2 2023-03-05 Hi • I really enjoyed the study, specially when it started to altering between my choices. I also never wrote down a pros and cons of my decision which this task made me do it and I actually took a picture of that and shared it with my partner, so thank you very much for that. • Good luck with your research. I liked the web page design too, they were quite smart and yet very simple. • All the best.
233 3 2023-04-10 thank you
243 2 2023-03-05 Thanks
243 3 2023-05-01 Thank you for letting me take part.
245 0 2023-03-05 Thank you
245 2 2023-03-05 Thank you
245 3 2023-05-09 Interesting study. I especially appreciate the explanation on the debriefing page. Thank you
254 0 2023-03-05 Submit Responses
261 0 2023-03-06 Thank you for the opportunity to participate! There were lots of great choices.
273 3 2023-04-12 great study!
(wc
  (ss subjects
    (& (.isin $tv [11 12]) (pd.notnull $n_puzzle_attempts)))
  (cbind
    $began
    :res_len (. (- $expected_resolution_date $began) dt days)
    :puz_t (.round $time_puzzle_minutes 1)
    :puz_n $n_puzzle_attempts
    $cond
    :v2 (.round $time_visit2_minutes 1)
    :cy $country
    :edu $education_years))
sn began res_len puz_t puz_n cond v2 cy edu
149 2023-02-23 98 1.2 1 control 0.2 us 18
150 2023-02-23 57 2.3 1 vda 8.0 us 14
154 2023-02-23 20 11.1 4 vda      
155 2023-02-23 312 2.5 4 control 0.2 gb 14
156 2023-02-23 1 3.0 2 control 0.4 gb 25
158 2023-02-23 97 7.7 8 control 0.3 gb 14
159 2023-02-23 334 6.1 1 vda 18.3 gb 15
160 2023-02-25 -7 2.3 1 vda 33.9 gb 17
161 2023-02-25   1.0 1 control      
162 2023-02-25 34 3.0 1 vda 9.5 gb 12
163 2023-02-25 278 1.8 1 control 0.3 gb 20
165 2023-02-25 69 1.5 1 vda      
166 2023-02-25 -13 1.2 1 control 0.3 gb 12
167 2023-02-25 35 1.9 1 vda 9.1 gb 18
168 2023-02-25 29 3.8 4 vda      
169 2023-02-25 33 2.6 1 control 0.4 gb 13
171 2023-02-25 35 2.5 1 vda 13.6 gb 17
172 2023-02-25 19 4.1 1 vda 34.4 gb 18
173 2023-02-25 705 1.7 1 control 0.6 gb 16
174 2023-02-25 11 3.9 3 control 0.3 gb 14
175 2023-02-25 34 10.8 1 control 0.4 gb 13
177 2023-02-25 16 3.2 1 vda 9.3 gb 21
178 2023-02-25 10 7.1 12 control 0.6 gb 8
179 2023-02-25 14 15.4 12 vda 11.3 gb 10
180 2023-02-28   2.3 3 vda 4.1 us 16
181 2023-02-28 62 2.5 1 control 0.3 us 17
182 2023-02-28 31 2.4 1 control 0.8 us 18
186 2023-02-28 31 3.0 1 control 0.6 gb 15
188 2023-02-28 24 2.0 1 vda 3.3 us 17
189 2023-02-28 31 2.9 3 control 1.3 us 16
190 2023-02-28 92 2.6 2 vda 8.5 gb 17
192 2023-02-28 18 3.7 1 vda 11.4 gb 17
193 2023-02-28 27 4.2 1 control 0.5 gb 18
194 2023-02-28 132 1.6 1 control 0.4 us 18
195 2023-02-28 20 2.1 1 control 0.5 us 13
196 2023-02-28 15 1.2 1 vda 19.4 gb 14
197 2023-02-28 31 4.4 1 control 0.4 gb 16
198 2023-02-28 16 6.6 1 vda 7.1 gb 15
199 2023-02-28 17 2.0 1 vda 16.4 gb 11
200 2023-03-02 29 4.7 4 control 0.2 us 16
201 2023-03-02 13 3.9 1 vda 4.3 gb 16
202 2023-03-02 29 3.3 1 control 0.4 gb 17
204 2023-03-02 29 2.5 1 control 0.3 gb 14
205 2023-03-02 20 3.9 3 control 0.3 us 11
206 2023-03-02 29 1.8 1 control 0.5 gb 18
207 2023-03-02 29 5.2 1 control 0.5 gb 17
208 2023-03-02 5 2.7 1 vda 17.1 us 15
209 2023-03-02 29 8.1 1 control 0.5 us 16
210 2023-03-02 13 9.4 4 vda 21.1 us 14
211 2023-03-02 23 6.0 1 vda 21.1 us 14
215 2023-03-02 105 11.9 10 vda 6.8 pl 15
216 2023-03-02 29 1.8 1 control 0.4 gb 18
217 2023-03-02 3 2.6 1 vda 17.5 gb 17
218 2023-03-02 31 4.6 1 vda      
219 2023-03-02 20 2.8 1 vda 20.2 za 17
220 2023-03-04 90 6.7 1 vda      
221 2023-03-04 3 1.6 1 control 0.3 gb 18
222 2023-03-04 2 4.1 4 control 1.5 gb 16
223 2023-03-04 27 2.7 1 control 0.3 us 14
224 2023-03-04 28 5.0 6 control 0.2 us 16
225 2023-03-04 27 1.3 1 control 0.3 gb 16
226 2023-03-04 37 5.4 1 vda 5.4 gb 9
227 2023-03-04 15 2.6 1 vda 11.3 gb 16
228 2023-03-04 34 1.7 2 vda 13.1 gb 21
230 2023-03-04 22 2.6 1 control 0.4 gb 20
231 2023-03-04 72 5.1 1 vda 25.9 gb 20
232 2023-03-04 7 6.2 3 control 0.4 gb 18
233 2023-03-04 21 4.2 1 control 0.5 gb 12
235 2023-03-04 24 10.8 3 vda      
236 2023-03-04 28 2.2 1 control 0.5 gb 15
237 2023-03-04 6 1.2 1 vda 3.5 gb 14
238 2023-03-04 23 6.8 1 vda      
239 2023-03-04 303 16.0 2 vda 22.2 gb 12
241 2023-03-05 40 5.5 1 control 0.4 gb 13
242 2023-03-05 118 2.0 1 control 0.3 gb 16
243 2023-03-05 25 3.9 2 control 0.3 gb 15
244 2023-03-05 15 5.2 1 control 0.4 gb 12
245 2023-03-05 14 1.4 1 control 0.4 us 16
246 2023-03-05 49 12.0 5 vda 10.1 gb 18
247 2023-03-05 14 1.7 1 vda 11.0 gb 10
248 2023-03-05 88 3.5 1 vda 7.4 gb 17
249 2023-03-05 10 2.7 1 control 0.2 us 13
250 2023-03-05 26 5.4 1 vda 36.4 gb 19
252 2023-03-05 26 1.6 1 vda 5.2 gb 17
254 2023-03-05 14 28.8 15 vda      
255 2023-03-05 26 5.2 1 vda      
257 2023-03-05 19 5.6 4 vda      
258 2023-03-05 26 4.7 5 control 0.3 nl 12
259 2023-03-05 46 3.1 1 control 2.3 gb 12
260 2023-03-06 300 2.3 2 control 0.1 us 13
261 2023-03-06 25 5.5 1 vda 12.1 us 12
262 2023-03-06 25 2.7 4 control 0.8 ie 16
263 2023-03-06 25 1.7 1 vda 9.4 us 13
264 2023-03-06 25 5.1 4 control 0.2 us 16
265 2023-03-06 26 4.3 1 vda 14.0 us 16
266 2023-03-06 14 42.4 1 control 0.6 gb 16
267 2023-03-06 56 24.9 3 vda 7.7 us 23
268 2023-03-06 31 0.8 2 control 0.2 us 16
270 2023-03-06 28 1.7 1 control 0.8 gb 16
271 2023-03-06 4 2.7 2 vda 4.7 gb 18
272 2023-03-06 25 3.2 1 control 0.3 gb 13
273 2023-03-06 14 4.2 1 vda 15.4 us 16
274 2023-03-06 25 3.0 1 control 0.3 gb 21
275 2023-03-06 25 3.6 1 vda      
276 2023-03-06 1 4.5 2 vda 7.3 gb 19
277 2023-03-06 31 6.7 1 control 0.5 gb 11
278 2023-03-06 -34 2.4 1 vda 8.6 gb 13
(.sort-index (.value-counts (wc
  (ss subjects (& (.isin $tv [11 12]) (pd.notnull $time_puzzle_minutes)))
  (cbind $cond :did_v2 (pd.notnull $time_visit2_minutes)))))
cond did_v2 value
control False 1
control True 53
vda False 11
vda True 42

VDA properties

(setv sns (ssi subjects (&
  $round1
  (= $cond "vda"))))
(.sum (pd.concat :axis 1 [
  (getl (get ratings "round1") sns)
  (.drop (getl vda-props sns) :axis 1 ["n_questions" "n_criteria" "result_quality"])]))
I value
reversed_criterion 6
bad_alt_level 3
other_issue 2
dominator 13
dominated 19
unused_level 18
constant_criterion 12
vda_varied_all_cs 6
vda_redundant 8

The ratings categories are:

  • reversed_criterion: One of the criteria looks to have its levels in the wrong order.
  • bad_alt_level: One or more levels of the alts look to be set incorrectly, at least when taking the description into account. Possibly the subject changed the criteria after setting alt levels, causing the alt levels to reset, but didn't notice and correct for it.

Idiosyncratic problems (other_issue):

  • Subject 219: Two alts appear to represent the same choice by the subject, and are distinguished only by possible outcomes.
  • Subject 248: Some of the criteria don't make sense for some of the alts.

Bad VDA subjects

Which subjects used the tool correctly enough to be included in a main analysis? I'm going to exclude subjects who had any one of the three rated problems: reversed_criterion, bad_alt_level, or other_issue. The various automatically detected problems, like unused_level or vda_varied_all_cs, represent either inherent limitations of Artiruno (which ought to be reflected in my results) or less-than-optimal usage that shouldn't compromise results too much. The resulting sample in the VDA condition is:

(valcounts (np.where (ss subjects $round1 "bad_vda") "exclude" "include"))
I value
include 84
exclude 11

The specific subjects excluded are:

(pd.Series (ssi subjects (& $round1 $bad_vda)))
I value
0 162
1 171
2 172
3 198
4 215
5 219
6 226
7 246
8 248
9 265
10 273

I've archived this at https://web.archive.org/web/20230321/https://arfer.net/projects/artiruno/notebook#sec--bad-vda-subjects to show that I made this decision before rerunning the follow-up, and therefore before seeing the outcomes.

Planning for the follow-up

Let's send out the first invitations on April 3rd, which is 4 weeks after I ran the last subject in the analytic sample.

I need to ask each subject if they've made the decision and gotten to see some outcome before reinviting them. Let's do it by sending them a message on Prolific.

Send out only one message first. Then you can send them out in waves of 10 or 20 in subject-number order, ideally in the morning so you don't get questions while you're asleep.

Follow-up results

(rd (. (wcby
  (ss subjects (& $followed_up (bnot $bad_vda))
    ["cond" "eval_rate_easiness" "eval_rate_quality" "eval_rate_satisfaction" "eval_rate_vda_consistency" "eval_rate_vda_easiness" "eval_rate_vda_helpfulness"])
  $cond
  (pd.concat [
    (pd.Series [(. $ shape [0])] :index ["n"])
    (.mean $ :numeric-only T)])) T))
I control vda
n 32.000 23.000
eval_rate_easiness 2.656 2.522
eval_rate_quality 4.344 4.304
eval_rate_satisfaction 4.125 4.217
eval_rate_vda_consistency   4.043
eval_rate_vda_easiness   3.130
eval_rate_vda_helpfulness   3.739
(rd 2 (lfor
  vname ["eval_rate_easiness" "eval_rate_quality" "eval_rate_satisfaction"]
  :setv [lo hi] (scikits.bootstrap.ci
    (tuple (gfor
      cond ["control" "vda"]
      (ss subjects
        (& $followed_up (= $cond cond) (bnot $bad_vda))
        vname)))
    (fn [control vda] (- (np.mean vda) (np.mean control)))
    :multi "independent"
    :seed (int.from-bytes (.encode vname "ASCII") "big")
    :alpha .05 :n-samples 1,000,000)
  [vname lo hi]))
eval_rate_easiness -0.57 0.33
eval_rate_quality -0.45 0.32
eval_rate_satisfaction -0.43 0.58

Reminders for analysis

  • If subjects refresh the page and redo parts of the task, the timing data you get will only reflect the final attempt.

References

Ashikhmin, I., & Furems, E. (2005). UniComBOS—Intelligent decision support system for multi-criteria comparison and choice. Journal of Multi-Criteria Decision Analysis, 13(2-3), 147–157. doi:10.1002/mcda.380

Barbosa, P. A. M., Pinheiro, P. R., Silveira, F. R. V., & Filho, M. S. (2019). Selection and prioritization of software requirements applying verbal decision analysis. Complexity. doi:10.1155/2019/2306213

Edwards, W. (1977). How to use multiattribute utility measurement for social decisionmaking. IEEE Transactions on Systems, Man, and Cybernetics, 7(5), 326–340. doi:10.1109/TSMC.1977.4309720

Hodgett, R. E., Martin, E. B., Montague, G., & Talford, M. (2014). Handling uncertain decisions in whole process design. Production Planning and Control, 25(12), 1028–1038. doi:10.1080/09537287.2013.798706

Larichev, O. I. (2001). Ranking multicriteria alternatives: The method ZAPROS III. European Journal of Operational Research, 131(3), 550–558. doi:10.1016/S0377-2217(00)00096-5

Moshkovich, H. M., & Mechitov, A. I. (2018). Selection of a faculty member in academia: A case for verbal decision analysis. International Journal of Business and Systems Research, 12(3), 343–363. doi:10.1504/IJBSR.2018.10011350

Moshkovich, H., Mechitov, A., & Olson, D. (2016). Verbal decision analysis. In S. Greco, M. Ehrgott, & J. R. Figueira (Eds.), Multiple criteria decision analysis (2nd ed., pp. 605–636). New York, NY: Springer. ISBN 978-0-387-23081-8. doi:10.1007/978-1-4939-3094-4_15

Shevchenko, G., Ustinovichius, L., & Walasek, D. (2019). The evaluation of the contractor's risk in implementing the investment projects in construction by using the verbal analysis methods. Sustainability, 11(9). doi:10.3390/su11092660

Tamanini, I., & Pinheiro, P. R. (2011). Reducing incomparability in multicriteria decision analysis: An extension of the ZAPROS method. Pesquisa Operacional, 31, 251–270. doi:10.1590/S0101-74382011000200004