Research projects

About this page • See also my CV (PDF)


Publication (HTML, PDF): Arfer, K. B. (2024). Artiruno: A free-software tool for multi-criteria decision-making with verbal decision analysis. Journal of Multi-Criteria Decision Analysis, 31. doi:10.1002/mcda.1827
Web interface
VDA code
Task code
Source package

Artiruno is a tool for conducting verbal decision analysis (VDA). In an experiment on human subjects, I found that ordinary people could use Artiruno for making difficult decisions in their own lives, and they found it helpful, but Artiruno seemed to have little influence on their decisions or outcomes.


Publication (HTML, PDF): Carrión, D., Colicino, E., Pedretti, N. F., Arfer, K. B., Rush, J., DeFelice, N., & Just, A. C. (2021). Neighborhood-level disparities and subway utilization during the COVID-19 pandemic in New York City. Nature Communications, 12. doi:10.1038/s41467-021-24088-7
Analysis code

This study examined the relationship between measures of "social disadvantage", such as income and unemployment, and COVID-19 infections and deaths. Most of the work I did was to process New York City Subway turnstile data, which we used to estimate how well people could isolate themselves at the height of the pandemic.


Publication (HTML, PDF): Gutiérrez-Avila, I., Arfer, K. B., Carrión, D., Rush, J., Kloog, I., Naeger, A. R., Grutter, M., Páramo-Figueroa, V. H., Riojas-Rodríguez, H., & Just, A. C. (2022). Prediction of daily mean and one-hour maximum PM2.5 concentrations and applications in Central Mexico using satellite-based machine-learning models. Journal of Exposure Science and Environmental Epidemiology, 32, 917–925. doi:10.1038/s41370-022-00471-4

We built a model to predict the mean PM2.5 concentration for every day on a subset of the 1-km grid we used in Mexico_temperature.


Publication: Carrión, D., Arfer, K. B., Rush, J., Dorman, M., Rowland, S. T., Kioumourtzoglou, M.-A., Kloog, I., & Just, A. C. (2021). A 1-km hourly air-temperature model for 13 northeastern U.S. states using remotely sensed and ground-based measurements. Environmental Research, 200. doi:10.1016/j.envres.2021.111477

We built a model to predict the mean temperature for each hour on a 1-km grid in the northeastern US.


Manuscript (, PsyArXiv): Arfer, K. B. (2021). The unpredictable Buridan's ass: Failure to predict decisions in a trivial decision-making task. doi:10.31234/
Task code
Source package

I examined how to predict people's decisions in an extremely bare-bones binary decision-making task, where there's nothing meaningful to distinguish the two options. In a sample of 200 such trivial choices from 100 subjects run on Mechanical Turk in 2019, I hoped to approach 100% accuracy; instead, I did at best modestly better than chance, no matter how I framed the problem. I concluded that whether it's fundamentally probabilistic or deterministic, there can be substantial noise in people's decisions, which can't readily be explained as order effects.


Publication (HTML, PDF): Gutiérrez-Avila, I., Arfer, K. B., Wong, S., Rush, J., Kloog, I., & Just, A. C. (2021). A spatiotemporal reconstruction of daily ambient temperature using satellite data in the Megalopolis of Central Mexico from 2003-2019. International Journal of Climatology, 41, 4095–4111. doi:10.1002/joc.7060
Data, predictions, and notebook

We built a model to predict the mean, maximum, and minimum temperature for each day on a 1-km grid covering Mexico City.


Publication: Just, A. C., Arfer, K. B., Rush, J., Dorman, M., Shtein, A., Lyapustin, A., & Kloog, I. (2020). Advancing methodologies for applying machine learning and evaluating spatiotemporal models of fine particulate matter (PM2.5) using satellite data over large regions. Atmospheric Environment. doi:10.1016/j.atmosenv.2020.117649

We built a model to predict the mean PM2.5 concentration for every day on a 1-km grid in the northeastern US.


Source package (without raw data)

The mSTUDY comprises a cohort of homosexually active men in Los Angeles who have HIV or are at high risk for catching it, but don't use injection drugs. Brendan Quinn, Steve Shoptaw, and I looked at how drug use can predict risky sex and viral suppression.



The Zithulele Births Follow Up Study (ZiBFUS) is a study of new mothers in the Eastern Cape of South Africa. I have examined rates of depression and breastfeeding.


Publication: Rotheram-Borus, M. J., Arfer, K. B., Christodoulou, J., Comulada, W. S., Stewart, J., Tubert, J. E., & Tomlinson, M. (2019). The association of maternal alcohol use and paraprofessional home visiting with children's health: A randomized controlled trial. Journal of Consulting and Clinical Psychology, 87, 551–562. doi:10.1037/ccp0000408
Publication ( HTML, PLOS HTML, PDF): Arfer, K. B., O'Connor, M. J., Tomlinson, M., & Rotheram-Borus, M. J. (2020). South African mothers' immediate and 5-year retrospective reports of drinking alcohol during pregnancy. PLOS ONE. doi:10.1371/journal.pone.0231518
Source package (without raw data)

A study of new South African mothers testing a mentor-mother intervention. I examined questions relating to maternal drinking, obtaining negative associations with health outcomes in the first paper and at best weak associations in the second.


Presentation: "Comorbidity and medical costs among Medicare beneficiaries with HIV in California: A case study in predictive data analysis"
Publication: Zingmond, D., Arfer, K. B., Gildner, J., & Leibowitz, A. (2017). The cost of comorbidities in treatment for HIV/AIDS in California. PLOS ONE. doi:10.1371/journal.pone.0189392
Source package (without raw data)

We looked at how to predict health-care costs among Californians with HIV. In particular, we used 2010 Medicare and Medicaid claims data to see how the presence of comorbid conditions can predict per-patient costs for the same year. We found that comorbidity information could only slightly improve median absolute error in predicting inpatient costs ($6,555 to $6,165) and drug costs ($12,932 to $12,595), but there was a more substantial improvement for outpatient costs ($8,487 to $6,792).
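As a rough illustration of the error metric and of how large these improvements are in relative terms, here is a minimal sketch. The function names are made up for this example and are not taken from the project's source package; the dollar figures are the ones quoted above.

```python
from statistics import median

def median_absolute_error(actual, predicted):
    """Median of |actual - predicted| across patients."""
    return median(abs(a - p) for a, p in zip(actual, predicted))

def relative_improvement(before, after):
    """Fraction by which the error shrank."""
    return (before - after) / before

# Median absolute errors quoted above, in dollars:
# (without comorbidity information, with comorbidity information).
reported = {
    "inpatient": (6555, 6165),
    "drug": (12932, 12595),
    "outpatient": (8487, 6792),
}

improvements = {kind: relative_improvement(*pair) for kind, pair in reported.items()}

for kind, fraction in improvements.items():
    print(f"{kind}: {fraction:.1%}")
```

By this arithmetic, the error drops by roughly 6% for inpatient costs, 3% for drug costs, and 20% for outpatient costs.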


Poster: Arfer, K. B., Tomlinson, M., Mayekiso, A., Bantjes, J., van Heerden, A., & Rotheram-Borus, M. J. (2017, June 25). Criterion validity of self-reports of alcohol, cannabis, and methamphetamine use among young men in Cape Town, South Africa. Poster session presented at the 40th Annual Meeting of the Research Society on Alcoholism, Denver, CO.
Publication (HTML, PDF): Arfer, K. B., Tomlinson, M., Mayekiso, A., Bantjes, J., van Heerden, A., & Rotheram-Borus, M. J. (2017). Criterion validity of self-reports of alcohol, cannabis, and methamphetamine use among young men in Cape Town, South Africa. International Journal of Mental Health and Addiction. Advance online publication. doi:10.1007/s11469-017-9769-4
Source package (without raw data)

This project is a clinical trial headed by Mary Jane Rotheram, "HIV & Drug Abuse Prevention for South African Men" (ClinicalTrials.gov identifier NCT02358226; Rotheram-Borus et al., 2016), which uses soccer and vocational training as tools for the improvement of health in Cape Town. My involvement so far has been to compare self-reports of drug use with breath and urine tests, using data from October 2015 to January 2017. Self-report appears to be reasonably accurate in this population.


Publication (HTML, PDF): Arfer, K. B., & Jones, J. J. (2018). American political-party affiliation as a predictor of usage of an adultery website. Archives of Sexual Behavior, 48, 715–723. doi:10.1007/s10508-018-1244-1
Media coverage
Source package

This project, a collaboration with Jason J. Jones, uses data from the 2015 leak of the adultery-focused dating website Ashley Madison. We matched up Ashley Madison users with voter-registration rolls from five US states. We found that people registered to vote as Libertarians (+0.46 logit units) or Republicans (+0.19) were more likely to have a matching Ashley Madison user account than unaffiliated voters, and people registered to vote as Democrats were less likely (−0.18 logit units).
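To give a sense of scale for these logit units: a difference of d logit units multiplies the odds by exp(d). The sketch below does that conversion for the three differences quoted above. The baseline probability is a made-up value chosen only for illustration, not a figure from the paper, and the function names are likewise hypothetical.

```python
import math

# Logit differences relative to unaffiliated voters, as quoted above.
logit_differences = {"Libertarian": 0.46, "Republican": 0.19, "Democrat": -0.18}

def odds_ratio(logit_difference):
    """A shift of d logit units multiplies the odds by exp(d)."""
    return math.exp(logit_difference)

def shifted_probability(baseline_probability, logit_difference):
    """Probability after moving the baseline by the given number of logit units."""
    baseline_logit = math.log(baseline_probability / (1 - baseline_probability))
    return 1 / (1 + math.exp(-(baseline_logit + logit_difference)))

# ASSUMPTION: an illustrative baseline only; the real rate is not stated here.
HYPOTHETICAL_BASELINE = 0.01

for party, d in logit_differences.items():
    print(party,
          round(odds_ratio(d), 2),
          round(shifted_probability(HYPOTHETICAL_BASELINE, d), 4))
```

When the baseline probability is small, the odds ratio is close to the ratio of probabilities, so +0.46 logit units is roughly a 1.6-fold increase.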


Publication (HTML, PDF): Arfer, K. B. (2023). Pattern-setting for improving risky decision-making. Journal of the Experimental Analysis of Behavior, 119(1), 81–90. doi:10.1002/jeab.816
Presentation: "This particular cigarette won't kill me: Pattern-setting to increase self-control"
Task code
Source package

Howard Rachlin and I conducted this experiment in 2015, but I only wrote up a paper about it in 2022, after Rachlin's death in 2021. The study tested Rachlin's (2014) idea of "pattern-setting" as a strategy for improving self-control. Undergraduate subjects completed a timed risky-choice task similar to that of Luhmann, Ishida, and Hajcak (2011) under various conditions in which they were obliged to make the same choices they had made earlier. We hoped that such enforcement of choice patterns would lead subjects to make decisions with better long-term consequences, the idea being that they would learn that each decision affects not only the current trial but also one or more other trials in the future.


Thesis (HTML, PDF): "Predicting outcomes of interventions to increase social competence in children and adolescents"
Source package (without raw data)

In this project, which was my PhD thesis and a spiritual successor to Rickrack, I did some new analyses of data collected by Matt Lerner and colleagues examining various treatments for social deficits in autistic children. My goal was to examine how individual treatment outcomes could be predicted on the basis of pretreatment variables. The answer was, basically, they couldn't. Despite lots of variability in the treatment outcomes and the wide variety of pretreatment variables available to the models, adding pretreatment variables to base models didn't meaningfully improve predictive accuracy. These findings, along with those of Rickrack, make me think that radical changes to psychological measurement may be necessary to predict behavior.


Publication: Rachlin, H., Arfer, K. B., Safin, V., & Yen, M. (2015). The amount effect and marginal value. Journal of the Experimental Analysis of Behavior, 104(1), 1–6. doi:10.1002/jeab.158
Task code
Source package

A collaboration with Howard Rachlin, Vasiliy Safin, and Ming Yen. The data is from September 2014. We sought to show amount effects, the tendency for people to be less deterred by delay or probabilistic uncertainty for larger rewards, in a task with absolute judgments rather than choices. These absolute judgments took the form of subjects drawing a line to indicate how happy they would be to receive a given reward. We found that subjects' stated happiness diminished less with delay for larger rewards, but we found no such amount effect for probabilistic uncertainty.


Publication: Rachlin, H., Safin, V., Arfer, K. B., & Yen, M. (2015). The attraction of gambling. Journal of the Experimental Analysis of Behavior, 103(1), 260–266. doi:10.1002/jeab.113

An invited theory paper by Howard Rachlin that recapitulates and expands upon Rachlin (1990). Vasiliy Safin, Ming Yen, and I helped with revision. The central idea of the paper is that gamblers treat repeated games as strings of losses terminated by wins, and treat the length of each string (that is, the number of losses before each win) like a delay. This thinking implies a natural connection between time preferences (that is, delay discounting) and gambling behavior.
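The "strings of losses" idea can be sketched in a few lines. This is only an illustration under stated assumptions: the hyperbolic form amount / (1 + k × delay) is one common way to model delay discounting and is assumed here rather than taken from the paper, and the function names and the parameter k are hypothetical.

```python
def loss_string_lengths(outcomes):
    """Split a sequence of plays (True = win, False = loss) into strings of
    losses terminated by a win, returning the number of losses in each string.
    Trailing losses with no terminating win are left out."""
    lengths = []
    run = 0
    for won in outcomes:
        if won:
            lengths.append(run)
            run = 0
        else:
            run += 1
    return lengths

def hyperbolic_value(amount, delay, k):
    """ASSUMED discounting form: value falls hyperbolically with delay.
    Here the 'delay' is the length of a string of losses."""
    return amount / (1 + k * delay)

plays = [False, False, True, True, False, True]
strings = loss_string_lengths(plays)  # losses before each win
values = [hyperbolic_value(amount=1.0, delay=n, k=0.5) for n in strings]
```

Under this reading, a win that comes after a short string of losses is valued more than the same win after a long string, just as a sooner reward is valued more than a later one.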


Task code
Source package

A collaboration with Vasiliy Safin and Howard Rachlin. Inspired by Attraction, we had subjects play a simple gambling game with a real spinner, and also complete delay- and risk-discounting tasks on a computer. In the pilot data from the summer of 2014, the observed correlations are close to 0.


Publication: Safin, V., Arfer, K. B., & Rachlin, H. (2015). Reciprocation and altruism in social cooperation. Behavioural Processes, 116, 12–16. doi:10.1016/j.beproc.2015.04.009
Task code
Source package

A collaboration with Vasiliy Safin and Howard Rachlin. We examined how rewards to the opponent affect choices in a non-strategic version of the prisoner's dilemma, in which the opponent's choices are fixed and known to the subject. As we'd expected, subjects were more likely to cooperate when they were told the opponent would be rewarded. Part of the data is from November 2013 and part is from February and March of 2014.


Publication ( HTML, Frontiers HTML, PDF): Arfer, K. B., & Luhmann, C. C. (2017). Time-preference tests fail to predict behavior related to self-control. Frontiers in Psychology, 8(150). doi:10.3389/fpsyg.2017.00150
Task code
Source package

The initial goal of this project was to establish the quality of a time-preference test so it could be used in other, more substantive projects like Brass. We examined the retest reliability and convergent reliability of three tests, but paid special attention to predictive validity for real-world self-control behavior such as smoking. The data was collected in early 2014 on Mechanical Turk. Despite reasonable retest reliability and convergent reliability, predictive validity was poor. We did a Study 2 using data from the National Longitudinal Survey of Youth 1979 and again found low predictive accuracy for criterion variables. It seems that time preferences are not promising for further work on the prediction of self-control behavior. One of my chief concerns now is how to do better prediction of real-world decisions.


Publication: Rodriguez-Seijas, C., Arfer, K. B., Thompson, R. G., Hasin, D. S., & Eaton, N. R. (2017). Sex-related substance use and the externalizing spectrum. Drug and Alcohol Dependence, 174, 39–46. doi:10.1016/j.drugalcdep.2017.01.008
Manuscript: "The organization of sexual preferences"
Manuscript: "Evolutionary gender differences in sexual preferences"
Task code
Source package

There are a lot of things that people can be sexually attracted to: to particular sexual acts, to the ages of potential partners, to partner personalities, or even to moods or themes, such as vulnerability or purity. But most research on sexual preferences exclusively considers preferences for partner gender. In this project, we wish to discover how sexual preferences of all kinds are related, with minimal a priori theoretical commitments to particular ways of thinking about sexual preferences. The basic strategy is to give subjects a diverse self-report inventory, then examine how the items predict each other. We have about 1,000 subjects' worth of data from Mechanical Turk, collected in April, May, and July of 2014.

Empirical Sexual Attitudes

Text of the book: HTML, print

A book on the psychology of sexuality that is (mostly) a literature review rather than a report of new data or new analyses. See its entry on my writing page.


Source package

An experiment I conducted in the summer of 2013 with Yoni Donner through his website, Quantified Mind, but aborted because of the findings of Rickrack. Subjects did a few tests either immediately before or immediately after lunch for several days. We intended thus to manipulate hunger. The key question was how hunger affects intertemporal choice, particularly patience.


Task and server code
Source package

A collaboration with Elizabeth Trimber. We collected some data in the summer of 2013, but the project is now aborted because of the findings of Rickrack. We intended to look at how well various pretests, including behavioral-economic intertemporal-choice tasks as well as more personality-psychological measures like a conscientiousness test, can predict people's real-life self-control failures. The notion of self-control failure we were interested in was dynamic inconsistency: that is, reneging. We solicited subjects' commitments about things they intended to spend certain amounts of time on over the next two weeks, then had them check in daily to see if they followed through. We also gave special attention to wakeup times, since snooze buttons are a familiar source of self-control problems.


Source package

An ongoing collaboration with Christian Luhmann. Luhmann's idea is to examine how well different models of intertemporal choice can mimic each other, especially as these models are augmented with nonlinear (and more psychologically plausible) representations of time and value.


Task code
Source package

A project inspired by my idea of vicarious restraint. Its most recent form concerns how exercising censorship changes one's perceptions of vulnerability to media effects. The existing data, collected in January and February of 2013, was not supportive of the idea of vicarious restraint. I'm unsure how and whether to continue.


Publication ( HTML, Frontiers HTML, PDF): Arfer, K. B., Bixter, M. T., & Luhmann, C. C. (2015). Reputational concerns, not altruism, motivate restraint when gambling with other people's money. Frontiers in Psychology, 6(848). doi:10.3389/fpsyg.2015.00848
Task code and handouts
Source package

A collaboration with Mike Bixter following up on two earlier studies he conducted on social distance and moral hazard (Bixter & Luhmann, 2014). The data for Study 2 was collected in February, March, and April of 2013, whereas the data for Study 1 was collected later, in October 2014. "Moral hazard" (more specifically, "indirect moral hazard" or "morale hazard") refers to people's willingness to take bigger risks when a third party will pay part of the price if the gamble goes badly; for example, people with flood insurance have less incentive to protect their house against flooding. In this project, as in Bixter and Luhmann (2014), subjects were offered a series of gambles. Each gamble had a chance of causing the subject to lose money, but for some gambles ("shared gambles"), half of any loss would be inflicted on a specified third party instead of the subject. Thus, by taking shared gambles, subjects could endanger other people to maximize their winnings.

In Study 1, we found that warning subjects that their partner would see what shared gambles they'd taken decreased subjects' willingness to take such gambles. That is, people take fewer risks under moral hazard when their actions are public to the person whose money they're gambling with.

In Study 2, we found that subjects were less willing to endanger another student they'd had a brief in-person conversation with (the "local partner") than to endanger an unseen and unnamed third party (the "remote partner"). However, when subjects could endanger the local partner while disguising themselves as the remote partner, they became at least as risk-seeking as when they could endanger the remote partner. Thus, their relative reluctance to endanger the local partner could not have been caused by genuine regard for the local partner's welfare. The real motive, presumably, was reputational concerns.


Publication (HTML, PDF): Arfer, K. B., & Luhmann, C. C. (2015). The predictive accuracy of intertemporal-choice models. British Journal of Mathematical and Statistical Psychology, 68(2), 326–341. doi:10.1111/bmsp.12049
Code: Task, Model-fitting server
Source package

This project, my master's thesis, was a comparison of the predictive accuracy of ten models of intertemporal choice (some psychologically plausible and some not), with Christian Luhmann. I collected the data in November and December of 2012 and January of 2013, mostly on Mechanical Turk. The differences in predictive accuracy between models were surprisingly small. Apparently, 85% is about the best accuracy achievable (i.e., the Bayes error rate is 15%) in this sort of intertemporal-choice task. The paper ends up recommending one of the simplest models I tested, which is logistic regression on the difference of reward amounts and the difference of delays.
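For readers unfamiliar with this kind of model, here is a minimal sketch of logistic regression on the two differences. It assumes each trial offers a smaller-sooner (SS) and a larger-later (LL) reward and that the model includes an intercept; the coefficient values and all names are made up for illustration and are not the fitted values or the code from the paper.

```python
import math

def p_choose_larger_later(ss_amount, ss_delay, ll_amount, ll_delay,
                          intercept, b_amount, b_delay):
    """Probability of choosing the larger-later option, as a logistic function
    of the difference in reward amounts and the difference in delays."""
    amount_difference = ll_amount - ss_amount
    delay_difference = ll_delay - ss_delay
    linear = intercept + b_amount * amount_difference + b_delay * delay_difference
    return 1 / (1 + math.exp(-linear))

# HYPOTHETICAL coefficients: more money favors waiting, more delay disfavors it.
coefficients = dict(intercept=0.0, b_amount=0.10, b_delay=-0.05)

# $20 today versus $30 in 30 days.
p = p_choose_larger_later(ss_amount=20, ss_delay=0, ll_amount=30, ll_delay=30,
                          **coefficients)

# The predicted choice is whichever option is more probable.
predicted_choice = "larger-later" if p > 0.5 else "smaller-sooner"
```

Predictive accuracy is then the proportion of held-out trials on which the predicted choice matches the subject's actual choice.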


Publication: Pegg, S., Arfer, K. B., & Kujawa, A. (2021). Altered reward responsiveness and depressive symptoms: An examination of social and monetary reward domains and interactions with rejection sensitivity. Journal of Affective Disorders, 282, 717–725. doi:10.1016/j.jad.2020.12.093
Publication: Kujawa, A., Arfer, K. B., Finsaas, M. C., Kessel, E. M., Mumper, E., & Klein, D. N. (2020). Effects of maternal depression and mother-child relationship quality in early childhood on neural reactivity to rejection and peer stress in adolescence: A 9-year longitudinal study. Clinical Psychological Science, 8, 657–672. doi:10.1177/2167702620902463
Publication: Rappaport, B. I., Hennefield, L., Kujawa, A., Arfer, K. B., Kelly, D., Kappenman, E. S., Luby, J. L., & Barch, D. M. (2019). Peer victimization and dysfunctional reward processing: ERP and behavioral responses to social and monetary rewards. Frontiers in Behavioral Neuroscience, 13. doi:10.3389/fnbeh.2019.00120
Publication: Babinski, D. E., Kujawa, A., Kessel, E. M., Arfer, K. B., & Klein, D. N. (2019). Sensitivity to peer feedback in young adolescents with symptoms of ADHD: Examination of neurophysiological and self-report measures. Journal of Abnormal Child Psychology, 47, 605–617. doi:10.1007/s10802-018-0470-2
Publication: Ethridge, P., Kujawa, A., Dirks, M. A., Arfer, K. B., Kessel, E. M., Klein, D. N., & Weinberg, A. (2017). Neural responses to social and monetary reward in early adolescence and emerging adulthood. Psychophysiology, 54, 1786–1799. doi:10.1111/psyp.12957
Publication: Kujawa, A., Kessel, E. M., Carroll, A., Arfer, K. B., & Klein, D. N. (2017). Social processing in early adolescence: Associations between neurophysiological, self-report, and behavioral measures. Biological Psychology, 128, 55–62. doi:10.1016/j.biopsycho.2017.07.001
Publication: Kujawa, A., Arfer, K. B., Klein, D. N., & Proudfit, G. H. (2014). Electrocortical reactivity to social feedback in youth: A pilot study of the Island Getaway task. Developmental Cognitive Neuroscience, 10, 140–147. doi:10.1016/j.dcn.2014.08.008
Task code

A collaboration with Autumn Kujawa and a number of other investigators. The task is an elaborate reality-show–like game for studying the electroencephalography of social rejection. In our pilot study of the task, we examined not just EEG responses to rejection, but also how these were related to subjects' voting decisions in the game, and to self-report measures of depression and social anxiety. Greater neural sensitivity to feedback was associated with fewer votes to reject other players and with more symptoms of anxiety and depression. The task has gone on to be used in several labs throughout the US.


OpenDocuments ("Cake that good is risky to refuse: Motivated reasoning in motivationally charged decisions"): Manuscript, Supplement

A private project I conducted on Mechanical Turk in the summer of 2012. The goal was to refine a manipulation of showing people a picture of a dessert, which I'd tried in a previous project. This manipulation was supposed to influence choice in a risky decision related to the dessert portrayed. I indeed was able to find a picture that was sufficiently appealing and a task in which not only decisions but also probability perceptions could be influenced by the picture.


Task code
Source package

A simple project to collect ratings of photographs for use as stimuli in later research. The first study, run in May 2012, used pictures of desserts, and the second, run in June 2012, used semi-nude pictures of women. I ran both studies on Mechanical Turk.


Paper: "Intuitive decision-making as social prediction: The similar-strategy hypothesis"
Source package (version released 11 August 2012)

This project, my senior thesis at Allegheny College, was the first study I proposed, as a junior in the spring of 2010, but the second I actually conducted (after Halfhard). My mentor was Robert Hancock.

I was interested in the difference between intuitive and deliberative decision-making, operationalized as simply whether or not one wrote down reasons for deciding one way or the other before making one's decision, in the style of Wilson and Schooler (1991). I had the idea that the apparent superiority of intuitive strategies in Wilson and Schooler and in part of McMackin and Slovic (2000) resulted from how, in one sense or another, decision quality was being measured by reference to someone else's decisions. This led to the "similar-strategy hypothesis": that when you're trying to guess somebody else's judgment of something, you'll do better the more your strategy of judgment resembles theirs. So I had two groups of subjects do McMackin and Slovic's numerical-estimation task (which involves answering questions like "How long is the Amazon River?"), one deliberative and one intuitive, and I had another two groups of subjects divided the same way explicitly try to guess other subjects' responses to the questions rather than the correct answers themselves.

I got no consistent differences between conditions. I didn't even replicate the findings of McMackin and Slovic (2000) within the slice of the design that was a mostly literal replication. And this slice was too small, with 20 intuitive and 18 deliberative subjects, compared to 74 intuitive and 69 deliberative in McMackin and Slovic, for this failure to replicate to be all that meaningful.

I still have a certain fondness for the similar-strategy hypothesis, so I may give this line of work another try someday.


PDFs ("The effect of perceived difficulty on perceptual learning"): Paper, Presentation, Poster
Source package (version released 11 August 2012)

I conducted this study as part of the Research Experience for Undergraduates program at the University of Minnesota in the summer of 2010. My mentor was Paul Schrater.

The study was a social-psychological spin-off of a motion-learning experiment with the goal of seeing how explicitly provided information about task difficulty could affect performance and learning on such a seemingly low-level task. Subjects had to learn the movement patterns of dot clouds and catch them by moving the mouse. The difficulty-information manipulation was just that some subjects were told, midway through the experiment, "Note that this [next] task is more difficult than the first."

I reported that this difficulty warning increased learning, as I had expected, but in retrospect, I don't think I did a very good job of analyzing the data. Not that there was much I could do with the sample size of 9 and the 2 × 2 between-subjects design. Alas, I realized all too late that summer that subjects were coming in at such a trickle that if I didn't swallow my pride and go knocking on dormitory doors like a vacuum-cleaner salesman, I'd never get a second subject. At least I got nine.


Bixter, M. T., & Luhmann, C. C. (2014). Shared losses reduce sensitivity to risk: A laboratory study of moral hazard. Journal of Economic Psychology, 42, 63–73. doi:10.1016/j.joep.2013.12.004

Luhmann, C. C., Ishida, K., & Hajcak, G. (2011). Intolerance of uncertainty and decisions about delayed, probabilistic rewards. Behavior Therapy, 42, 378–386. doi:10.1016/j.beth.2010.09.002

McMackin, J., & Slovic, P. (2000). When does explicit justification impair decision making? Applied Cognitive Psychology, 14(6), 527–541. doi:10.1002/1099-0720(200011/12)14:6<527::AID-ACP671>3.0.CO;2-J

Rachlin, H. (1990). Why do people gamble and keep gambling despite heavy losses? Psychological Science, 1(5), 294–297. doi:10.1111/j.1467-9280.1990.tb00220.x

Rachlin, H. (2014, July 24). If you're so smart, why aren't you happy? Retrieved from

Rotheram-Borus, M. J., Tomlinson, M., Durkin, A., Baird, K., DeCelles, J., & Swendeman, D. (2016). Feasibility of using soccer and job training to prevent drug abuse and HIV. AIDS and Behavior, 20(9), 1841–1850. doi:10.1007/s10461-015-1262-0

Wilson, T. D., & Schooler, J. W. (1991). Thinking too much: Introspection can reduce the quality of preferences and decisions. Journal of Personality and Social Psychology, 60(2), 181–192. doi:10.1037/0022-3514.60.2.181