Rickrack notebook

Created 2 Nov 2013 • Last modified 7 Jul 2015

MTurk study

Goal

To establish the test–retest (and, when possible, parallel-forms) reliability and the criterion validity of econometric tests, particularly tests of patience, stationarity, risk aversion, and loss aversion. Well, maybe including all four would be biting off more than I'd care to chew, but I am at least interested in the first two. Brass and Hazel need reliable tests, particularly so that negative results can be interpretable.

A reasonable way to assess reliability under my predictivist philosophy is just to try to predict scores with themselves.

Basic plan

  • Use Tversky so both MTurk and the lab are usable
  • Administer tests twice at time A and only once at time B; use the difference between the first two for parallel-forms reliability and the difference between A and B for test-retest reliability (possibly intermingled with parallel-forms, of course)
  • Also measure criterion variables at time A (criterion variables are important if only so we have a way to compare tests that use model-heavy and model-light techniques, and therefore broader and narrower operationalizations of the construct being measured)

Finally, Christian points out that it would be nice to demonstrate that patience in immediate-versus-delayed dilemmas is less stable over time than patience in delayed-versus-delayed. The theory of construal level and psychological distance predicts that one should adopt a more concrete mindset in immediate-versus-delayed situations. Perhaps this more concrete mindset would lead to preferences that are more influenced by short-term concerns and therefore less stable over time. Then again, you wouldn't need such fancy concepts as construal level and psychological distance to explain this: people should indeed be more influenced by present circumstances when making a decision involving the present.

Previous work

(What tests already exist? What reliabilities and validities have been demonstrated for exactly which tests?)

Going by Schuerger, Tait, and Tavernelli (1982), personality tests have reliability coefficients of roughly .8 for 1–2 weeks and .6 for 1 year.

  • Intertemporal choice
    • Reliability (how exhaustive can this be made?)
      • Kirby, Petry, and Bickel (1999): Used a relatively coarse 7-trial forced-choice patience test for each of three reward magnitudes, yielding three ln(k)s. Immediate SSes only. Heroin addicts had higher ln(k)s (i.e., were less patient) than controls (but the analysis seems to have ignored the triplet grouping of the ln(k)s). Correlation of the ln(k)s with intelligence (as measured by the WAIS) was small.
      • Kirby (2009): Used the same patience test as Kirby et al. (1999). The reliability correlation of mean ln(k) (mean across the three reward magnitudes) was .77 (95% CI [.67, .85]) for 5 weeks and .71 ([.5, .84]) for 1 year.
      • Beck and Triplett (2009): Subjects selected from a list the immediate equivalent to $1,000 at six different delays. Then hyperbolic ks were (somehow) fit. For the 85% of subjects who discounted in an orderly fashion, ln(k) correlated .64 over a 6-week interval.
      • Smith and Hantula (2008): A forced-choice patience test and a free-response patience test yielded ks correlated at .75, but AUCs correlated at .33.
    • Validity (we will of course cite only examples)
      • Madden, Petry, Badger, and Bickel (1997): A stepwise procedure finding x such that $x immediately is equivalent to $1,000 at each of six delays. Then a hyperbolic function was fit by least squares and the k extracted. Heroin addicts had higher ks than controls. ks correlated .4 with scores on the Impulsivity subscale of the Eysenck Personality Questionnaire.
      • Sutter, Kocher, Rützler, and Trautmann (2010): A paper-and-pencil faux-stepwise procedure with several different delay and amount conditions, including non-immediate SSDs. Variables concerning drug use, BMI, saving, and grades are included, but the numerical results are a bit hard to digest.
      • Chabris, Laibson, Morris, Schuldt, and Taubinsky (2008): Used Kirby et al.'s (1999) patience-test items, but estimated discounting rate using the MLE of a parameter in a logistic model. Correlations with various individual "field behaviors" are low, and with aggregate indices of field behaviors range from .09 to .38.
      • Wulfert, Block, Santa Ana, Rodriguez, and Colsman (2002): A single real choice between $7 today and $10 in a week. Choice was correlated in the expected directions with being a problem student, GPA, and drug use from .52 to .68.
  • Loss aversion
  • Risk aversion
    • Weber and Chapman (2005) found that people will take riskier gambles with smaller amounts of money (the "peanuts effect")

One would like tests that are repeatable (i.e., tests that can be re-administered very soon after; i.e., tests for which remembering one's past answers is not a problem), so they can be used as state measures. Possibly many existing tests meet this criterion despite using a fixed item set, since giving people the same item twice in past intertemporal-choice studies has frequently revealed them to not be perfectly consistent.

One can use already existing patience tests, particularly that of Kirby et al. (1999). The reliability correlations, although less than one might want, don't have much room for improvement, after all. However, this test doesn't measure stationarity, it doesn't consider nonzero SSDs, it's coarse, and it's questionably repeatable. These qualities may or may not be bad. They might end up making the test better at predicting theoretically patience-relevant criterion variables. But that's an empirical question, and you certainly wouldn't want these qualities a priori. So let's at least consider some other tests.

Theory and nonzero SSDs

Can the idea of temporal discounting accommodate nonstationarity in the wrong direction (i.e., becoming less patient as front-end delays increase)? I think it must. A test of stationarity is then a test of "relative future patience", or inversely "immediacy bias", which can rate subjects as nonstationary in the classical direction, nonstationary in the wrong direction, or perfectly stationary.

The distinction between immediate–delayed versus delayed–delayed choices can be thought of like this:

  • Immediate vs. delayed: choices under the gun
  • Delayed vs. delayed: plans for the future

It looks like I'm going to measure nonstationarity using two discount factors, one immediate-versus-delayed and one delayed-versus-delayed. How should it be computed? Let's say as the difference. Then if I want to include both patience and nonstationarity as predictors in a regression model, I can just include one term for each discount factor: the model μ = b0 + b1ID + b2DD is just a reparametrization of μ = b0 + b1ID + b2(DD − ID). Since I don't believe in interpreting model parameters, the difference between these models is immaterial.
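
As a quick sanity check of that claim, here's a minimal sketch with made-up data (plain lm; the data frame and variable names are hypothetical): the two parameterizations give identical fitted values, and only the coefficients differ.

# Fitting on (ID, DD) versus (ID, DD - ID) spans the same column space,
# so the fitted values agree; only the coefficients change.
set.seed(1)
d = data.frame(ID = runif(50), DD = runif(50))
d$y = 1 + 2*d$ID + 3*d$DD + rnorm(50, sd = .1)
m1 = lm(y ~ ID + DD, data = d)
m2 = lm(y ~ ID + I(DD - ID), data = d)
all.equal(fitted(m1), fitted(m2))  # TRUE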

Can we use established techniques of CAT?

(Short answer: apparently not if we want our tests to be easy to implement and require low resources.)

Okay, suppose we want to use the minimum expected posterior variance criterion (equation 1.26 of van der Linden & Glas, 2010), and we're using a one-parameter binary-choice model. Then for each round we need to find a new item Y minimizing

p(Y = 1 | x) * Var(θ | x, Y = 1) +  p(Y = 0 | x) * Var(θ | x, Y = 0)

where x is the data already obtained. Let us suppose our model is

Y ~ Bern(ilogit(ρ * (v30 * lr - sr)))

and we're using some fixed value of ρ instead of treating it as a free parameter. Rename v30 to θ. If we use a uniform prior for θ, the posterior is proportional to the likelihood.

In the case of multiple trials, the likelihood is the product of the per-trial likelihoods; numerically, take the log of each trial's likelihood, sum them, and exponentiate.

I don't think there are closed forms for p(Y = 1 | x) and Var(θ | x, Y = 1), because there's no conjugate prior for θ. And for a uniform prior, it doesn't seem like a closed form for the mean exists once we have data from more than one trial.

Moreover, grid sampling alone doesn't seem like it can help, because we would need to do a full grid sample or three for every candidate Y.
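
For concreteness, here's a minimal grid-approximation sketch of the criterion for a single candidate item under the model above (fixed ρ, uniform prior on θ = v30). The function and argument names are my own, not production code, and it illustrates the problem: the whole grid has to be reweighted for every candidate item.

ilogit = function(z) 1 / (1 + exp(-z))

# Expected posterior variance of theta = v30 after one more item (lr, sr),
# approximated on a grid, given the log-likelihood of the data so far.
expected.post.var = function(theta.grid, loglik.past, rho, lr, sr)
   {p1 = ilogit(rho * (theta.grid * lr - sr))    # P(Y = 1 | theta) for the candidate
    post = function(loglik)                       # normalized posterior on the grid
       {w = exp(loglik - max(loglik))
        w / sum(w)}
    post.var = function(w)
        sum(w * theta.grid^2) - sum(w * theta.grid)^2
    w.past = post(loglik.past)
    p.y1 = sum(w.past * p1)                       # predictive P(Y = 1 | x)
    v1 = post.var(post(loglik.past + log(p1)))    # Var(theta | x, Y = 1)
    v0 = post.var(post(loglik.past + log(1 - p1)))
    p.y1 * v1 + (1 - p.y1) * v0}

# Example: uniform prior (zero log-likelihood so far), one candidate item.
theta.grid = seq(.01, .99, len = 200)
expected.post.var(theta.grid, loglik.past = rep(0, 200), rho = 10, lr = 10, sr = 7)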

Tests

(Christian suggests it would be interesting to compare actual discount factors between the tests.)

  • Intertemporal choice
    • 20 rounds of probabilistic bisection search (Horstein, 1963; Waeber, 2013, p. 14) with pc = .75 (see the sketch after this list). What is being searched for is the one-month discount factor, so each trial is a forced choice between an immediate reward and a 30-day-delayed reward. The ratio between these rewards is what is manipulated adaptively; the absolute amounts can be varied more or less arbitrarily.
    • Bisection search again, but with a one-month front-end delay added to both delays.
    • A non-adaptive free-response task, where the subject types in an indifference amount (for immediate versus month-delayed). Mike Bixter says subjects find filling in the LLR easier, but beware that (going by the results of Builder) Turkers tend to be patient. The fixed amount can be determined randomly, and the discount factor can be estimated as the mean of the ratios.
    • Free-response again, but with a one-month front-end delay.
    • The test of Kirby (2009) (plus front-end delays?)
  • Criterion (partly from Chabris et al., 2008) - see the source code for details
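
Here's a minimal sketch of the probabilistic bisection idea on a discount-factor grid, with pc = .75 as above. The function and the simulated subject are my own illustrative assumptions, not the production code:

prob.bisect = function(n.trials, pc, respond)
   {grid = seq(.01, .99, len = 500)
    dens = rep(1/length(grid), length(grid))        # uniform prior over the discount factor
    for (i in 1 : n.trials)
       {q = grid[which.min(abs(cumsum(dens) - .5))] # query at the posterior median
        # Trial: SS of q * lr now versus lr in 30 days, so choosing LL suggests disc > q.
        chose.ll = respond(q)
        heavier = if (chose.ll) grid > q else grid <= q
        dens = dens * ifelse(heavier, pc, 1 - pc)
        dens = dens / sum(dens)}
    grid[which.min(abs(cumsum(dens) - .5))]}        # posterior median as the estimate

# Simulated subject with true discount factor .75 who answers correctly 90% of the time.
set.seed(1)
prob.bisect(20, pc = .75,
    respond = function(q) xor(runif(1) > .9, .75 > q))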

TV 1, 2

I've done a bit of a trial run with just bisection search.

rd(transform(
    dcast(
        ss(itb.discs, sb[char(s), "tv"] %in% 1:2),
        s + type ~ round,
        value.var = "disc"),
    change = round2 - round1))
  s type round1 round2 change
1 s0001 near 0.651 0.597 -0.053
2 s0001 far 0.599 0.518 -0.081
3 s0003 near 0.292 0.622 0.330
4 s0003 far 0.712 0.789 0.077
5 s0004 near 0.994 0.994 0.000
6 s0004 far 0.994 0.994 0.000
7 s0005 near 0.867 0.916 0.049
8 s0005 far 0.868 0.783 -0.085
9 s0006 near 0.752 0.766 0.015
10 s0006 far 0.745 0.618 -0.127
11 s0007 near 0.292 0.440 0.148
12 s0007 far 0.556 0.442 -0.113
13 s0008 near 0.844 0.974 0.131
14 s0008 far 0.982 0.980 -0.002
15 s0010 near 0.959 0.994 0.034
16 s0010 far 0.981 0.994 0.013
17 s0012 near 0.994 0.994 0.000
18 s0012 far 0.994 0.994 0.000
19 s0013 near 0.500 0.534 0.034
20 s0013 far 0.594 0.525 -0.068
21 s0014 near 0.810 0.908 0.098
22 s0014 far 0.879 0.890 0.011
23 s0016 near 0.625 0.615 -0.010
24 s0016 far 0.519 0.509 -0.010
25 s0017 near 0.796 0.880 0.084
26 s0017 far 0.835 0.799 -0.036
27 s0018 near 0.672 0.597 -0.075
28 s0018 far 0.738 0.578 -0.160

near is immediate versus 1 month, and far is 1 month versus 2 months. round1 and round2 are the discount factors resulting from the first and second rounds (respectively) of the given test.

The discount factors themselves generally look okay, except for those annoying subjects who chose LL every time. Let's graph the changes from time 1 to time 2.

discs.tv2 = ss(itb.discs,
    sb[char(s), "tv"] %in% 1:2 &
    sb[char(s), "itb.caught.s1"] == 0 &
    sb[char(s), "itb.ll.s1"] < 1)
dodge(type, round2 - round1, data = dcast(
    discs.tv2,
    s + type ~ round,
    value.var = "disc")) + boxp

tv2-itb-retest-diffs.png

rd(with(dcast(discs.tv2, s + type ~ round, value.var = "disc"),
    qmean(abs(round2 - round1))))
  value
lo 0.006
mean 0.065
hi 0.154

Looks like there's a trend for scores on the two tests to drift in opposite directions. Rummy, but I have a small sample, anyway. More important is to note that changes in the discount factor generally don't exceed .1. That's good.

How do discount factors (and changes in discount factors) change if we use only 10 (non-catch) trials per round? Can we safely use just 10 trials?

rd(with(
    ss(merge(itb.discs, itb.discs10, by = qw(s, type, round)),
        sb[char(s), "tv"] %in% 1:2 &
        sb[char(s), "itb.caught"] == 0 &
        sb[char(s), "itb.ll"] < 1),
    qmean(abs(disc.x - disc.y))))
  value
lo 0.002
mean 0.051
hi 0.140

.05 units seems like too large an increase in error to accept, since it's on the order of the immediate-retest error for the full, 20-trial estimate.

Notice that the median of obtained discounts is about .75:

dodge(type, disc, data = discs.tv2) + boxp +
    coord_cartesian(ylim = c(0, 1.05))

tv2-itb-discs.png

So, I might as well set the prior density in the bisection test to have median .75 instead of .5. For simplicity, it can remain uniform on either side of .75.
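
For reference, a sketch of such a density (uniform on each side of .75, half the mass on each side; not the production code):

prior.dens = function(x)                  # piecewise uniform, median .75
    ifelse(x < .75, .5 / .75, .5 / .25)
integrate(prior.dens, 0, .75)$value       # 0.5, so the median is .75
integrate(prior.dens, .75, 1)$value       # 0.5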

Here are the criterion variables:

ss(sb, tv %in% 1:2)[criterion.vs]
  bmi tobacco cigpacks gambling exercise healthy.meals floss cc.latefees cc.subpay savings
s0001 22.95737 TRUE 1 2 10 0.60 0 0 0.75 0.05
s0003 24.03070 FALSE   2 25 0.70 7 0 0.12 0.10
s0004 27.89405 FALSE   0 20 0.75 0 0 0.00 0.20
s0005 26.13806 FALSE   0 16 0.90 0 0 0.00 0.05
s0006 19.34512 FALSE   0 9 0.70 2 1 0.01 0.00
s0007 27.97930 FALSE   0 8 0.70 4 0 0.00 0.00
s0008 24.37399 FALSE   0 70 0.90 3 3 1.00 0.00
s0010 24.10524 FALSE   0 15 0.95 7 0 0.00 0.30
s0012 23.40337 FALSE   0 8 0.90 9 0 0.00 0.80
s0013 20.94104 TRUE 7 0 10 0.10 0 0 0.48 0.00
s0014 27.12460 FALSE   0 5 0.80 3 0 0.50 0.20
s0016 18.98722 FALSE   0 0 0.10 0 0 0.99 0.00
s0017 24.67918 FALSE   0 5 0.05 0 1 0.00 0.20
s0018 28.40975 TRUE 1 1 3 0.90 0 0 0.24 0.05

TV 3, 4

This time I used just free-response matching.

rd(transform(
    dcast(
        ss(itm.discs, sb[char(s), "tv"] %in% 3:4),
        s + type ~ round,
        value.var = "disc"),
    change = round2 - round1))
  s type round1 round2 change
1 s0020 near 0.673 0.692 0.018
2 s0020 far 0.738 0.655 -0.082
3 s0021 near 0.033 0.033 0.000
4 s0021 far 0.500 0.500 0.000
5 s0022 near 0.831 0.837 0.006
6 s0022 far 0.829 0.811 -0.019
7 s0023 near 0.913 0.921 0.007
8 s0023 far 0.908 0.894 -0.015
9 s0024 near 0.546 0.558 0.011
10 s0024 far 0.569 0.465 -0.105
11 s0025 near 0.913 0.861 -0.052
12 s0025 far 0.903 0.836 -0.067
13 s0026 near 0.843 0.910 0.067
14 s0026 far 1.000 1.000 0.000
15 s0028 near 0.815 0.859 0.044
16 s0028 far 0.908 0.733 -0.174
17 s0029 near 0.597 0.669 0.072
18 s0029 far 0.333 0.541 0.208
19 s0030 near 0.809 0.776 -0.033
20 s0030 far 0.859 0.800 -0.059
21 s0031 near 0.450 0.463 0.013
22 s0031 far 0.488 0.472 -0.016
23 s0032 near 0.500 0.505 0.005
24 s0032 far 0.500 0.504 0.004
25 s0034 near 0.262 0.411 0.149
26 s0034 far 0.466 0.437 -0.029
27 s0035 near 0.357 0.273 -0.084
28 s0035 far 0.187 0.227 0.040

discs.tv4 = ss(itm.discs,
    sb[char(s), "tv"] %in% 3:4 &
    sb[char(s), "itm.backwards.s1"] == 0)
dodge(type, round2 - round1, data = dcast(
    discs.tv4,
    s + type ~ round,
    value.var = "disc")) + boxp

tv4-itm-retest-diffs.png

rd(with(dcast(discs.tv4, s + type ~ round, value.var = "disc"),
    qmean(abs(round2 - round1))))
  value
lo 0.004
mean 0.057
hi 0.188

The opposite-direction drift pattern shows up again, but now the error is, if anything, smaller. Nice.

Again, let's ask how estimates would change if we halved the number of trials.

rd(with(
    ss(merge(itm.discs, itm.discs5, by = qw(s, type, round)),
        sb[char(s), "tv"] %in% 3:4 &
        sb[char(s), "itm.backwards.s1"] == 0),
    qmean(abs(disc.x - disc.y))))
  value
lo 0.000
mean 0.019
hi 0.060

That's actually not a bad increase in error considering how quick the test becomes if it's shortened to 5 trials. What about if we use only the first trial?

rd(with(
    ss(merge(itm.discs, ss(itm, tn == 1), by = qw(s, type, round)),
        sb[char(s), "tv"] %in% 3:4 &
        sb[char(s), "itm.backwards.s1"] == 0),
    qmean(abs(ssr/equiv - disc))))
  value
lo 0.000
mean 0.067
hi 0.299

Yeah, I didn't think that would work as well. Now let's look at the retest errors for the 5-trial version.

dodge(type, round2 - round1, data = dcast(
    ss(itm.discs5,
        sb[char(s), "tv"] %in% 3:4 &
        sb[char(s), "itm.backwards.s1"] == 0),
    s + type ~ round,
    value.var = "disc")) + boxp

tv4-itm5-retest-diffs.png

rd(with(
    dcast(
        ss(itm.discs5, sb[char(s), "tv"] %in% 3:4 & sb[char(s), "itm.backwards.s1"] == 0),
        s + type ~ round,
        value.var = "disc"),
    qmean(abs(round2 - round1))))
  value
lo 0.008
mean 0.063
hi 0.172

Not as good, but it's hard to say how much worse, or if the difference is meaningful. It isn't obviously better or worse than the earlier figure tv2-itb-retest-diffs.

dodge(type, disc, data = discs.tv4) + boxp +
    coord_cartesian(ylim = c(0, 1.05))

tv4-itm-discs.png

Very similar to last time, with perhaps a bit more variability.

Planning for the real thing

There are six intertemporal-choice tests, namely, one immediate–delayed version and one delayed–delayed version of bisection, matching, and Kirby-style fixed forced-choice.

This last kind of test, which I haven't tried out yet, may as well use the medium reward size (since this is perhaps the most reliable, going by table 2 of Kirby, 2009). The front-end delay we add could be 30 days for consistency with the other tests.

  • Session 1
    • The intertemporal-choice tests, in random order
    • Again, also in random order
    • Criterion tests
  • Session 2
    • The intertemporal-choice tests, in random order

A reasonable retest interval would be 5 weeks, but should we pay any heed to the relationship between the retest interval and the front-end delays?

(Which session-1 tests predict session-2 near better, near or far? It's probably better for the interval to be bigger than the front-end delay than smaller.)

Administering the intertemporal tests only once in session 1 would be one way to keep costs down.

Scoring the fixed-quartet test

How shall we score this test? Imitating Kirby's (2009) method as closely as possible is probably worth doing at some point, but it is not necessarily the best choice of scoring procedure. In particular, I would like a scoring procedure that is unaffected by front-end delay, for easier comparison between the near and far conditions, whereas Kirby (2009) uses hyperbolic discount rates.

The Kirby items are deliberately designed such that the logarithms of their hyperbolic discount rates of indifference are about equally spaced. This seems to also be the case for exponential discount rates:

kirby = data.frame(
    ssr = c(54, 47, 54, 49, 40, 34, 27, 25, 20),
    llr = c(55, 50, 60, 60, 55, 50, 50, 60, 55),
    delay = c(117, 160, 111, 89, 62, 30, 21, 14, 7))
rd(diff(with(kirby,
    log( (log(llr) - log(ssr)) / (delay - 0) ))))
  value
  0.903 0.898 0.874 0.814 0.917 0.825 0.757 0.838

However, I really don't want to be seen as taking a side on the exponential–hyperbolic debate here. And besides, if the spacing is uniform, the log discount rate is roughly a linear function of the item's rank, so scoring with log discount rates should be about equivalent to scoring with the "rank" of the item (1, 2, etc.), which is a lot simpler. So let's do that.

For consistency with Kirby, we'll assign to each subject their most consistent rank, resolving ties by averaging all consistent ranks (using an arithmetic mean in our case since we aren't following this up with logarithms). To ensure that most scores are integers on the scale of 0 to 20, we'll also double them and subtract 2.
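
Here's a minimal sketch of that scoring rule for a single subject. The choices vector and the rank indexing (0 through the number of items) are illustrative assumptions, so the exact range of the rescaled score, and whether higher means more or less patience, depends on how the real items are ordered and counted.

# choices[i] is TRUE if the subject chose LL on the item of rank i, where items
# are ordered by increasing discount rate of indifference. A subject whose own
# rate sits at rank r should choose SS on the first r items and LL on the rest.
score.fixed = function(choices)
   {n = length(choices)
    consistency = sapply(0 : n, function(r)
        sum(!choices[seq_len(r)]) + sum(choices[seq_len(n - r) + r]))
    best = which(consistency == max(consistency)) - 1
    2 * mean(best) - 2}   # average tied ranks arithmetically, then double and subtract 2

score.fixed(c(F, F, F, T, T, T, T, T, T))  # perfectly consistent at rank 3 -> score 4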

Single-test analyses

Current sample size:

c("All subjects" = sum(with(sb, tv %in% 5:8)),
    "Included subjects only" = sum(with(sb, tv %in% 5:8 & good)))
  value
All subjects 200
Included subjects only 181

Fixed quartets

Here are all the scores in the near and far conditions (with two data points per subject, one for each round):

dodge(type, patience, discrete = T, stack.width = 20,
    data = ss(itf.scores, sb[char(s), "tv"] %in% 5:8 & s %in% good.s)) + boxp

r1-itf-near-vs-far.png

Here are the absolute retest differences in the near and far conditions:

dodge(type, abs(round2 - round1), discrete = T, stack.width = 20, data = dcast(
    ss(itf.scores, sb[char(s), "tv"] %in% 5:8 & s %in% good.s),
    s + type ~ round,
    value.var = "patience")) + boxp

r1-itf-retest-diffs.png

Here are test–retest correlations:

d = dcast(ss(itf.scores, sb[char(s), "tv"] %in% 5:8 & s %in% good.s),
    s ~ type + round, value.var = "patience")
rd(d = 2, data.frame(
   row.names = qw(near, far),
   pearson = c(cor(d$near_round1, d$near_round2, method = "pearson"),
       cor(d$far_round1, d$far_round2, method = "pearson")),
   spearman = c(cor(d$near_round1, d$near_round2, method = "spearman"),
       cor(d$far_round1, d$far_round2, method = "spearman"))))
  pearson spearman
near 0.85 0.81
far 0.72 0.71

Bisection

Here are all the scores in the near and far conditions (with two data points per subject, one for each round):

dodge(type, disc,
        data = ss(itb.discs, sb[char(s), "tv"] %in% 5:8 & s %in% good.s)) +
    scale_y_continuous(breaks = seq(0, 1, .2)) +
    coord_cartesian(ylim = c(-.05, 1.05)) +
    boxp

r1-itb-near-vs-far.png

Here are the absolute retest differences in the near and far conditions:

fig = dodge(type, abs(round2 - round1), data = dcast(
    ss(itb.discs, sb[char(s), "tv"] %in% 5:8 & s %in% good.s),
    s + type ~ round,
    value.var = "disc")) + boxp
fig

r1-itb-retest-diffs.png

Let's zoom in:

fig + coord_cartesian(ylim = c(-.01, .11))

r1-itb-retest-diffs-zoomin.png

Here are test–retest correlations:

d = dcast(ss(itb.discs, sb[char(s), "tv"] %in% 5:8 & s %in% good.s),
    s ~ type + round, value.var = "disc")
rd(d = 2, data.frame(
   row.names = qw(near, far),
   pearson = c(cor(d$near_round1, d$near_round2, method = "pearson"),
       cor(d$far_round1, d$far_round2, method = "pearson")),
   spearman = c(cor(d$near_round1, d$near_round2, method = "spearman"),
       cor(d$far_round1, d$far_round2, method = "spearman"))))
  pearson spearman
near 0.76 0.80
far 0.81 0.81

Matching

Here are all the scores in the near and far conditions (with two data points per subject, one for each round):

dodge(type, disc,
    data = ss(itm.discs, sb[char(s), "tv"] %in% 5:8 & s %in% good.s)) +
    scale_y_continuous(breaks = seq(0, 1, .2)) +
    coord_cartesian(ylim = c(-.05, 1.05)) +
    boxp

r1-itm-near-vs-far.png

Here are the absolute retest differences in the near and far conditions:

fig = dodge(type, abs(round2 - round1), data = dcast(
    ss(itm.discs, sb[char(s), "tv"] %in% 5:8 & s %in% good.s),
    s + type ~ round,
    value.var = "disc")) + boxp
fig

r1-itm-retest-diffs.png

Let's zoom in:

fig + coord_cartesian(ylim = c(-.01, .11))

r1-itm-retest-diffs-zoomin.png

Here are test–retest correlations:

d = dcast(ss(itm.discs, sb[char(s), "tv"] %in% 5:8 & s %in% good.s),
    s ~ type + round, value.var = "disc")
rd(d = 2, data.frame(
   row.names = qw(near, far),
   pearson = c(cor(d$near_round1, d$near_round2, method = "pearson"),
       cor(d$far_round1, d$far_round2, method = "pearson")),
   spearman = c(cor(d$near_round1, d$near_round2, method = "spearman"),
       cor(d$far_round1, d$far_round2, method = "spearman"))))
  pearson spearman
near 0.88 0.89
far 0.91 0.91

Criterion variables

transform(ss(sb, tv %in% 5:8 & good)[c("height.raw", "weight.raw", criterion.vs)],
    bmi = round(bmi))
  height.raw weight.raw bmi tobacco cigpacks gambling exercise healthy.meals floss cc.latefees cc.subpay savings
s0038     27 FALSE   0 14 1.00 14     0.00
s0042     22 FALSE   0 20 0.50 0 0 0.01 0.00
s0046     26 FALSE   0 8 0.90 0 0 1.00 0.30
s0051     42 FALSE   0 3 0.25 0 0 0.00 0.00
s0052     32 FALSE   0 4 0.02 7 2 0.01 0.20
s0053     24 FALSE   0 10 0.00 7 0 0.00 0.00
s0055     21 FALSE   0 2 0.75 14 0 0.40 0.20
s0056     25 FALSE   0 10 0.85 0 0 0.00 0.00
s0059     35 FALSE   1 3 0.75 0     0.06
s0062     20 FALSE   0 14 0.60 7 0 0.00 0.10
s0064     23 FALSE   3 10 0.67 7 0 0.00 0.10
s0066     23 FALSE   0 5 0.85 7 0 0.00 0.20
s0067     30 FALSE   0 10 0.75 7 0 0.00 0.35
s0068     22 FALSE   0 0 0.00 0 0 0.00 0.20
s0070     25 FALSE   0 7 0.80 0 0 0.00 0.30
s0078     30 FALSE   1 0 0.00 0 0 0.80 0.00
s0080     37 FALSE   0 10 0.25 5 0 0.40 0.05
s0082     24 FALSE   0 15 0.67 2 3 0.90 0.10
s0083     32 FALSE   0 25 0.10 7     0.01
s0084     23 FALSE   0 10 0.05 0     0.00
s0085 69in 147lbs 22 FALSE   2 2 0.75 4     0.40
s0087 5ft 105 pounds 21 TRUE 2 1 12 1.00 4     0.01
s0088 5 ft 5 in 140 pounds 23 FALSE   4 6 0.50 3 2 0.02 0.50
s0091 6 ft 5 in 170 pounds 20 FALSE   0 12 0.00 5 1 0.22 0.15
s0092 6 ft 1 in 150 pounds 20 FALSE   0 10 0.25 0 3 1.00 0.00
s0095 5 ft 6 in 115 pounds 19 FALSE   0 5 0.80 6 0 0.03 0.15
s0096 5 ft 11 in 180 pounds 25 FALSE   4 3 0.60 0 0 0.25 0.10
s0099 5 ft 5 in 190 pounds 32 FALSE   0 5 0.80 12 0 0.90 0.10
s0100 5 ft 6 in 133 pounds 21 FALSE   0 0 0.50 1 0 0.00 0.10
s0104 6 ft 0 in 207 pounds 28 FALSE   0 3 0.60 0 0 0.90 0.00
s0106 5 ft 1 in 150 pounds 28 FALSE   0 3 1.00 7 0 1.00 0.05
s0107 5 ft 10 in 200 pounds 29 FALSE   0 3 1.00 7 2 1.00 0.30
s0108 5 ft 11 in 155 pounds 22 TRUE 7 0 21 1.00 14 0 0.00 0.63
s0111 5 ft 6 in 120 lb 19 FALSE   0 3 1.00 7 0 0.00 0.15
s0113 6 ft 210 pounds 28 TRUE 2 0 10 0.30 21     0.00
s0116 6 ft 1 in 170 pounds 22 FALSE   2 4 0.02 1 0 0.00 0.15
s0117 5 ft 8 in 155 pounds 24 FALSE   0 5 0.80 4 0 0.00 1.00
s0118 5 ft 9 in 176 pounds 26 FALSE   0 10 0.00 4     0.00
s0119 5 ft 5 in 130 pounds 22 TRUE 6 7 5 0.25 7 0 0.00 0.10
s0120 6ft 2 in 205 pounds 26 FALSE   0 3 0.03 3     0.10
s0121 5 ft 10 in 172 pounds 25 TRUE 7 0 15 0.85 7 0 0.00 0.15
s0123 5 ft 10 in 190 pounds 27 FALSE   0 4 0.02 7 0 0.05 0.15
s0124 5 ft 6 in 160 pounds 26 TRUE 4 0 16 0.80 7     0.00
s0125 5 ft 8 in 160 pounds 24 FALSE   0 1 0.00 1 0 0.20 0.10
s0126 5 ft 2 in 250 pounds 46 FALSE   0 1 0.00 0 0 0.75 0.05
s0127 5 ft 3 in 230 pounds 41 TRUE 7 0 4 0.35 0     0.09
s0128 5 ft 6 in 145 pounds 23 FALSE   0 0 0.40 0 0 0.00 0.20
s0129 5 ft 6 in 130 pounds 21 TRUE 7 0 4 0.90 7 0 0.00 0.10
s0132 5 ft 9 in 150 lbs 22 FALSE   0 1 0.90 0 10 0.99 0.00
s0135 6 ft 3 in 180 pounds 22 FALSE   0 6 0.50 0 0 0.25 0.10
s0137 5 ft 11 in 170 pounds 24 TRUE 3 0 2 0.00 0     0.00
s0138 6 ft 3 in 190 pounds 24 FALSE   0 4 0.25 7     0.00
s0139 6 ft 1 in 285 pounds 38 FALSE   0 10 0.50 5 0 0.90 0.00
s0140 5 ft 7 in 130 lbs 20 FALSE   0 5 0.20 1 3 0.50 0.23
s0141 5ft 92lbs 18 TRUE 1 0 20 0.80 0 1 0.60 0.10
s0142 5 Ft 10 IN 240lbs 34 FALSE   0 5 0.00 0     0.00
s0143 65in 165 pounds 27 FALSE   1 5 0.65 5     0.00
s0149 5ft 7in 140 pounds 22 TRUE 7 0 35 0.60 2 0 0.00 0.10
s0152 5 ft 5 in 135 pounds 22 FALSE   0 2 0.10 5     0.01
s0159 6ft 185 lbs 25 FALSE   0 10 0.95 5     0.00
s0167 5 ft 6in 135 lbs 22 FALSE   1 3 0.60 6 0 0.02 0.60
s0185 5 ft 6 in 140 pounds 23 TRUE 2 0 1 0.90 7     0.60
s0192 5 ft 8 in 145 pounds 22 FALSE   0 6 0.85 7 0 0.50 0.25
s0195 5 ft 10 in 168 pounds 24 FALSE   0 6 0.50 2 0 0.00 0.25
s0196 68 inches 135 pounds 21 FALSE   0 5 1.00 7 0 0.02 0.20
s0198 6 ft 4 in 170 lb 21 FALSE   0 15 0.01 14     0.25
s0199 5 ft 9 in 150 pounds 22 FALSE   30 3 0.70 7     0.25
s0204 5 ft 2 in 145 pounds 27 FALSE   0 0 0.50 0 3 1.00 0.05
s0205 72 in 178 pounds 24 FALSE   0 4 0.30 0 0 0.00 0.05
s0206 5ft8in 148 pounds 23 FALSE   0 2 0.75 2 0 1.00 0.10
s0207 5 ft 5 in 130 pounds 22 FALSE   0 2 0.25 7 0 1.00 0.05
s0208 6ft 165 pounds 22 TRUE 3 0 25 0.10 7 8 0.25 0.00
s0209 6 ft 140 pounds 19 FALSE   0 35 0.40 0     0.10
s0210 5 ft 4 in 207 pounds 36 FALSE   0 4 0.00 7 4 0.15 0.05
s0213 5 ft 6 in 133 pounds 21 FALSE   0 0 0.75 0     0.10
s0216 5 ft 11 in 200 pounds 28 TRUE 4 0 5 0.80 7     0.10
s0218 5 ft. 5 in 150 pounds 25 FALSE   0 3 0.50 14 0 0.00 0.10
s0220 5ft 11in 190 lb 26 FALSE   0 5 0.50 0     0.00
s0222 5 ft 9 in 180 lbs 27 TRUE 2 15 10 0.33 3 0 0.90 0.20
s0223 6 ft 1 in 200 lbs 26 FALSE   0 3 0.50 2     0.00
s0224 5.9 ft 203 pound 28 TRUE 3 0 2 0.00 7 0 0.00 0.10
s0225 5 ft 4 inches 148 lbs. 25 FALSE   0 4 0.90 3 2 1.00 0.06
s0227 5ft6in 130 pounds 21 FALSE   0 19 1.00 4 0 1.00 0.00
s0230 6 ft 2 in 190 pounds 24 TRUE 3 0 3 0.06 5 0 0.24 0.05
s0231 5 ft 10 in 205 pounds 29 FALSE   0 3 0.20 2 40 1.00 0.00
s0232 6ft 1inch 170 pounds 22 FALSE   0 5 0.75 0     0.05
s0234 5 ft 1 in 140 lbs 26 FALSE   0 3 0.50 7     0.00
s0235 5 ft 5 in 145 pounds 24 FALSE   0 5 0.50 7 2 1.00 0.00
s0237 5 ft 8 in 154 pounds 23 TRUE 3 1 6 0.07 0 0 0.00 0.20
s0238 6ft 190 pounds 26 FALSE   1 10 0.10 1 3 0.99 0.05
s0239 6ft 2in 190 pounds 24 FALSE   5 10 0.90 7 0 0.25 0.00
s0241 6 ft 0 in 295 pounds 40 FALSE   0 5 0.60 7 0 0.00 0.20
s0242 5 ft 4 in 128 pounds 22 FALSE   0 5 0.90 7 0 0.00 0.50
s0243 6 ft 2 in 205 pounds 26 FALSE   0 1 0.25 2 0 0.12 0.05
s0244 6ft 250 pounds 34 FALSE   0 10 1.00 7 0 0.00 0.00
s0245 5ft 6inches 180 pounds 29 TRUE 7 0 25 0.10 7     0.05
s0246 5 ft 7 in 132 pounds 21 FALSE   0 6 0.20 1     0.20
s0248 5 ft 4 in 112 pounds 19 FALSE   0 6 0.40 7 0 0.95 0.07
s0249 5 ft 7 in 132 pounds 21 TRUE 5 0 5 0.30 7 0 0.03 0.00
s0250 5 ft 9 in 165 pounds 24 FALSE   0 5 0.05 0 0 0.00 0.25
s0252 6ft 1in 173 pounds 23 FALSE   4 15 0.03 7 1 0.00 0.40
s0253 6 ft 175 pounds 24 TRUE 4 0 2 0.20 6 0 0.90 0.05
s0258 6 ft 230 pounds 31 FALSE   0 6 0.20 14 0 0.00 0.01
s0260 5ft 11in 170 pounds 24 FALSE   0 10 0.01 4     0.10
s0261 5ft 5in 101 pounds 17 FALSE   0 1 0.01 7 0 0.00 0.05
s0262 6 ft 0 in 165 pounds 22 TRUE 4 0 8 0.60 0 0 0.25 0.10
s0263 5 ft 2 in 122lbs 22 FALSE   0 5 0.65 7 0 0.00 0.10
s0264 6ft 157 pounds 21 FALSE   0 10 0.80 3 2 0.24 0.00
s0265 5 ft 7 in 200 lbs 31 FALSE   0 3 0.00 0 0 0.30 0.10
s0267 5 ft 5 in 115 pounds 19 FALSE   0 5 0.90 3 2 0.24 0.24
s0268 6 ft 2 in 185 pounds 24 FALSE   0 1 0.02 5 0 0.00 0.42
s0270 5 ft 11 in 230 pounds 32 FALSE   0 2 0.65 2 0 0.55 0.25
s0271 5 ft 11 in 165 pounds 23 FALSE   0 6 0.50 3 0 0.60 0.15
s0272 6 ft 2 in 223 lbs 29 TRUE 4 4 6 0.75 2 0 0.22 0.18
s0273 6ft0in 175 pounds 24 FALSE   0 10 0.00 0 5 0.10 0.00
s0274 6 ft 3 in 198 pounds 25 FALSE   0 5 0.65 0 0 0.95 0.06
s0275 5 ft 7 in 178 pounds 28 FALSE   0 30 0.40 0     0.90
s0276 5ft 4 in 170 pounds 29 TRUE 2 0 0 0.50 0     0.00
s0277 5 ft 7 in 170 pounds 27 TRUE 7 0 7 0.80 14 1 0.75 0.05
s0278 5 ft 1 in 127 pounds 24 FALSE   0 0 0.10 0 0 0.00 0.20
s0283 5 ft 4 in 190 pounds 33 FALSE   1 3 0.00 14 0 0.00 0.10
s0284 5 ft 4 in 166 pounds 28 TRUE 7 2 3 0.30 14 1 0.10 0.00
s0285 5 ft 4 in 138 pounds 24 FALSE   0 6 1.00 7     0.20
s0286 5ft 8in 150pounds 23 TRUE 7 0 40 0.75 15 0 0.12 0.10
s0287 5 ft 8 in 160 pounds 24 TRUE 7 0 5 0.80 7     0.30
s0289 6 ft 2 in 185 pounds 24 FALSE   0 4 0.15 5 2 0.15 0.08
s0290 5 ft 9 in 145 pounds 21 FALSE   0 5 0.60 7 0 0.00 0.15
s0292 5ft 9in 155 lbs 23 FALSE   1 7 0.50 10 0 0.80 0.05
s0294 5 ft 5 in 150 pounds 25 FALSE   0 3 0.10 5 1 0.12 0.10
s0295 5ft 10in 150 pounds 22 FALSE   5 6 0.80 5 1 0.00 0.15
s0299 5 ft 6 in 183 pounds 30 TRUE 2 0 7 0.00 7 12 0.20 0.00
s0300 6 ft 2 in 280 lbs 36 FALSE   0 12 0.75 24 0 1.00 0.20
s0301 5 ft 6 in 185 pounds 30 FALSE   0 14 0.50 5 0 0.00 0.00
s0302 5 ft 8 in 190 pounds 29 FALSE   0 1 0.60 1 0 0.50 0.00
s0304 5 ft 7 in 141 pounds 22 FALSE   0 5 0.70 0 0 0.00 0.30
s0305 6 ft 160 pounds 22 FALSE   0 7 0.50 3 0 0.00 0.00
s0307 6ft 170 pounds 23 TRUE 1 0 4 0.80 6 0 0.00 0.75
s0308 5 ft. 7 in. 165 pounds 26 FALSE   0 2 0.95 0 0 0.35 0.10
s0309 5 ft 8 in 355 pounds 54 FALSE   0 0 0.02 0 1 0.24 0.00
s0310 5 ft 4 in 150 lbs 26 TRUE 2 3 5 0.60 0 0 0.30 0.30
s0312 5 ft 4 in 175 pounds 30 TRUE 3 0 6 0.45 0 0 0.00 0.00
s0313 5 ft 11 in 190 pounds 26 TRUE 14 0 5 0.80 0 0 0.00 0.10
s0314 5 ft 4 in 136 pounds 23 FALSE   0 8 0.75 7     0.20
s0315 5 ft 6 in 200 pounds 32 TRUE 3 0 5 0.25 7 0 0.50 0.10
s0316 5ft 11in 195 pounds 27 FALSE   0 4 0.55 1 2 0.99 0.01
s0317 5 ft 5 in 155 pounds 26 TRUE 1 0 9 0.60 5 1 0.60 0.20
s0318 6ft 1in 215lb 28 FALSE   0 4 0.50 2 0 0.75 0.20
s0319 6 ft 170 pounds 23 TRUE 6 0 8 0.06 0     0.00
s0321 6 ft 3 in 230 pounds 29 FALSE   0 18 0.95 7     0.50
s0322 5 ft 7 in 130 pounds 20 TRUE 2 0 4 0.00 0     0.00
s0323 5 ft 11 in 160 pounds 22 FALSE   1 1 0.25 0 0 0.00 0.15
s0326 5 ft 7 in 137 pounds 21 FALSE   0 7 0.50 0     0.10
s0327 5 ft 10 in 181 pounds 26 TRUE 2 0 2 0.70 0     0.01
s0328 64 in 157 pounds 27 FALSE   0 4 0.25 3 3 0.95 0.00
s0329 5 ft 4 in 150 pounds 26 FALSE   0 12 0.80 1 6 0.75 0.05
s0331 6 ft 0 in 170 pounds 23 FALSE   0 10 0.80 2 0 0.25 0.05
s0334 5 ft 10 in 140 lbs 20 FALSE   0 7 0.00 0 0 0.00 0.00
s0335 5 ft 9 in 180 pounds 27 FALSE   0 5 0.00 7 0 0.24 0.15
s0337 5 ft 10 in 160 pounds 23 FALSE   0 5 0.80 2     0.10
s0338 6ft 3in 186 pounds 23 FALSE   0 0 0.75 7 0 0.00 0.15
s0340 6 ft 0 in 242 pounds 33 TRUE 7 2 0 0.03 0 0 0.24 0.10
s0341 5ft 4in 155 pounds 27 FALSE   0 5 0.85 2 5 0.90 0.10
s0342 6ft 1 in 200 pounds 26 FALSE   0 5 0.50 0 0 1.00 0.00
s0343 5 ft 11 in 180 pounds 25 TRUE 5 0 3 0.20 4 0 0.00 0.10
s0344 5 ft 8 in 170 pounds 26 FALSE   1 10 0.03 1     0.10
s0345 6 ft 5 in 270 pounds 32 TRUE 3 0 7 0.50 4 0 0.00 0.50
s0346 5 ft 8 in 135 pounds 21 FALSE   0 0 0.80 7     0.10
s0347 5 ft 11 in 135 pounds 19 FALSE   0 1 0.10 0 0 0.24 0.05
s0348 5 ft 6 in 128 pounds 21 FALSE   0 7 1.00 14 0 0.00 0.20
s0349 5 ft 4 in 105 pounds 18 FALSE   0 2 0.05 1 0 0.01 0.40
s0350 5 ft 8 in 145 pounds 22 FALSE   0 7 0.90 14 0 1.00 0.90
s0353 5ft 10in 205 pounds 29 FALSE   5 10 0.75 1 1 0.50 0.20
s0354 5 ft 0 in 100 pounds 20 TRUE 14 0 5 0.01 0     0.03
s0358 5 ft 11 in 230 pounds 32 FALSE   1 7 0.50 0 3 0.75 0.05
s0359 5 ft 11 in 160 pounds 22 TRUE 0 0 3 0.75 6     0.20
s0361 6 ft 150 pounds 20 FALSE   0 15 0.70 7 0 0.00 0.05
s0362 5 ft 3 in 135 pounds 24 TRUE 2 5 35 0.85 7     0.01
s0363 6 ft 2 in 230 pounds 30 FALSE   15 1 0.00 4 0 0.00 0.00
s0364 6 ft 1 in 230 pounds 30 FALSE   1 10 0.75 2 0 0.00 0.25
s0368 73 inches 195 pounds 26 FALSE   0 5 0.30 6 0 0.00 0.36
s0369 64inches 128 pounds 22 FALSE   0 2 0.00 0     0.00

Relationships between econometric measures

We will consider only round 1.

rd(d = 2, abbcor(ss(sb, good & tv %in% 5:8)[qw(itf.patience.near.r1, itb.disc.near.r1, itm.disc.near.r1)],
    method = "kendall"))
  itb.disc.near.r1 itm.disc.near.r1
itf.patience.near.r1 0.65 0.46
itb.disc.near.r1   0.48
rd(d = 2, abbcor(ss(sb, good & tv %in% 5:8)[qw(itf.patience.far.r1, itb.disc.far.r1, itm.disc.far.r1)],
    method = "kendall"))
  itb.disc.far.r1 itm.disc.far.r1
itf.patience.far.r1 0.55 0.42
itb.disc.far.r1   0.49

It looks like the different measures are generally related to each other, but not so tightly that one can be inferred from another. Good, that's what I expected.

Convergent prediction

convergent.validity = function(dist, ses)
   {linreg.r2 = function(iv, dv, min.y, max.y)
       {pred = crossvalid(iv, dv, nfold = 10, f = function(train.iv, train.dv, test.iv)
           {model = lm(y ~ x, data.frame(x = train.iv, y = train.dv))
            predict(model, newdata = data.frame(x = test.iv))})
        pred = pmin(max.y, pmax(min.y, pred))
        1 - sum((dv - pred)^2) / sum((dv - mean(dv))^2)}
    nameify = function(f, x)
        f(x, qw(Fixed, Bisection, Matching))
    nameify(`rownames<-`, nameify(`colnames<-`, rd(d = 2, 
        do.call(cbind, lapply(list(itf.scores, itb.discs, itm.discs), function(dv.d)
           {sapply(list(itf.scores, itb.discs, itm.discs), function(iv.d)
               {if (identical(iv.d, dv.d))
                    return(NA)
                iv = ss(iv.d, s %in% ses & round == "round1" & type == dist)[,ncol(iv.d)]
                dv = ss(dv.d, s %in% ses & round == "round1" & type == dist)[,ncol(dv.d)]
                linreg.r2(iv, dv,
                    min.y = 0,
                    max.y = if (identical(dv.d, itf.scores)) 20 else 1)})})))))}
convergent.validity("near", good.s.5to8)
  Fixed Bisection Matching
Fixed   0.51 0.30
Bisection 0.52   0.34
Matching 0.31 0.34  
convergent.validity("far", good.s.5to8)
  Fixed Bisection Matching
Fixed   0.41 0.24
Bisection 0.40   0.43
Matching 0.24 0.44  

Stationarity

We will get stationarity scores, or rather, nonstationarity scores, by subtracting patience with no front-end delay from patience with a 30-day front-end delay. If far − near > 0, then far > near, so people are more patient in the far case; thus positive scores on these measures mean classical nonstationarity (being more patient about future tradeoffs than about present ones) and negative scores mean backwards nonstationarity.

d = with(ss(dfn(sb), good & tv %in% 5:8), rbind(
    data.frame(type = "F", s = RNAME, v = itf.patience.far.r1 - itf.patience.near.r1),
    data.frame(type = "B", s = RNAME, v = itb.disc.far.r1 - itb.disc.near.r1),
    data.frame(type = "M", s = RNAME, v = itm.disc.far.r1 - itm.disc.near.r1)))

First, the count of subjects with each nonstationarity direction for each measure.

t(daply(d, .(type), function(d) table(factor(sign(d$v), levels = c(-1, 0, 1)))))
  F B M
-1 75 95 100
0 64 0 10
1 42 86 71

Surprisingly, backwards nonstationarity is more common than classical nonstationarity. This replicates Sayman and Öncüler's (2009) finding for (real, longitudinally observed) dynamic inconsistency.

Here is a (not terribly useful) figure.

dodge(type, v, data = transform(d, v = ifelse(type == "F",
      v/20 + runif(nrow(d), -.025, .025),
      v))) +
   coord_cartesian(ylim = c(-.5, .5)) +
   boxp

itb-stationarity.png

This information is more easily digested from a table of quantiles.

t(daply(d, .(type), function(d) rd(quantile(d$v, c(.025, .25, .5, .75, .975)))))
  F B M
2.5% -7.000 -0.254 -0.189
25% -2.000 -0.044 -0.067
50% 0.000 -0.001 -0.008
75% 0.000 0.055 0.031
97.5% 7.000 0.282 0.275

We see that for the most part, nonstationarity was small. That is, subjects' choices were not much affected by the front-end delay. Here are quantiles for the absolute differences:

t(daply(d, .(type), function(d) rd(quantile(abs(d$v), c(.25, .5, .75, .95)))))
  F B M
25% 0.000 0.017 0.015
50% 2.000 0.051 0.048
75% 3.000 0.115 0.098
95% 7.000 0.275 0.224

Concordance of stationarity between families

Here's the percent of stationarity scores that agreed in sign:

sign.agreement = function(d, t1, t2)
   `names<-`(mean(sign(ss(d, type == t1)$v) == sign(ss(d, type == t2)$v)),
       paste(t1, "and", t2))

rd(c(
   sign.agreement(d, "F", "B"),
   sign.agreement(d, "F", "M"),
   sign.agreement(d, "B", "M")))
  value
F and B 0.337
F and M 0.376
B and M 0.519

Presumably the comparisons involving F have clearly lower percentages than the comparison between B and M because only in the fixed test, due to its discrete nature, could one readily get a stationarity score of 0.

Removing from the dataset all subjects with a stationarity score of 0 in any family yields:

d2 = ss(d, !(s %in% ss(d, v == 0)$s))
c("New sample size" = length(unique(d2$s)))
  value
New sample size 112

Now let's redo the sign-agreement analysis:

d2 = ss(d, !(s %in% ss(d, v == 0)$s))

rd(c(
   sign.agreement(d2, "F", "B"),
   sign.agreement(d2, "F", "M"),
   sign.agreement(d2, "B", "M")))
  value
F and B 0.518
F and M 0.562
B and M 0.545

Now the three figures look about the same.

Here are Kendall correlations of stationarity scores (back to including subjects with stationarity 0):

mag.corr = function(t1, t2)
   `names<-`(cor(ss(d, type == t1)$v, ss(d, type == t2)$v, method = "kendall"),
       paste(t1, "and", t2))

rd(c(
   mag.corr("F", "B"),
   mag.corr("F", "M"),
   mag.corr("B", "M")))
  value
F and B 0.059
F and M 0.032
B and M 0.064

Those look very small. At least they're all positive.

Predicting the criterion variables

A correlation matrix of the criterion variables

round(d = 2, abbcor(
    sb[good.s, qw(bmi, exercise, healthy.meals, savings,
        tobacco, gambling, floss, cc.latefees, cc.subpay)],
    method = "kendall",
    use = "complete.obs"))
  exercise healthy.meals savings tobacco gambling floss cc.latefees cc.subpay
bmi -0.07 -0.08 -0.09 0.03 0.09 -0.02 0.05 0.10
exercise   0.23 -0.03 0.11 0.03 0.13 0.09 -0.01
healthy.meals     0.15 -0.03 -0.08 0.16 -0.06 0.04
savings       -0.01 0.06 0.11 -0.17 -0.22
tobacco         0.15 0.05 -0.06 -0.04
gambling           -0.03 0.00 -0.02
floss             -0.04 -0.10
cc.latefees               0.31

Possibly one could use this correlation matrix to make loose inferences about the maximum predictability of the criterion variables from any single other variable, the logic being that if some variable predicted all of them well, they would necessarily be somewhat predictable from each other. But it's complicated.

Preparing the criterion variables

BMI is positively skewed (a long right tail):

dodge(1, ss(sb, good & tv %in% 5:8)$bmi)

r1-bmi.png

So let's use the logarithm instead, and also clip that high point down to the second-highest point. I wouldn't consider it an outlier, but I don't want to overweight an extreme value like that, for training or for testing.

For smoking, the number of packs of cigarettes has a funny-looking distribution:

table(ss(sb, good & tv %in% 5:8)$cigpacks, useNA = "always")
  count
0 1
1 3
2 10
3 8
4 5
5 2
6 2
7 11
14 2
NA 137

Let's stick to predicting just whether or not the subject uses tobacco, at least to begin with.

For gambling and late fees for paying off credit cards, let us also do dichotomous prediction, since so few of our subjects endorse a count above 0 for either:

table(ss(sb, good & tv %in% 5:8)$gambling > 0)
  count
FALSE 150
TRUE 31
table(ss(sb, good & tv %in% 5:8)$cc.latefees > 0, useNA = "always")
  count
FALSE 100
TRUE 34
NA 47

Notice that, surprisingly, more subjects lack a credit card entirely than admit to having paid even one late fee in the past two years.

Exercise, like BMI, is positively skewed:

dodge(1, exercise, data = ss(sb, good & tv %in% 5:8))

r1-exercise.png

So we'll again use the logarithm (plus one to compensate for zeroes).

For healthy eating, as well as for a few other variables, subjects could enter one of 0%, 1%, 2%, …, 100%. This prohibits immediate use of a logit transformation (since logit(0) and logit(1) are infinite). We will clip 0% to 0.5% and 100% to 99.5% before applying logit.

Flossing has another odd distribution, which suggests a tripartite representation:

with(ss(sb, good & tv %in% 5:8), table(ordered(
    ifelse(floss == 0, "<weekly", 
        ifelse(floss < 7, "<daily", ">=daily")),
    levels = c("<weekly", "<daily", ">=daily"))))
  count
<weekly 53
<daily 63
>=daily 65

For cc.subpay, most subjects with credit cards admit to sub-paying at least once in the last two years…

with(ss(sb, good & tv %in% 5:8), table(cc.subpay > 0))
  count
FALSE 53
TRUE 81

…but the distribution for cc.subpay > 0 is odd, so I think I'll stick to dichotomous prediction.

dodge(1, cc.subpay, data = ss(sb, good & tv %in% 5:8 &
    !is.na(cc.subpay) & cc.subpay > 0))

r1-ccsubpay.png

We'll treat savings like healthy.meals except that we'll clip a single very high point, as for BMI.

sb2 = within(ss(sb, good & tv %in% 5:8),
   {bmi[which.max(bmi)] = sort(bmi, dec = T)[2]
    bmi = log(bmi)
    gambling = gambling > 0
    exercise = log(exercise + 1)
    healthy.meals = logit(pmin(.995, pmax(.005, healthy.meals)))
    floss = ordered(
        ifelse(floss == 0, "<weekly", 
            ifelse(floss < 7, "<daily", ">=daily")),
        levels = c("<weekly", "<daily", ">=daily"))
    cc.latefees = cc.latefees > 0
    cc.subpay = cc.subpay > 0
    savings[which.max(savings)] = sort(savings, dec = T)[2]
    savings = logit(pmin(.995, pmax(.005, savings)))})

vlist = list(
    c("bmi", "ols"),
    c("exercise", "ols"),
    c("healthy.meals", "ols"),
    c("savings", "ols"),
    c("tobacco", "logistic"),
    c("gambling", "logistic"),
    c("floss", "ordinal"),
    # Perform the credit card-related analyses on just the subjects
    # who said they have a credit card.
    c("cc.latefees", "logistic", "remove.nas"),
    c("cc.subpay", "logistic", "remove.nas"))

discrete.vlist = Filter(function(x) x[[2]] == "logistic" || x[[2]] == "ordinal", vlist)
continuous.vlist = Filter(function(x) x[[2]] == "ols", vlist)

First try

m = sapply(vlist, function(x)
   {var = x[1]
    modeltype = x[2]
    remove.nas = length(x) >= 3
    sbf = if (remove.nas) sb2[!is.na(sb2[[var]]),] else sb2
    y = sbf[[var]]
    rd(c((if (modeltype == "logistic") max(mean(y), 1 - mean(y))
        else if (modeltype == "ordinal") max(prop.table(tabulate(y)))
        else sd(y)),
        sapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {d = sbf[paste0(s, c(".near.r1", ".far.r1"))]
        names(d) = qw(near, far)
        if (modeltype == "logistic")
           {pred = crossvalid(d, y, nfold = 10, f = function(train.iv, train.dv, test.iv)
               {model = glm(y ~ near * far, cbind(train.iv, y = train.dv),
                    family = binomial(link = "logit"))
                predict(model, type = "response", newdata = test.iv)})
            mean((pred >= .5) == y)}
        else if (modeltype == "ordinal")
           {pred = crossvalid(d, y, nfold = 10, f = function(train.iv, train.dv, test.iv)
               {model = polr(y ~ near * far, cbind(train.iv, y = train.dv),
                    method = "logistic")
                char(predict(model, type = "class", newdata = test.iv))})
            mean(pred == char(y))}
        else
           {pred = crossvalid(d, y, nfold = 10, f = function(train.iv, train.dv, test.iv)
               {model = lm(y ~ near * far, cbind(train.iv, y = train.dv))
                predict(model, newdata = test.iv)})
            sqrt(mean( (pred - y)^2 ))}})))})

colnames(m) = sapply(vlist, function(x) x[[1]])
m
  bmi exercise healthy.meals savings tobacco gambling floss cc.latefees cc.subpay
  0.181 0.793 2.701 1.840 0.757 0.829 0.359 0.746 0.604
itf.patience 0.186 0.803 2.704 1.877 0.757 0.823 0.304 0.731 0.582
itb.disc 0.185 0.767 2.739 1.845 0.757 0.829 0.370 0.731 0.604
itm.disc 0.186 0.806 2.721 1.864 0.757 0.823 0.331 0.739 0.575

In this table, the top row measures base variability (the SD for continuous variables, and the base rate of the modal class for discrete variables), and the other rows indicate cross-validation estimates of test error (the RMSE for continuous variables, and the correct classification rate for discrete variables).

Surprisingly, despite good preliminary evidence that our econometric tests are reliable (in terms of immediate test–retest reliability and intercorrelations between tests), none of these simple analyses with the criterion variables as DV came close to working. In fact, the only improvement I see in the above table is that the itb-based model predicting exercise achieved an RMSE of .77 compared to the SD of .79. This is not a meaningful improvement. Either the econometric tests are quite useless for predicting the criterion variables, or we need fancier models. Let's try fancier models.

Discrete criterion variables

First, plots. These are crude but hopefully better than nothing.

its = qw(itf.patience.near.r1, itf.patience.far.r1,
    itb.disc.near.r1, itb.disc.far.r1,
    itm.disc.near.r1, itm.disc.far.r1)
dichot.plot.df = transform(
    melt(sb2[c(its, qw(tobacco, gambling, cc.latefees, cc.subpay))],
        id.vars = qw(tobacco, gambling, cc.latefees, cc.subpay)),
    value = ifelse(
        variable %in% qw(itf.patience.near.r1, itf.patience.far.r1),
        value/20 + runif(length(value), -.02, .02),
        value))
dichot.plot = function(var, drop.nas = F)
  {if (drop.nas)
       dichot.plot.df = dichot.plot.df[!is.na(dichot.plot.df[[char(var)]]),]
   eval(bquote(dodge(.(var), value, faceter = variable, data = dichot.plot.df))) +
       scale_x_discrete(breaks = c(0, 1)) +
       coord_cartesian(xlim = c(.5, 2.5)) +
       boxp}
dichot.plot(quote(tobacco))

r1-tobacco-alliv.png

dichot.plot(quote(gambling))

r1-gambling-alliv.png

dichot.plot(quote(cc.latefees), drop.nas = T)

r1-cclatefees-alliv.png

dichot.plot(quote(cc.subpay), drop.nas = T)

r1-ccsubpay-alliv.png

These look a little better than the first-try analyses above suggested.

Let's try nearest-neighbors:

m = rd(sapply(discrete.vlist, function(x)
   {var = x[1]
    remove.nas = length(x) >= 3
    sbf = if (remove.nas) sb2[!is.na(sb2[[var]]),] else sb2
    y = sbf[[var]]
    if (is.ordered(y))
        class(y) = "factor"
    c(max(prop.table(table(y))), sapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {near = sbf[[paste0(s, ".near.r1")]]
        far = sbf[[paste0(s, ".far.r1")]]
        # Since 'near' and 'far' are two slight variations of the
        # same test, we won't rescale them.
        pred = crossvalid(cbind(near, far), y, nfold = 10,
            f = function(train.iv, train.dv, test.iv)
               knn.autok(train.iv, test.iv, train.dv))
        mean(pred == y)}))}))

colnames(m) = sapply(discrete.vlist, function(x) x[[1]])
m
  tobacco gambling floss cc.latefees cc.subpay
  0.757 0.829 0.359 0.746 0.604
itf.patience 0.740 0.812 0.271 0.709 0.552
itb.disc 0.768 0.790 0.238 0.709 0.507
itm.disc 0.707 0.790 0.343 0.694 0.515

The cross-validatory estimates of test-set accuracy are, with one small exception, lower than the base rate of the modal class. So much for that.

How about random forests?

m = rd(sapply(discrete.vlist, function(x)
   {var = x[1]
    remove.nas = length(x) >= 3
    sbf = if (remove.nas) sb2[!is.na(sb2[[var]]),] else sb2
    y = factor(sbf[[var]])
    if (is.ordered(y))
        class(y) = "factor"
    c(max(prop.table(table(y))), sapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {near = sbf[[paste0(s, ".near.r1")]]
        far = sbf[[paste0(s, ".far.r1")]]
        rf = randomForest(x = cbind(near, far), y = y,
            ntree = 2000,
            mtry = 2) # Since there are only two predictors.
        1 - tail(rf$err.rate, 1)[1]}))}))

colnames(m) = sapply(discrete.vlist, function(x) x[[1]])
m
  tobacco gambling floss cc.latefees cc.subpay
  0.757 0.829 0.359 0.746 0.604
itf.patience 0.713 0.785 0.376 0.642 0.545
itb.disc 0.773 0.757 0.298 0.672 0.470
itm.disc 0.685 0.724 0.414 0.657 0.500

This time we have only two cases for which estimated test-set accuracy exceeds the base rate of the modal class. One of these, itm used to predict flossing, is large enough to be meaningful (41% versus 36%) but still quite small.

How about SVMs?

m = rd(sapply(discrete.vlist, function(x)
   {var = x[1]
    remove.nas = length(x) >= 3
    sbf = if (remove.nas) sb2[!is.na(sb2[[var]]),] else sb2
    y = sbf[[var]]
    if (is.ordered(y))
        class(y) = "factor"
    c(max(prop.table(table(y))), sapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {near = sbf[[paste0(s, ".near.r1")]]
        far = sbf[[paste0(s, ".far.r1")]]
        # Since 'near' and 'far' are two slight variations of the
        # same test, we won't rescale them.
        pred = crossvalid(cbind(near, far), y, nfold = 10,
            f = function(train.iv, train.dv, test.iv)
               {model = best.svm(x = train.iv, y = train.dv,
                    scale = F, type = "C-classification",
                    gamma = 10^(-6:-6),
                    cost = 10^(-6:6))
                char(predict(model, test.iv))})
        mean(pred == char(y))}))}))

colnames(m) = sapply(discrete.vlist, function(x) x[[1]])
m
  tobacco gambling floss cc.latefees cc.subpay
  0.757 0.829 0.359 0.746 0.604
itf.patience 0.757 0.829 0.271 0.746 0.604
itb.disc 0.757 0.829 0.287 0.746 0.604
itm.disc 0.757 0.829 0.265 0.746 0.604

At their best, the SVMs are constantly predicting the modal class.

Using a scoring rule

mean.nll2 = function(pv, outcomes)
    mean(-log( ifelse(outcomes, pv, 1 - pv), 2 ))

f = function(vname, modeltype, na.rm = F)
   {sbf = if (na.rm) sb2[!is.na(sb2[[vname]]),] else sb2
    y = sbf[[vname]]
    if (is.logical(y))
        y = num(y)
    round(digits = 2, c(mean.nll2(mean(y), y),
        sapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {d = sbf[paste0(s, c(".near.r1", ".far.r1"))]
        names(d) = qw(near, far)
        if (modeltype == "binary")
           {pred = crossvalid(d, y, nfold = 10, f = function(train.iv, train.dv, test.iv)
               {model = glm(y ~ near * far, cbind(train.iv, y = train.dv),
                    family = binomial(link = "logit"))
                predict(model, type = "response", newdata = test.iv)})
            mean.nll2(pred, y)}})))}
m = data.frame(check.names = F,
    "Tobacco use" = f("tobacco", "binary"),
    Gambling = f("gambling", "binary"),
    "CC late fees" = f("cc.latefees", "binary", na.rm = T),
    "CC subpayment" = f("cc.subpay", "binary", na.rm = T))
m = rbind(m[1,], NA, m[-1,])
m
  Tobacco use Gambling CC late fees CC subpayment
  0.80 0.66 0.82 0.97
2        
itf.patience 0.81 0.68 0.82 0.99
itb.disc 0.81 0.65 0.84 1.02
itm.disc 0.82 0.68 0.89 0.98

No improvement.

Continuous criterion variables

Kernel regression:

m = rd(sapply(continuous.vlist, function(x)
   {var = x[1]
    remove.nas = length(x) >= 3
    sbf = if (remove.nas) sb2[!is.na(sb2[[var]]),] else sb2
    y = sbf[[var]]
    c(sd(y), sapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {iv = sbf[paste0(s, qw(.near.r1, .far.r1))]
        names(iv) = qw(near, far)
        pred = crossvalid(iv, y, nfold = 10,
            f = function(train.iv, train.dv, test.iv)
              {d = data.frame(train.iv, y = train.dv)
               predict(
                   npreg(y ~ near + far, data = d,
                       regtype = "ll", ckertype = "epanechnikov"),
                   newdata = test.iv)})
        sqrt(mean((pred - y)^2))}))}))

colnames(m) = sapply(continuous.vlist, function(x) x[[1]])
m
  bmi exercise healthy.meals savings
  0.181 0.793 2.701 1.840
itf.patience 0.184 0.820 2.733 1.939
itb.disc 0.183 0.772 2.712 1.853
itm.disc 0.190 0.832 2.661 1.938

It looks like this kernel-regression technique did even worse than linear regression.

Let's try random forests again:

m = rd(sapply(continuous.vlist, function(x)
   {var = x[1]
    remove.nas = length(x) >= 3
    sbf = if (remove.nas) sb2[!is.na(sb2[[var]]),] else sb2
    y = sbf[[var]]
    c(sd(y), sapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {near = sbf[[paste0(s, ".near.r1")]]
        far = sbf[[paste0(s, ".far.r1")]]
        rf = randomForest(x = cbind(near, far), y = y,
            ntree = 2000,
            mtry = 2) # Since there are only two predictors.
        sqrt(mean((rf$predicted - y)^2))}))}))

colnames(m) = sapply(continuous.vlist, function(x) x[[1]])
m
  bmi exercise healthy.meals savings
  0.181 0.793 2.701 1.840
itf.patience 0.201 0.882 3.058 2.028
itb.disc 0.208 0.815 2.965 2.040
itm.disc 0.206 0.804 2.874 2.109

Shoot, that was really bad.

Association and significance testing

Correlation tests and t-tests

For the continuous criterion variables (and also flossing), we do Pearson correlation tests.

m = do.call(rbind, lapply(c(continuous.vlist, list("floss")), function(x)
   {dv = x[1]
    y = num(sb2[[dv]])
    cbind(dv, do.call(rbind, lapply(qw(itf.patience.near.r1, itf.patience.far.r1, itb.disc.near.r1, itb.disc.far.r1, itm.disc.near.r1, itm.disc.far.r1), function(iv)
       {ct = cor.test(sb2[,iv], y, method = "pearson")
        data.frame(iv,
            r = round(ct$estimate, 2),
            p = round(ct$p.value, 3),
            s = (if (ct$p.value < .05) "*" else ""))})))}))
rownames(m) = c()
m
  dv iv r p s
1 bmi itf.patience.near.r1 -0.01 0.885  
2 bmi itf.patience.far.r1 0.04 0.567  
3 bmi itb.disc.near.r1 0.03 0.686  
4 bmi itb.disc.far.r1 -0.02 0.778  
5 bmi itm.disc.near.r1 -0.02 0.740  
6 bmi itm.disc.far.r1 -0.01 0.887  
7 exercise itf.patience.near.r1 -0.10 0.191  
8 exercise itf.patience.far.r1 -0.11 0.126  
9 exercise itb.disc.near.r1 -0.25 0.001 *
10 exercise itb.disc.far.r1 -0.10 0.199  
11 exercise itm.disc.near.r1 -0.16 0.030 *
12 exercise itm.disc.far.r1 -0.19 0.012 *
13 healthy.meals itf.patience.near.r1 0.00 0.974  
14 healthy.meals itf.patience.far.r1 -0.03 0.690  
15 healthy.meals itb.disc.near.r1 0.00 0.990  
16 healthy.meals itb.disc.far.r1 0.11 0.127  
17 healthy.meals itm.disc.near.r1 0.13 0.077  
18 healthy.meals itm.disc.far.r1 0.12 0.112  
19 savings itf.patience.near.r1 0.04 0.567  
20 savings itf.patience.far.r1 0.06 0.432  
21 savings itb.disc.near.r1 0.09 0.216  
22 savings itb.disc.far.r1 0.16 0.037 *
23 savings itm.disc.near.r1 0.10 0.171  
24 savings itm.disc.far.r1 0.07 0.321  
25 floss itf.patience.near.r1 0.08 0.313  
26 floss itf.patience.far.r1 0.07 0.377  
27 floss itb.disc.near.r1 0.07 0.319  
28 floss itb.disc.far.r1 0.06 0.412  
29 floss itm.disc.near.r1 0.09 0.251  
30 floss itm.disc.far.r1 0.11 0.159  

For the dichotomous criterion variables, we do t-tests, treating the criterion variables as independent (grouping) variables and time preferences as dependent variables.

m = do.call(rbind, lapply(discrete.vlist, function(x)
   {dv = x[1]
    modeltype = x[2]
    if (modeltype == "ordinal")
        return(data.frame())
    na.rm = length(x) >= 3
    sbf = if (na.rm) sb2[!is.na(sb2[[dv]]),] else sb2
    y = num(sbf[[dv]])
    cbind(iv = dv, do.call(rbind, lapply(qw(itf.patience.near.r1, itf.patience.far.r1, itb.disc.near.r1, itb.disc.far.r1, itm.disc.near.r1, itm.disc.far.r1), function(iv)
       {tt = t.test(var.eq = T, sb2[y == 0, iv], sb2[y == 1, iv])
        data.frame(dv = iv,
            p = round(tt$p.value, 3),
            s = (if (tt$p.value < .05) "*" else ""))})))}))
rownames(m) = c()
m
  iv dv p s
1 tobacco itf.patience.near.r1 0.079  
2 tobacco itf.patience.far.r1 0.177  
3 tobacco itb.disc.near.r1 0.023 *
4 tobacco itb.disc.far.r1 0.207  
5 tobacco itm.disc.near.r1 0.526  
6 tobacco itm.disc.far.r1 0.503  
7 gambling itf.patience.near.r1 0.065  
8 gambling itf.patience.far.r1 0.177  
9 gambling itb.disc.near.r1 0.004 *
10 gambling itb.disc.far.r1 0.069  
11 gambling itm.disc.near.r1 0.007 *
12 gambling itm.disc.far.r1 0.054  
13 cc.latefees itf.patience.near.r1 0.445  
14 cc.latefees itf.patience.far.r1 0.550  
15 cc.latefees itb.disc.near.r1 0.470  
16 cc.latefees itb.disc.far.r1 0.981  
17 cc.latefees itm.disc.near.r1 0.324  
18 cc.latefees itm.disc.far.r1 0.208  
19 cc.subpay itf.patience.near.r1 0.620  
20 cc.subpay itf.patience.far.r1 0.083  
21 cc.subpay itb.disc.near.r1 0.352  
22 cc.subpay itb.disc.far.r1 0.124  
23 cc.subpay itm.disc.near.r1 0.162  
24 cc.subpay itm.disc.far.r1 0.059  

Regression

Now we do significance tests with the models we used for the predictive analyses.

m = do.call(rbind, lapply(c(continuous.vlist, list("floss")), function(x)
   {dv = x[1]
    y = num(sb2[[dv]])
    cbind(dv, do.call(rbind, lapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {d = data.frame(
           near = scale(sb2[[paste0(s, ".near.r1")]]),
           far = scale(sb2[[paste0(s, ".far.r1")]]))
        cm = coef(summary(lm(y ~ near * far, data = d)))[-1,]
#        cat(dv, s, round(d = 3, summary(lm(y ~ near * far, data = d))$r.squared), "\n")
        data.frame(
            ivs = (if (s == "itf.patience") "fixed"
               else if (s == "itb.disc") "bisection"
               else "matching"),
            coef = rownames(cm),
            p = round(d = 3, cm[,"Pr(>|t|)"]),
            s = ifelse(cm[,"Pr(>|t|)"] < .05, "*", ""))})))}))
rownames(m) = c()
m
  dv ivs coef p s
1 bmi fixed near 0.447  
2 bmi fixed far 0.292  
3 bmi fixed near:far 0.861  
4 bmi bisection near 0.379  
5 bmi bisection far 0.408  
6 bmi bisection near:far 0.956  
7 bmi matching near 0.687  
8 bmi matching far 0.784  
9 bmi matching near:far 0.972  
10 exercise fixed near 0.443  
11 exercise fixed far 0.357  
12 exercise fixed near:far 0.131  
13 exercise bisection near 0.000 *
14 exercise bisection far 0.102  
15 exercise bisection near:far 0.241  
16 exercise matching near 0.997  
17 exercise matching far 0.201  
18 exercise matching near:far 0.893  
19 healthy.meals fixed near 0.188  
20 healthy.meals fixed far 0.672  
21 healthy.meals fixed near:far 0.012 *
22 healthy.meals bisection near 0.067  
23 healthy.meals bisection far 0.031 *
24 healthy.meals bisection near:far 0.029 *
25 healthy.meals matching near 0.712  
26 healthy.meals matching far 0.546  
27 healthy.meals matching near:far 0.045 *
28 savings fixed near 0.939  
29 savings fixed far 0.580  
30 savings fixed near:far 0.766  
31 savings bisection near 0.681  
32 savings bisection far 0.089  
33 savings bisection near:far 0.796  
34 savings matching near 0.397  
35 savings matching far 0.851  
36 savings matching near:far 0.530  
37 floss fixed near 0.396  
38 floss fixed far 0.806  
39 floss fixed near:far 0.316  
40 floss bisection near 0.571  
41 floss bisection far 0.892  
42 floss bisection near:far 0.889  
43 floss matching near 0.867  
44 floss matching far 0.386  
45 floss matching near:far 0.756  
m = do.call(rbind, lapply(c(discrete.vlist), function(x)
   {dv = x[1]
    modeltype = x[2]
    if (modeltype == "ordinal")
        return(data.frame())
    na.rm = length(x) >= 3
    sbf = if (na.rm) sb2[!is.na(sb2[[dv]]),] else sb2
    y = num(sbf[[dv]])
    cbind(dv, do.call(rbind, lapply(qw(itf.patience, itb.disc, itm.disc), function(s)
       {d = data.frame(
           near = scale(sbf[[paste0(s, ".near.r1")]]),
           far = scale(sbf[[paste0(s, ".far.r1")]]))
        cm = coef(summary(glm(y ~ near * far, data = d, family = binomial)))[-1,]
        predps = predict(
            glm(y ~ near * far, data = d, family = binomial(link = "probit")),
            type = "response")
        cat(dv, s, round(d = 3, 1 -
            sum((y - predps)^2) / sum((y - mean(y))^2)), "\n")
        data.frame(
            ivs = (if (s == "itf.patience") "fixed"
               else if (s == "itb.disc") "bisection"
               else "matching"),
            coef = rownames(cm),
            p = round(d = 3, cm[,"Pr(>|z|)"]),
            s = ifelse(cm[,"Pr(>|z|)"] < .05, "*", ""))})))}))
rownames(m) = c()
m
  dv ivs coef p s
1 tobacco fixed near 0.211  
2 tobacco fixed far 0.934  
3 tobacco fixed near:far 0.680  
4 tobacco bisection near 0.063  
5 tobacco bisection far 0.607  
6 tobacco bisection near:far 0.699  
7 tobacco matching near 0.856  
8 tobacco matching far 0.852  
9 tobacco matching near:far 0.763  
10 gambling fixed near 0.112  
11 gambling fixed far 0.986  
12 gambling fixed near:far 0.264  
13 gambling bisection near 0.029 *
14 gambling bisection far 0.802  
15 gambling bisection near:far 0.201  
16 gambling matching near 0.065  
17 gambling matching far 0.662  
18 gambling matching near:far 0.908  
19 cc.latefees fixed near 0.346  
20 cc.latefees fixed far 0.751  
21 cc.latefees fixed near:far 0.186  
22 cc.latefees bisection near 0.137  
23 cc.latefees bisection far 0.767  
24 cc.latefees bisection near:far 0.220  
25 cc.latefees matching near 0.967  
26 cc.latefees matching far 0.680  
27 cc.latefees matching near:far 0.269  
28 cc.subpay fixed near 0.727  
29 cc.subpay fixed far 0.294  
30 cc.subpay fixed near:far 0.931  
31 cc.subpay bisection near 0.649  
32 cc.subpay bisection far 0.674  
33 cc.subpay bisection near:far 0.474  
34 cc.subpay matching near 0.347  
35 cc.subpay matching far 0.200  
36 cc.subpay matching near:far 0.129  

Session 2

Planning

Since 30 days is the front-end delay, it makes a natural retest interval, so let's try to start rerunning subjects 30 days, on average, after we first ran them.

I'm thinking of admitting 10 subjects in each of 5 waves per day (at 9 am, noon, 3 pm, 6 pm, and 9 pm). Everyone will then be admitted over the course of four days.

do.call(c, lapply(list(1:50, 51:100, 101:150, 151:200), function(i)
    median(as.Date(ss(sb, tv %in% 5:8)$began_t)[i] + 30)))
  x
1 2014-03-18
2 2014-03-20
3 2014-03-21
4 2014-03-22

Let's admit everyone on weekdays to reduce the chance of a weekend rush. That makes the appropriate days Tuesday, March 18 through Friday, March 21. We can make the HIT end on Monday evening (the 24th).

For implementation, let's use a fresh database and match up subject numbers (using worker IDs) post hoc. Let us also try to make the keys in the User table not overlap with any of those from last time so the User tables from the two databases can be easily merged once subject numbers have been corrected.
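
As a sketch of the post-hoc matching (the data-frame and column names here are hypothetical, just to illustrate the join):

matched = merge(
    users.s1[, c("workerid", "sn")],  # session 1: worker IDs and subject numbers
    users.s2,                         # session 2's User table (fresh database)
    by = "workerid")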

The HIT will require a Qualification that we'll grant to workers at the same time we notify them that the new HIT exists.

Score-for-score reliability

How well can we predict scores in rounds 2 and 3 using exactly the scores from round 1, with no model or other fanciness?

score.for.score.pred = function(dv, ses)
    `colnames<-`(value = qw(F.near, F.far, B.near, B.far, M.near, M.far), round(d = 2,
        do.call(cbind, lapply(list(itf.scores, itb.discs, itm.discs), function(d)
            t(aaply(acast(ss(d, s %in% ses), s ~ type ~ round), c(2), function(a) c(
#                SD = sd(a[,dv]),
#                RMSE = sqrt(mean((a[,dv] - a[,"round1"])^2)),
                PVAF = 1 - sum((a[,dv] - a[,"round1"])^2)/sum((a[,dv] - mean(a[,dv]))^2),
                "Abs err, median" = median(abs(a[,dv] - a[,"round1"])),
                "Abs err, .9 q" = num(quantile(abs(a[,dv] - a[,"round1"]), .9)),
                "Abs err, .95 q" = num(quantile(abs(a[,dv] - a[,"round1"]), .95)),
                Bias = mean(a[,"round1"] - a[,dv]),
                Kendall = cor(a[,dv], a[,"round1"], method = "kendall"),
                Pearson = cor(a[,dv], a[,"round1"]))))))))
score.for.score.pred("round2", good.s)
  F.near F.far B.near B.far M.near M.far
PVAF 0.70 0.49 0.53 0.64 0.78 0.82
Abs err, median 2.00 2.00 0.04 0.05 0.03 0.03
Abs err, .9 q 5.00 7.00 0.23 0.18 0.14 0.14
Abs err, .95 q 5.00 9.00 0.28 0.24 0.19 0.18
Bias 0.10 0.09 0.00 -0.01 -0.01 -0.01
Kendall 0.71 0.61 0.65 0.64 0.75 0.77
Pearson 0.85 0.72 0.77 0.81 0.89 0.91
score.for.score.pred("round3", good.s.s2)
  F.near F.far B.near B.far M.near M.far
PVAF 0.44 0.57 0.44 0.64 0.57 0.56
Abs err, median 2.00 2.00 0.06 0.04 0.06 0.05
Abs err, .9 q 5.00 5.00 0.16 0.19 0.19 0.21
Abs err, .95 q 7.00 6.00 0.21 0.23 0.28 0.26
Bias -0.37 -0.29 0.01 0.01 -0.02 -0.02
Kendall 0.66 0.65 0.66 0.67 0.65 0.62
Pearson 0.77 0.80 0.72 0.82 0.81 0.79

Judging from the PVAFs and considering that matching is the quickest and easiest test, perhaps matching is best. Matching also has the advantage that it is intuitive how lengthening the test should improve reliability.

Let's use a similar strategy to investigate nonstationarity by examining how well round-1 near scores predict round-2 far scores. (We use the DV from round 2 rather than 1 for better comparability with the first analysis.)

`colnames<-`(value = qw(F, B, M), round(d = 2,
    do.call(cbind, lapply(list(itf.scores, itb.discs, itm.discs), function(d)
       {iv = ss(d, s %in% good.s.s2 & round == "round1" & type == "near")[,ncol(d)]
        dv = ss(d, s %in% good.s.s2 & round == "round2" & type == "far")[,ncol(d)]
        c(
#            SD = sd(dv),
#            RMSE = sqrt(mean((dv - iv)^2)),
            PVAF = 1 - sum((dv - iv)^2)/sum((dv - mean(dv))^2),
            "Abs err, median" = median(abs(dv - iv)),
            "Abs err, .9 q" = num(quantile(abs(dv - iv), .9)),
            "Abs err, .95 q" = num(quantile(abs(dv - iv), .95)),
            Bias = mean(iv - dv),
            Kendall = cor(dv, iv, method = "kendall"))}))))
  F B M
PVAF 0.49 0.54 0.70
Abs err, median 2.00 0.04 0.04
Abs err, .9 q 5.00 0.22 0.15
Abs err, .95 q 7.80 0.27 0.21
Bias 0.35 -0.01 0.00
Kendall 0.66 0.66 0.76

It looks like near predicts far about as well as far predicts itself. That is, the predictive validity of near for far is about as good as the reliability of far. This seems to imply it is unnecessary to measure both. You might as well measure one of them twice.

The lack of bias here is also noteworthy.

Implications for prediction

How can we estimate what these reliabilities imply for prediction?

PVAFs around .6 imply a prediction-interval reduction factor of 1 − sqrt(1 − .6), about 1/3. I would judge that as reasonably, though not highly, useful for prediction. — No, wait, this calculation is fishy. The general notion of PVAF isn't the same thing as squared correlation.

By Equation (3.13) of de Gruijter and van der Kamp (2008) (p. 23), classical test theory implies that ρ·sqrt(r) = v, where ρ is the correlation between true scores of the predictor and criterion, r is the reliability coefficient of the predictor, v is predictive validity, and we are assuming the criterion is perfectly reliable. So a reliability of .8 multiplies the predictor–criterion correlation by sqrt(.8) ≈ .9. Not bad.
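
As a quick numeric check of that relation (the true-score correlation ρ = .5 here is just an arbitrary illustrative value):

rho = .5
c(r.6 = rho * sqrt(.6), r.8 = rho * sqrt(.8))
  # about .39 and .45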

However, correlations are hard to interpret, so let's think about prediction-interval reduction factors (PIRFs) instead. Some algebra (in Maxima)

expr: ev((1 - sqrt(1 - v^2)), v = rho * sqrt(r))$
eq: solve(((ulpirf = 1 - sqrt(1 - rho^2)) - 1)^2, rho^2)[1]$
ev(expr, eq);

yields the formula 1-sqrt(1-r*(2*ulpirf-ulpirf^2)) to express the predictor–criterion PIRF in terms of the underlying PIRF ulpirf and the reliability coefficient r.

pirf = function(r, ulpirf) 1-sqrt(1-r*(2*ulpirf-ulpirf^2))
round(d = 2, as.matrix(transform(
    data.frame(ulpirf = c(.15, 1/4, 1/2, 3/4, .9, .99)),
    "r.6" = pirf(.6, ulpirf),
    "r.8" = pirf(.8, ulpirf))))
  ulpirf r.6 r.8
1 0.15 0.09 0.12
2 0.25 0.14 0.19
3 0.50 0.26 0.37
4 0.75 0.34 0.50
5 0.90 0.36 0.54
6 0.99 0.37 0.55

We see that although rs of .6 and .8 severely limit the PIRF, they are enough to let the PIRF reach 1/4 and 1/3, respectively, provided ulpirf is at least 1/2. So we seem to have enough reliability to do meaningful prediction when the underlying relationship is strong, but probably not when it's moderate and certainly not when it's weak.

But see http://arfer.net/w/reliab-predacc.

Session-1 reliabilities as affected by attrition

For reviewer 2 of PaID submission 1.

Full sample:

score.for.score.pred.paper("round2", good.s.5to8)
  F.near F.far B.near B.far M.near M.far
PVAF 0.70 0.49 0.51 0.62 0.75 0.82
Abs err, median 2.00 2.00 0.04 0.05 0.03 0.03
Abs err, 90th %ile 5.00 7.00 0.23 0.18 0.14 0.13
Abs err, 95th %ile 5.00 9.00 0.28 0.24 0.20 0.18
Bias 0.10 0.09 0.00 -0.01 -0.01 -0.01
Kendall τ 0.71 0.61 0.65 0.63 0.74 0.78
Pearson r 0.85 0.72 0.76 0.81 0.88 0.91

Returners only:

score.for.score.pred.paper("round2",
   row.names(ss(sb, good & tv %in% 5:8 & !is.na(duration.s2))))
  F.near F.far B.near B.far M.near M.far
PVAF 0.72 0.67 0.63 0.71 0.82 0.89
Abs err, median 2.00 2.00 0.04 0.04 0.03 0.03
Abs err, 90th %ile 4.83 4.83 0.22 0.16 0.12 0.10
Abs err, 95th %ile 5.00 7.50 0.27 0.21 0.18 0.15
Bias 0.07 -0.19 0.01 0.00 0.00 -0.01
Kendall τ 0.74 0.73 0.67 0.68 0.77 0.82
Pearson r 0.86 0.82 0.81 0.85 0.91 0.95

Non-returners only:

score.for.score.pred.paper("round2",
   row.names(ss(sb, good & tv %in% 5:8 & is.na(duration.s2))))
  F.near F.far B.near B.far M.near M.far
PVAF 0.63 0.23 0.33 0.49 0.63 0.71
Abs err, median 2.00 2.00 0.06 0.06 0.03 0.03
Abs err, 90th %ile 4.87 7.00 0.23 0.21 0.15 0.17
Abs err, 95th %ile 5.00 8.60 0.27 0.25 0.20 0.21
Bias 0.13 0.40 0.00 -0.02 -0.02 -0.02
Kendall τ 0.65 0.46 0.59 0.54 0.68 0.70
Pearson r 0.80 0.56 0.67 0.74 0.85 0.86

To my surprise, there's a clear difference: reliability was somewhat higher among the returners than the non-returners.

What does this mean for the validity results? Well, it means the comparison we invoke between the one-month retest reliabilities and predictive validities is a bit unfair, because the samples are different (particularly, the sample for which we have reliabilities is likely more reliable than the whole sample would be). So, let's examine the predictive validity among this more reliable subsample.

Continuous, original:

prederror.continuous(sb2)
  BMI Exercise Healthy meals Savings
Standard deviation 5.00 6.88 0.335 0.170
Prediction error        
      Fixed 5.07 7.17 0.350 0.188
      Bisection 5.10 6.79 0.349 0.186
      Matching 5.12 7.20 0.355 0.187

Continuous, returners only:

prederror.continuous(ss(sb2, !is.na(duration.s2)))
  BMI Exercise Healthy meals Savings
Standard deviation 4.93 6.80 0.333 0.163
Prediction error        
      Fixed 5.09 7.17 0.350 0.181
      Bisection 5.11 7.01 0.352 0.179
      Matching 5.15 7.38 0.377 0.181

Discrete, original:

prederror.discrete(sb2)
  Tobacco use Gambling Flossing CC late fees CC subpayment
Base rate 0.76 0.83 0.36 0.75 0.60
Proportion correct          
      Fixed 0.76 0.82 0.31 0.72 0.59
      Bisection 0.76 0.82 0.31 0.75 0.58
      Matching 0.76 0.82 0.30 0.75 0.59

Discrete, returners only:

prederror.discrete(ss(sb2, !is.na(duration.s2)))
  Tobacco use Gambling Flossing CC late fees CC subpayment
Base rate 0.76 0.84 0.39 0.80 0.59
Proportion correct          
      Fixed 0.73 0.82 0.27 0.77 0.55
      Bisection 0.75 0.82 0.30 0.79 0.58
      Matching 0.75 0.84 0.36 0.79 0.63

Paper thinking

Questions to be answered:

  • Q: How well do the tests predict the criterion variables?
    • A: Not at all. Even the observed associations are small. These findings are consistent with the not-so-great correlations observed in past studies, but the use of cross-validation and the focus on prediction (as opposed to association) are new, as is the inclusion of nonstationarity.
  • Q: How reliable are the tests, both immediately and over 30 days?
    • A: Moderate, with Kendalls around .65. Pearson correlations for the 30-day interval are comparable to those for other personality tests.
    • Q: What does this imply for the use of these tests to predict other criterion variables?
      • A: "we seem to have enough reliability to do meaningful prediction when the underlying relationship is strong, but probably not when it's moderate and certainly not when it's weak"
  • Q: What nonstationarity is seen?
    • A: Indistinguishable from measurement error. There wasn't even a definite trend for nonstationarity in one direction or the other.
  • Q: In summary, what can be said about the measurement of time preferences?
    • A: Abstract intertemporal-choice tasks are unlikely to be able to predict domain-specific real-world variables, at least on their own. Fortunately, the matching test makes it easy to fit the measurement of time preferences into a larger battery in the hope that it can make the difference. Its reliability is probably good enough that unreliability won't ruin its utility. However, the theoretical appeal of measuring nonstationarity in addition to patience doesn't seem to translate to psychometric benefits.

NLSY analysis

Variables

Misc.

caseid.1979
id# (1-12686) 79
q1.3.a.m.1981
date of birth - month 81
q1.3.a.y.1981
date of birth - yr 81
sample.race.1979
racl/ethnic cohort /scrnr 79
sample.sex.1979
sex of r 79
ageatint.2002
age at interview date 2002
health.height.1985
height of r 85
debt.1a.ref.1.2008
r refuse previous inc-ast amounts? 2008
debt.1a.ref.2.2008
r refuse previous debt estimates? 2008
debt.2a.ref.1.2008
r refuse previous inc-ast amounts? 2008
debt.2a.ref.2.2008
r refuse previous debt estimates? 2008
debt.2d.ref.1.2008
r refuse previous inc-ast amounts? 2008
debt.2d.ref.2.2008
r refuse previous debt estimates? 2008
debt.3a.ref.2.2008
r refuse previous debt estimates? 2008
debt.3a.ref.1.2008
r refuse previous inc-ast amounts? 2008

Time preferences

nrow(nlsy)
  value
  7127
impatience.1.2006
amt r would wait 1 month to receive $1k 2006
impatience.2.2006
amt r would wait 1 year to receive $1k 2006

Note that we're ignoring impatience.[1, 2].sr00000[1, 2].2006, which had the subject try to provide a rough answer when they didn't provide an exact answer to the core impatience questions. There are relatively few such subjects, and more to the point, it is of course unclear how to code them alongside the subjects who gave exact answers.

rd(c(
    month = mean(nlsy$impatience.1.2006 > 5000),
    year = mean(nlsy$impatience.2.2006 > 5000)))
  value
month 0.010
year 0.086
ggplot(ss(nlsy, impatience.1.2006 < 5000)) +
   geom_densitybw(aes(log(impatience.1.2006)))

impatience1.png

Coding is a subtle issue: we might be able to exploit the fact that many subjects chose a few round answers, rather than treating impatience as having a monotonic effect. A random forest (or rather, bagged decision trees) can do this. Whether it makes sense to recode impatience so that (e.g.) elastic-net regression can exploit this depends on the sample size. Let's see what kinds of sample sizes we're going to get with the many DVs first.
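
For concreteness, a recoding along these lines might look like the following (a minimal sketch; the helper name and the particular round values are hypothetical):

recode.impatience = function(v, round.values = c(0, 100, 500, 1000))
   # Dummy-code a few common round answers and keep a log-scaled copy of
   # the raw amount, so a linear model (e.g., the elastic net) can pick
   # up both kinds of structure.
   {m = sapply(round.values, function(r) as.integer(v == r))
    colnames(m) = paste0("is.", round.values)
    data.frame(m, log.amount = log1p(v))}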

Fitness

h50fl.2.000001.xrnd
how difficult to run a mile xrnd
h50fl.2.000002.xrnd
how difficult to walk several blocks xrnd
h50fl.2.000003.xrnd
how difficult to walk 1 block xrnd
h50fl.2.000004.xrnd
how difficult to sit 2 hours xrnd
h50fl.2.000005.xrnd
how difficult to get up after sitting xrnd
h50fl.2.000006.xrnd
how difficult to climb several flights xrnd
h50fl.2.000007.xrnd
how difficult to climb 1 flight xrnd
h50fl.2.000008.xrnd
how difficult to lift/carry 10 pounds xrnd
h50fl.2.000009.xrnd
how difficult to stoop kneel crouch xrnd
h50fl.2.000010.xrnd
how difficult to pick up dime xrnd
h50fl.2.000011.xrnd
how difficult to raise arms ovr shoulder xrnd
h50fl.2.000012.xrnd
how difficult to pull/push large ojbects xrnd
levels(nlsy$h50fl.2.000012.xrnd)
  x
1 Not at all difficult for you
2 A little difficult
3 Somewhat difficult
4 Very difficult/can't do
5 Don't do
d = rd(d = 2, data.frame(t(sapply(sprintf("h50fl.2.%06d.xrnd", 1:12), function(v) c(
    n = sum(!is.na(nlsy[,v])),
    prop.table(table(nlsy[,v])))))))
colnames(d) = substr(colnames(d), 1, 5)
d
  n Not.a A.lit Somew Very. Don.t
h50fl.2.000001.xrnd 4932 0.21 0.14 0.17 0.24 0.24
h50fl.2.000002.xrnd 5003 0.76 0.08 0.07 0.06 0.03
h50fl.2.000003.xrnd 5007 0.88 0.04 0.04 0.03 0.01
h50fl.2.000004.xrnd 5007 0.75 0.08 0.09 0.06 0.01
h50fl.2.000005.xrnd 5006 0.66 0.15 0.13 0.05 0.01
h50fl.2.000006.xrnd 4995 0.60 0.13 0.13 0.11 0.03
h50fl.2.000007.xrnd 5004 0.83 0.06 0.05 0.04 0.01
h50fl.2.000008.xrnd 5000 0.80 0.07 0.07 0.05 0.02
h50fl.2.000009.xrnd 5003 0.64 0.13 0.13 0.09 0.02
h50fl.2.000010.xrnd 5005 0.95 0.02 0.02 0.01 0.00
h50fl.2.000011.xrnd 5006 0.88 0.05 0.05 0.02 0.00
h50fl.2.000012.xrnd 5000 0.78 0.06 0.08 0.06 0.02

On the basis of the table and the item text, let's use:

  • h50fl.2.000001.xrnd, the difficulty of running a mile, dichotomizing as very hard / can't / don't vs. other options.
  • h50fl.2.000006.xrnd, the difficulty of climbing several flights of stairs, dichotomizing as not at all vs. other options.
  • BMI, computed from the following height and weight items:

q11.10.a.2006
r height in feet 2006
q11.10.b.2006
r height in inches 2006
q11.9.2006
how much does r weigh 2006
nrow(na.omit(nlsy[,qw(q11.10.a.2006, q11.10.b.2006, q11.9.2006)]))
  value
  6923
bmi = 703.06958 * with(nlsy, q11.9.2006 / (q11.10.a.2006 * 12 + q11.10.b.2006)^2)
bmi = bmi[!is.na(bmi)]
ggplot(data.frame(bmi)) +
  geom_densitybw(aes(log(bmi)))

nlsy-bmi-density.png

Some of those BMIs are quite extreme, but it's hard to guess which are errors. So let's not remove any data, but evaluate predictions with absolute error rather than squared error.
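
A minimal sketch of that evaluation (pred and obs are hypothetical vectors of predicted and observed BMIs):

abs.err.summary = function(pred, obs)
    c(median = median(abs(pred - obs)), mean = mean(abs(pred - obs)))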

q11.genhlth.1a.1.2006
freq vigorous exercise 10 > min 2006
q11.genhlth.1a.2.2006
freq vigorous exercise 10 > min 2006
q11.genhlth.1c.1.2006
length vigorous activities 2006
q11.genhlth.1c.2.2006
length vigorous activities 2006
q11.genhlth.2a.1.2006
freq light mod exercise 10 > min 2006
q11.genhlth.2a.2.2006
freq light mod exercise 10 > min 2006
q11.genhlth.2c.1.2006
length light mod activities 10 min 2006
q11.genhlth.2c.2.2006
length light mod activities 10 min 2006
q11.genhlth.3a.1.2006
freq strngth exercise 10 > min 2006
q11.genhlth.3a.2.2006
freq strngth exercise 10 > min 2006
nlsy.exer.vigorous = exercise.freq(nlsy,
   "q11.genhlth.1a.1.2006", "q11.genhlth.1a.2.2006",
   "q11.genhlth.1c.1.2006", "q11.genhlth.1c.2.2006")

c(sum(!is.na(nlsy.exer.vigorous$min.per.year)),
    rd(mean(nlsy.exer.vigorous$min.per.year == 0, na.rm = T)))
  value
  6855 0.319
ggplot(ss(nlsy.exer.vigorous, min.per.year > 0)) +
  geom_densitybw(aes(log(min.per.year))) +
  geom_densitybw(aes(log(min.per.year)), adjust = 10, color = "blue")

nlsy-vigorous-exercise-min-per-year.png

nlsy.exer.light = exercise.freq(nlsy,
   "q11.genhlth.2a.1.2006", "q11.genhlth.2a.2.2006",
   "q11.genhlth.2c.1.2006", "q11.genhlth.2c.2.2006")

c(sum(!is.na(nlsy.exer.light$min.per.year)),
    rd(mean(nlsy.exer.light$min.per.year == 0, na.rm = T)))
  value
  6789 0.262
ggplot(ss(nlsy.exer.light, min.per.year > 0)) +
  geom_densitybw(aes(log(min.per.year))) +
  geom_densitybw(aes(log(min.per.year)), adjust = 10, color = "blue")

nlsy-light-exercise-min-per-year.png

Vigorous and light exercise are probably best predicted with double models that first classify cases as zero or nonzero, and then, in the case of nonzero, estimate the exact number.
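
A minimal sketch of such a double model, with hypothetical argument names (nonzero is a logical vector, amount the minutes per year with NA where nonzero is FALSE, and preds a data frame of whatever predictors we end up using):

two.part.fit = function(nonzero, amount, preds)
    list(
        # Part 1: classify zero vs. nonzero.
        any = glm(nonzero ~ ., data = preds, family = binomial),
        # Part 2: estimate log minutes among the nonzero cases.
        amount = lm(log(amount) ~ ., data = preds, subset = nonzero))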

nlsy.exer.strength = exercise.freq(nlsy,
   "q11.genhlth.3a.1.2006", "q11.genhlth.3a.2.2006")

c(sum(!is.na(nlsy.exer.strength$freq.per.year)),
    rd(mean(nlsy.exer.strength$freq.per.year == 0, na.rm = T)))
  value
  7097 0.628

Since the frequency for strength exercise isn't far from 50%, let's just predict it dichotomously.

These items are derived:

q11.genhlth.1b.2006
r ever do vigorous activities? 2006
q11.genhlth.2b.2006
r ever do light mod activities? 2006

These are made obsolete for our purposes by the 2006 versions:

q11.9e.2000
freqncy light physical activities? 2000
q11.9f.2000
freq of vigors phys activ/sports? 2000
q11.genhlth.1a.000001.2002
freq vigorous exercise 10 > min 2002
q11.genhlth.1a.000002.2002
freq vigorous exercise 10 > min 2002
q11.genhlth.1c.000001.2002
length vigorous activities 10 min 2002
q11.genhlth.1c.000002.2002
length vigorous activities 10 min 2002
q11.genhlth.2a.000002.2002
freq light mod exercise 10 > min 2002
q11.genhlth.2a.000001.2002
freq light mod exercise 10 > min 2002
q11.genhlth.2c.000001.2002
length light mod activities 10 min 2002
q11.genhlth.2c.000002.2002
length light mod activities 10 min 2002
q11.genhlth.3a.000001.2002
freq strngth exercise 10 > min 2002
q11.genhlth.3a.000002.2002
freq strngth exercise 10 > min 2002

Healthy eating

q11.genhlth.7a.2006
read nutritional info on food 2006
rd(as.data.frame(prop.table(table(nlsy$q11.genhlth.7a.2006, useNA = "always"))))
  Var1 Freq
1 DON'T BUY FOOD 0.010
2 Always 0.301
3 Often 0.180
4 Sometimes 0.192
5 Rarely 0.095
6 Never 0.222
7   0.001

This suggests a trichotomous scheme with the levels "Always or often", "Sometimes", and "Rarely or never". But it would be nice for all my non-numeric DVs to be dichotomous, so let's dichotomize it as Always and Often versus others.

q11.genhlth.7c.1.2008
times ate fast food past 7 days 2008
q11.genhlth.7c.2.2008
time unit eating fast food past 7 days 2008
q11.genhlth.7f.1.2008
times having soft drk or soda past 7 days 2008
q11.genhlth.7f.2.2008
time unit having soft drink or soda past 7 days 2008
fastfood = with(nlsy, ifelse(q11.genhlth.7c.1.2008 == 0,
    0,
    q11.genhlth.7c.1.2008 * ifelse(q11.genhlth.7c.2.2008 == "Per day", 7, 1)))
fastfood = fastfood[!is.na(fastfood)]
fastfood = pmin(fastfood, 3*7)
  # If a subject entered a value greater than 3 times a day, every
  # day of the week, clip it back to that maximum (21).
table(fastfood)
  count
0 2425
1 1849
2 1109
3 631
4 264
5 208
6 39
7 215
8 13
9 7
10 14
12 1
14 24
15 1
20 3
21 20

Since this distribution is very skewed, let's dichotomize it as 0 vs. everything else.

soda = with(nlsy, ifelse(q11.genhlth.7f.1.2008 == 0,
    0,
    q11.genhlth.7f.1.2008 * ifelse(q11.genhlth.7f.2.2008 == "Per day", 7, 1)))
soda = soda[!is.na(soda)]
soda = pmin(soda, 3*7)
  # If a subject entered a value greater than 3 times a day, every
  # day of the week, clip it back to that maximum (21).
table(soda)
  count
0 2337
1 1783
2 1062
3 603
4 253
5 199
6 36
7 204
8 13
9 7
10 14
12 1
14 23
20 2
21 19

The same.

Sleep

h50slp.1.xrnd
how much sleep r gets at night on weekdays-hrs xrnd
h50slp.1b.xrnd
how much sleep r gets at night on weekdays-mnts xrnd
ggplot(nlsy) +
    geom_densitybw(aes(h50slp.1.xrnd + h50slp.1b.xrnd/60))

nlsy-sleep-weekdays.png

Looks pretty good as is. Let's just use absolute error.

h50slp.2.xrnd
how much sleep r gets at night on weekends-hrs xrnd
h50slp.2b.xrnd
how much sleep r gets at night on weekends-mnts xrnd
with(nlsy, head(rd(sort(
    h50slp.2.xrnd + h50slp.2b.xrnd/60, dec = T)), 20))
  value
  24 24 24 18.3 16 16 15 15 14 14 13 13 13 13 13 12 12 12 12 12

In the weekend data, there are a few rather extreme values. Let's clip them to the maximum of the weekday data, 16 hours.

ggplot(transform(nlsy, x =
    pmin(16, h50slp.2.xrnd + h50slp.2b.xrnd/60))) +
    geom_densitybw(aes(x))

nlsy-sleep-weekends.png

Same as before.

Medical conscientiousness

q11.79.2006
r have health/hosp plan 2006
rd(prop.table(table(useNA = "always", nlsy$q11.79.2006)))
  count
FALSE 0.190
TRUE 0.810
NA 0.001

Looks barely usable.

q11.genhlth.4c.m.000001.2008
med test in past 24 mo - flu shot 2008
q11.genhlth.4c.f.000001.2008
med test in past 24 mo - flu shot 2008
table(useNA = "always", with(nlsy, ifelse(nlsy$sample.sex.1979 == "FEMALE",
    q11.genhlth.4c.f.000001.2008,
    q11.genhlth.4c.m.000001.2008)))
  count
FALSE 4668
TRUE 2184
NA 275

Looks good.

q11.genhlth.4e.m.000001.2008
see/talk to dentist in past 24 months 2008
q11.genhlth.4e.f.000002.2008
see/talk to dentist in past 24 months 2008
table(useNA = "always", with(nlsy, ifelse(nlsy$sample.sex.1979 == "FEMALE",
    q11.genhlth.4e.f.000002.2008,
    q11.genhlth.4e.m.000001.2008)))
  count
FALSE 2273
TRUE 4583
NA 271

Looks good.

q11.genhlth.5a.2.2008
how many times r brush teeth in usual week 2008
table(nlsy$q11.genhlth.5a.2.2008, useNA = "always")
  count
0 15
1 12
2 30
3 24
4 26
5 33
6 10
7 1175
8 21
9 18
10 228
11 11
12 86
13 10
14 3643
15 80
16 48
17 22
18 42
19 5
20 140
21 687
22 1
23 1
24 8
25 22
26 3
27 1
28 75
30 27
35 13
36 1
40 2
42 6
50 3
60 1
63 1
70 10
71 2
73 1
74 1
77 5
100 1
NA 576

Not surprisingly, there are spikes at multiples of 7. I guess the reasonable thing to do is dichotomize according to whether or not subjects are meeting the usual recommendation of brushing twice a day:

rd(mean(nlsy$q11.genhlth.5a.2.2008 >= 2*7, na.rm = T))
  value
  0.741
q11.genhlth.5a.3.2008
how many times r floss in usual week 2008
table(nlsy$q11.genhlth.5a.3.2008, useNA = "always")
  count
0 1626
1 607
2 573
3 523
4 247
5 287
6 56
7 1742
8 12
9 4
10 83
11 3
12 14
13 2
14 527
15 20
16 5
17 2
18 6
19 1
20 29
21 135
25 4
27 1
28 13
30 4
35 8
40 3
41 1
42 1
50 1
60 1
100 3
NA 583
table(useNA = "always", cut(nlsy$q11.genhlth.5a.3.2008,
    c(0, 1, 7, Inf), right = F))
  count
[0,1) 1626
[1,7) 2293
[7,Inf) 2625
NA 583

We can use the same trichotomy as in the MTurk study. Or, for simplicity, dichotomize according to whether or not the rate attains the usual recommendation of once per day.

Colonoscopies are probably not such a great DV because there isn't a universal recommendation to get them.

q11.genhlth.4c.m.000008.2008
med test in past 24 mo - colonoscopy 2008
q11.genhlth.4c.f.000008.2008
med test in past 24 mo - colonoscopy 2008

Most subjects have teeth, of course.

q11.genhlth.5a.1.2008
r have any of own natural teeth? 2008

Drug use

Tobacco
cig.chk1.2008
r smoked 100 cigarettes 2008
cig.chk2.2008
r ever smoked cigarettes daily 2008
ds.2.2008
smokd at least 100 cigrts in life? 2008
ds.3.1998
r ever smoked daily? 1998
ds.3a.2008
r ever smoked cigarettes daily 2008
ds.4.1998
age when 1st started smoking daily? 1998
ds.5.2008
does r currently smoke daily? 2008
ds.7.2008
# cigarettes smoked per day 2008
q11.genhlth.4b.000003.2006
r asked about smoking at last exam 2006
q11.genhlth.4b.3.2002
asked about smoking at last exam 2002
smoked100 = with(nlsy, cig.chk1.2008 | ds.2.2008)
table(smoked100, useNA = "always")
  count
FALSE 2910
TRUE 3945
NA 272

So we can predict dichotomously whether a subject has smoked 100 cigarettes in their life by 2008.

smokeddaily = with(nlsy, cig.chk2.2008 | ds.3a.2008)
table(smoked100, smokeddaily, useNA = "always")
  FALSE TRUE NA
FALSE 0 0 2910
TRUE 207 3738 0
NA 2 2 268

Subjects who hadn't smoked 100 cigarettes weren't asked if they'd ever smoked daily, and among those who had smoked 100 cigarettes, the large majority had smoked daily. So let's ignore the ever-smoked-daily question in favor of ever-smoked-100.

It does seem to make sense to try to predict smoking behavior now, though:

table(useNA = "always", ifelse(!smoked100 | !smokeddaily,
   "NOT AT ALL",
    char(nlsy$ds.5.2008)))
  count
DAILY 1464
NOT AT ALL 4988
OCCASIONALLY 403
NA 272

Let's combine OCCASIONALLY and DAILY to get a dichotomous measure of smoking in 2008.

Alcohol
alch.1.1989
alchl-ever had a drnk 89
alch.2.1983
alchl-age start drnk 83
alch.2a.1983
alchl-age start drnk 1 time wk 83
delin.3.1980
ill act # drank alchl past yr <18 80
q12.3.2006
r consumed alcohol in last 30 days? 2006
q12.4.2006
freq of 6 or more drinks at once in last 30 days 2006
q12.5.2006
# of days r drank alcohol last 30 days 2006
q12.6.2006
# of drinks r has on average day 2006
q12.7n.1994
arrested, in police trouble 94 [from drinking]
q12.8.2006
anyone else present during alcohol use questions? 2006

We have some questions from the same year as the impatience questions (2006), so let's use those.

table(nlsy$q12.3.2006, useNA = "always")
  count
FALSE 3338
TRUE 3714
NA 75

This item, of whether the subject drank alcohol in the past 30 days, looks good.

with(nlsy, table(useNA = "always",
    factor(ifelse(q12.3.2006, char(q12.4.2006), "Never in the last 30 days"),
        levels = levels(q12.4.2006))))
  count
Never in the last 30 days 6051
Less often than once a week 480
1 or 2 times per week 335
3 or 4 times per week 99
5 or 6 times per week 46
Everyday 34
NA 82
rd(prop.table(with(nlsy, table(useNA = "always",
    ifelse(q12.3.2006,
      q12.4.2006 != "Never in the last 30 days",
      F)))))
  count
FALSE 0.849
TRUE 0.139
NA 0.012

This gets us a dichotomous measure of heavy drinking. The base rate is low, but we do, after all, have a large sample.

drinks.last.month = with(nlsy,
  ifelse(!q12.3.2006, 0,
  q12.5.2006 * q12.6.2006))
drinks.last.month = drinks.last.month[!is.na(drinks.last.month)]
rd(mean(drinks.last.month == 0))
  value
  0.475
ggplot(data.frame(d = drinks.last.month[drinks.last.month > 0])) +
    geom_bar(aes(log(d), ..ncount..), binwidth = .05) +
    geom_densitybw(aes(log(d)), fill = "blue", alpha = .5, adjust = 15)
#    scale_x_log10()

nlsy-drinks-last-month-bars.png

This is probably about the best we can do.

Cannabis
delin.12.1980
ill act smoked marj past yr 80
drug.3.1984
drug month use marj/hash 1st tm 84
ds.8.1998
tot occasion r use marijuana 1998
ds.9.1998
age 1st time used marijuana 1998
ds.10.1998
m-rcnt time used marijuana 1998
ds.11.1998
nmbr times pot past 30 days 1998
table(nlsy$ds.8.1998, useNA = "always")
  count
100 OR MORE TIMES 1039
50 TO 99 TIMES 450
11 TO 49 TIMES 681
6 TO 10 TIMES 481
3 TO 5 TIMES 508
1 OR 2 TIMES 953
NEVER 2550
NA 465

Dichotomizing as NEVER vs. not NEVER looks good.

with(nlsy, rd(prop.table(table(
    ifelse(ds.8.1998 == "NEVER", F,
    ds.10.1998 == "LESS THAN ONE MONTH (30 DAYS) AGO")))))
  count
FALSE 0.944
TRUE 0.056

This measures whether subjects have used cannabis in the past 30 days, but the base rate seems too low.

Nor can we really measure the number of times cannabis was used in the past 30 days conditional on at least one use, because ds.11.1998 is also discretized:

table(nlsy$ds.11.1998, useNA = "ifany")
  count
EVERY DAY 77
5-6 DAYS PER WEEK 32
3-4 DAYS PER WEEK 58
1-2 DAYS PER WEEK 71
LESS OFTEN THAN ONCE A WEEK 133
NA 6756

Cocaine
drug.22.1984
drug use cocaine 84
ds.12.1998
how many occasions used cocaine 1998
ds.13.1998
age 1st time used cocaine 1998
ds.14.1998
m-rcnt time used cocaine 1998
ds.15.1998
occasion used cocaine past 30 days 1998
ds.16.1998
how many occasions used crack 1998
ds.17.1998
age 1st time used crack 1998
ds.18.1998
m-rcnt time used crack 1998
ds.19.1998
occasion used crack past 30 days 1998
table(nlsy$ds.12.1998 != "NEVER", useNA = "ifany")
  count
FALSE 5132
TRUE 1558
NA 437

This is a dichotomous measure of lifetime cocaine use. As in the case of cannabis, this is the most we're going to get.

rd(prop.table(table(nlsy$ds.16.1998 != "NEVER")))
  count
FALSE 0.933
TRUE 0.067

For crack, even the lifetime base rate is quite low.

Other drugs
delin.13.1980
ill act # used drugs past yr 80
table(useNA = "always", nlsy$delin.13.1980 != "NEVER")
  count
FALSE 5514
TRUE 1180
NA 433

This is a dichotomous measure of illegal drug use other than cannabis in the year before the 1980 interview.

drug.12.1984
drug age 1st use amphtmn/stmlnts 84
ds.27.1998
used stimulants w/o doctor's instr 1998
rd(prop.table(table(useNA = "always", nlsy$ds.27.1998)))
  count
FALSE 0.835
TRUE 0.106
NA 0.058

Borderline. But perhaps worth including for variety.

drug.15.1984
drug age 1st use brbturts/sedtvs 84
drug.18.1984
drug age 1st use trnqlzrs 84
drug.21.1984
drug age 1st use psychdlics 84
drug.25.1984
drug age 1st use heroin 84
drug.31.1984
drug age 1st use inhalants 84
drug.26.1984
drug use other narcotics 84
drug.27.1984
drug #tms use other narcotics 84
drug.28.1984
drug age 1st use other narcotics 84
drug.32.1984
drug use any other drugs 84
drug.33.1984
drug #tms use other drugs 84
drug.34.1984
drug age 1st use other drugs 84

Mostly, the base rates seem to be too low:

rd(mapcols(nlsy[,sprintf("drug.%2d.1984", c(15, 18, 21, 25, 31, 27, 28, 33, 34))],
    function(v) mean(!is.na(v))))
  value
drug.15.1984 0.067
drug.18.1984 0.054
drug.21.1984 0.075
drug.25.1984 0.006
drug.31.1984 0.020
drug.27.1984 0.037
drug.28.1984 0.036
drug.33.1984 0.005
drug.34.1984 0.005

Sexual behavior

mfer.15.1983
male age 1st had intercourse 83
mfer.15.1984
male age 1st had intercourse 84
mfer.15.1985
male age 1st had intercourse 85
ffer.92.1983
f age 1st had sexual intercourse 83
ffer.92.1984
f age 1st had sexual intercourse 84
ffer.92.1985
f age 1st had sexual intercourse 85
age.first.sex = with(nlsy, ifelse(sample.sex.1979 == "FEMALE",
  ifna(ffer.92.1983, ifna(ffer.92.1984, ffer.92.1985)),
  ifna(mfer.15.1983, ifna(mfer.15.1984, mfer.15.1985))))
table(!is.na(age.first.sex))
  count
FALSE 564
TRUE 6563
ggplot(data.frame(x = age.first.sex[!is.na(age.first.sex)])) +
    geom_bar(aes(ordered(x, levels = min(x) : max(x)))) +
    scale_x_discrete(drop = F) +
    no.gridlines()

nlsy-age-first-sex-bars.png

This measures the age of first "sexual intercourse", conditional on having had such by the 1985 interview. Looks good.

For the rest, it isn't really clear what to expect in terms of patience or self-control. The relatively complex way the NLSY measures reproduction and reproductive intentions makes things difficult.

ffer.15.1983
f #tms preg (no live birth lint) 83
ffer.17.1982
f #tms preg b4 birth child#1 82
ffer.23.1982
f #tms preg birth last child 82
mfer.12.1985
male ever had sexual intercourse 85
mfer.13.1983
male had intercourse past month 83
ffer.90.1985
f ever had sexual intercourse 85
ffer.91.1983
f sexual intercourse past mo 83
mfer.26.1985
male #tms intercourse past month 85
mfer.28.1990
male use c-cptn during last mo 90
ffer.92a.1985
f use b-dt yr/1st intercourse 85
ffer.92.m.1985
f mo sexual intercourse 1st time 85
ffer.137.1985
f #tms intercourse past mo 85
ffer.139.1990
f c-cptn during last mo 90
q9.64g.2000
r answer expecting information? 2000
q9.65.000001.2006
contraception meth pill 2006
q9.65.000002.2006
contraception meth condom 2006
q9.65.000003.2006
contraception meth foam 2006
q9.65.000004.2006
contraception meth jelly/cream 2006
q9.65.000005.2006
contraception meth suppstry/insert 2006
q9.65.000006.2006
contraception meth diaphragm 2006
q9.65.000007.2006
contraception meth douching 2006
q9.65.000008.2006
contraception meth iud/coil/loop 2006
q9.65.000009.2000
contraception meth f sterilization 2000
q9.65.000010.2000
contraception meth m sterilization 2000
q9.65.000011.2006
contraception meth family planning 2006
q9.65.000012.2006
contraception meth rhythm 2006
q9.65.000013.2006
contraception meth withdrawal 2006
q9.65.000014.2006
contraception meth sponge 2006
q9.65.000015.2006
contraception meth abstinence 2006
q9.65.000017.2006
contraception meth norplant 2006
q9.65.000018.2004
contraception meth cervical cap 2004
q9.65.000016.2006
contraception meths other 2006
q9.64h.2006
use any contraception dur lst mo? 2006
q9.65.000020.2006
contraception meth hysterectomy 2006
q9.65.000021.2006
contraception meth depo pravara 2006

Divorce

mobg2m.xrnd
mon began 2nd marriage xrnd
yrbg2m.xrnd
year began 2nd marrg xrnd
mobg3m.xrnd
month began 3rd marrg xrnd
yrbg3m.xrnd
year began 3rd marrg xrnd

Counting divorces is too dicey: it seems that variables about numbers of marriages are only derived from changes in marital status reported at each interview.

We could, however, try a dichotomous variable for whether each subject has ever divorced (or separated) at least once.

mar.1.1979
marital status 79
marstat.key.1979
marital status 79
marstat.key.1980
marital status 80
marstat.key.1981
marital status 81
marstat.key.1982
marital status 82
marstat.key.1983
marital status 83
marstat.key.1984
marital status 84
marstat.key.1985
marital status 85
marstat.key.1986
marital status 86
marstat.key.1987
marital status 87
marstat.key.1988
marital status 88
marstat.key.1989
marital status 89
marstat.key.1990
marital status 90
marstat.key.1991
marital status 91
marstat.key.1992
marital status 92
marstat.key.1993
marital status 93
marstat.key.1994
marital status 94
marstat.key.1996
marital status 96
marstat.key.1998
marital status 1998
marstat.key.2000
marital status 2000
marstat.key.2002
marital status 2002
marstat.key.2004
marital status 2004
marstat.key.2006
marital status 2006
marstat.key.2008
marital status 2008
marstat.key.2010
marital status 2010
marstat.key.2012
marital status 2012
table(c(recursive = T, lapply(
    nlsy[,grep("^marstat\\.key", colnames(nlsy), val = T)], levels)))
  count
0: 0 NEVER MARRIED 7
0: NEVER MARRIED 1
1: 1 MARRIED 7
1: MARRIED 1
2: 2 SEPARATED 7
2: SEPARATED 1
3: 3 DIVORCED 7
3: DIVORCED 1
6: 6 WIDOWED 7
6: WIDOWED 1
Divorced 2
DIVORCED 15
Married 2
MARRIED 15
Never Married 2
NEVER MARRIED 15
REMARRIED 14
Separated 2
SEPARATED 15
Widowed 2
WIDOWED 15

So much for consistency in coding. :P

table(useNA = "always", maprows(
    nlsy[,grep("^marstat\\.key", colnames(nlsy), val = T)],
    function(v)
       {v = tolower(char(v))
        any(grepl("divorce", v) | grepl("separ", v))}))
  count
FALSE 3822
TRUE 3305
NA 0

Slightly under half have ever divorced. Sounds about right.

Crime (not drugs)

police.1.1980
stop by police o/than min trafic off 80
rd(prop.table(table(useNA = "always", nlsy$police.1.1980)))
  count
FALSE 0.798
TRUE 0.171
NA 0.031

Looks good.

police.3.1980
ever convicted on ilgl act charges 80
police.3a.1980
times convicted ilgl act excld minor 80
police.3b.1980
age @ time 1st ilgl act conviction 80
police.3e.1980
ever convicted ilgl act adlt court 80
rd(prop.table(table(useNA = "always", nlsy$police.3.1980)))
  count
FALSE 0.922
TRUE 0.047
NA 0.03

Quite a low base rate, but as the only representative of actually committing crime, it's probably worth including anyway.

Income

tnfi.trunc.2006
total net family income 2006
income = nlsy$tnfi.trunc.2006
ggplot(data.frame(v = income[!is.na(income)])) +
    geom_bar(aes(sqrt(v), ..ncount..), binwidth = 2)

nlsy-income-bars.png

Looks good except for those spikes, which don't comprise much of the data:

rd(mean(nlsy$tnfi.trunc.2006 < 0 | nlsy$tnfi.trunc.2006 > 278500, na.rm = T))
  value
  0.017

Savings

income.21.1982
r/sp have any savings 82
q13.121.2000
any money in savings accounts? 2000
income.35.1988
r/sp set aside money for savings 88
income.35a.trunc.1988
tot amt set aside for savings 87 88
with(nlsy, rd(prop.table(table(useNA = "always", q13.121.2000))))
  count
FALSE 0.264
TRUE 0.665
NA 0.071

The question text here is "Do you (and your spouse) have any money in savings or checking accounts, savings and loan companies, money market funds, credit unions, U.S. savings bonds, individual retirement accounts (IRA or Keogh), or certificates of deposit, common stock, stock options, bonds, mutual funds, rights to an estate or investment trust, or personal loans to others or mortgages you hold (money owed to you by other people)?". It's kind of surprising there are so many FALSE answers. But I'll take it.

q13.123.2000
any money in iras or keough? 2000
q13.123b.2000
any money in tax-defrd plans? 2000
q13.123d.2000
any money in cds,loans,mortg 2000

The first two of these are retirement accounts, which are a better indication of deliberately saving money for the long term than, for example, having a checking account. So let's look at those specifically.

with(nlsy, rd(prop.table(table(useNA = "always",
    q13.123.2000 | q13.123b.2000))))
  count
FALSE 0.441
TRUE 0.485
NA 0.074

Let's use these dichotomous variables in preference to amount variables, which, after all, should depend a lot on income and age.

qes1p.10.01.1996
pay pct rs contrib 1st rtrmnt plan 1 96
qes1p.11.01.1996
tim u rs set contrb 1st rtrmnt plan 1 96
qes1p.12.01.1996
set amt rs contrib 1st rtrmnt plan 1 96
qes1p.10.02.1996
pay pct rs contrib 2nd rtrmnt plan 1 96
qes1p.11.02.1996
tim u rs set contrb 2nd rtrmnt plan 1 96
qes1p.12.02.1996
set amt rs contrib 2nd rtrmnt plan 1 96
qes1p.10.03.1996
pay pct rs contrib 3rd rtrmnt plan 1 96
qes1p.11.03.1996
tim u rs set contrb 3rd rtrmnt plan 1 96
qes1p.12.03.1996
set amt rs contrib 3rd rtrmnt plan 1 96
qesp.13.01.01.1998
amt in pnsn/rtrmnt acct now l1,1 1998
qesp.13.01.02.1998
amt in pnsn/rtrmnt acct now l1,2 1998
qesp.13.01.03.1998
amt in pnsn/rtrmnt acct now l1,3 1998
qesp.13.02.01.1998
amt in pnsn/rtrmnt acct now l2,1 1998
qesp.13.02.02.1998
amt in pnsn/rtrmnt acct now l2,2 1998
qesp.13.02.03.1998
amt in pnsn/rtrmnt acct now l2,3 1998
qesp.13.03.01.1998
amt in pnsn/rtrmnt acct now l3,1 1998
qesp.13.03.02.1998
amt in pnsn/rtrmnt acct now l3,2 1998
qesp.13.04.01.1998
amt in pnsn/rtrmnt acct now l4,1 1998
qesp.13.04.02.1998
amt in pnsn/rtrmnt acct now l4,2 1998

These are hypothetical questions:

socsec.2.2004
ss prog or replace with pers ret 2004
socsec.3a.000001.2004
personal acct investments - stocks 2004
socsec.3a.000002.2004
personal acct investments - private bonds 2004
socsec.3a.000003.2004
personal acct investments - govt bonds 2004
socsec.3b.1.2004
any type pers investmts chosen 2004
socsec.3c000001.2004
% of ret acct put into investment 2004
socsec.3c000002.2004
% of ret acct put into investment 2004
socsec.3c000003.2004
% of ret acct put into investment 2004
socsec.4.2004
retirement acct - lump sum or monthly payment 2004

These are better replaced with equivalent questions asked in 1996:

qes1p.10.1.1994
pay pct rs contrib 1st rtrmnt plan 94
qes1p.11.1.1994
tim u rs set contrib 1st rtrmnt plan 94
qes1p.12.1.1994
set amt rs contrib 1st rtrmnt plan 94
qes1p.10.2.1994
pay pct rs contrib 2nd rtrmnt plan 94
qes1p.11.2.1994
tim u rs set contrib 2nd rtrmnt plan 94
qes1p.10.3.1994
pay pct rs contrib 3rd rtrmnt plan 94
qes1p.11.3.1994
tim u rs set contrib 3rd rtrmnt plan 94
qes1p.12.3.1994
set amt rs contrib 3rd rtrmnt plan 94

Debt

ps.1.2008
r miss pay or been late for 5 years 2008
with(nlsy, rd(prop.table(table(useNA = "always", ps.1.2008))))
  count
FALSE 0.755
TRUE 0.204
NA 0.042

Looks good.

debt.1.2008
r-sp has/owes money on credit cards 2008
debt.1a.trunc.2004
tot bal owed on all credit cards 2004
debt.1a.imputed.2008
tot bal owed on all credit cards 2008
debt.1a.2008
tot bal owed on all credit cards 2008
ps.2.2008
# of credit cards owes maximum amt 2008
with(nlsy, rd(prop.table(table(useNA = "always",
   ifelse(debt.1.2008, debt.1a.2008 > 0, F)))))
  count
FALSE 0.492
TRUE 0.435
NA 0.073

Also looks good.

cc.debt = nlsy$debt.1a.2008
cc.debt = cc.debt[!is.na(cc.debt) & cc.debt > 0]
ggplot(data.frame(cc.debt)) +
    geom_bar(aes(log(cc.debt), ..ncount..), binwidth = .1) +
    geom_densitybw(aes(log(cc.debt)), fill = "blue", alpha = .5)

nlsy-ccdebt-bars.png

Looks decent.

with(nlsy, rd(prop.table(table(useNA = "always", ps.2.2008 > 0))))
  count
FALSE 0.844
TRUE 0.107
NA 0.049

It's doable.

debt.3.2008
r-sp owes money to oth bus 2008
debt.3a.trunc.2008
tot amt r-sp owes to other bus 2008
debt.3a.2010
tot amt r-sp owes to other bus 2010
with(nlsy, rd(prop.table(table(useNA = "always", debt.3.2008))))
  count
FALSE 0.761
TRUE 0.195
NA 0.044

Looks decent.

debt.4.2008
r-sp ow $1000-more to pers/inst/co 2008
debt.4a.trunc.2008
amt of debt owto oth pers/instit/co 2008
debt.4a.2010
amt of debt owto oth pers/instit/co 2010
with(nlsy, rd(prop.table(table(useNA = "always", debt.4.2008))))
  count
FALSE 0.907
TRUE 0.050
NA 0.043

The base rate's too low.

q13.141.2008
any mony left aft all debts paid? 2008
with(nlsy, rd(prop.table(table(useNA = "always", q13.141.2008))))
  count
Have something left over 0.660
Break even 0.180
Be in debt 0.109
NA 0.052

That middle category is problematic, because given the granularity of assets and debts, there's no way more than a handful of people could truly break even. I expect that people said they would break even because they would roughly break even, and of course, different people should've had different ideas of how close the sum had to be to 0 in order to feel it was close enough. How annoying. I think the only thing to do is to lump the middle category in with "Have something left over", so then we get roughly the dichotomous measure I wanted, of whether subjects owe more than they own.

It's not clear what to expect with regard to student loans, so let's ignore that kind of debt:

debt.2.2008
r-sp resp for make student loan pay 2008
debt.2a.trunc.2004
tot amt r-sp owes on student loans 2004
debt.2c.2008
r-sp make studnt loan pay for child 2008
debt.2d.trunc.2004
amt ow on studnt loans for children 2004
debt.2a.2008
tot amt r-sp owes on student loans 2008
debt.2d.2008
amt ow on studnt loans for children 2008

Only a few subjects got these questions:

debt.1a.sr000001.2008
tot bal owed on all credit cards 2008
debt.1a.sr000002.2008
tot bal owed on all credit cards 2008
debt.1a.uab.a.2008
amt owed on cc accounts > ent amt 2008
debt.1a.uab.b.2008
amt owe on cc accounts > $5k 2008
debt.1a.uab.c.2008
amt owed on credit card accounts > $30k 2008
debt.2a.uab.a.2004
amt ow on studnt loans > ent amt 2004
debt.2a.uab.b.2004
tot amt owed on student loans > 5k 2004
debt.2a.uab.c.2008
tot amt owed on student loans > 30k 2008
debt.2d.uab.a.2008
amt ow on loans for childrn >en amt 2008
debt.3a.uab.a.2008
amt ow to oth bus >ent amt 2008
debt.3a.uab.b.2008
tot amt owed to oth bus >$5k 2008
debt.3a.uab.c.2008
tot amt owed to oth bus >$30k 2008
debt.4a.uab.a.2008
amt of debt ow to oth >ent amt 2008
debt.4a.uab.b.2008
tot amt of debt owed to oth >$5k 2008
debt.4a.uab.c.2008
tot amt of debt owed to oth >$30k 2008

These are procedural codes:

debt.4a.ref.1.2008
r refuse previous inc-ast amounts? 2008
debt.2b.2008
r report bio/adopt/stepchildren 2008

Putting it all together

# Default rules, by data type:
# - logical → binary
# - ordered → ordinal
# - numeric → abs_error

# Evaluate continuous predictions by absolute error, not squared
# error.
# That implies we should use quantile regression instead of
# least squares, so we can predict conditional medians instead of
# conditional means.
# quantreg::rq, quantregForest
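
# A minimal sketch (illustrative only, not wired into the pipeline) of
# predicting a conditional median with quantreg. `train` and `test` are
# hypothetical data frames with a numeric column `y` plus predictors.
library(quantreg)
median.pred = function(train, test)
    predict(rq(y ~ ., tau = .5, data = train), newdata = test)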

isin = function(needle, haystacks)
# Like %in%, but NAs in the first argument return NA, not FALSE.
   ifelse(is.na(needle), NA, needle %in% haystacks)

zero2na = function(v)
    ifelse(v == 0, NA, v)

exer.light = exercise.freq(nlsy,
    "q11.genhlth.2a.1.2006", "q11.genhlth.2a.2.2006",
    "q11.genhlth.2c.1.2006", "q11.genhlth.2c.2.2006")$min.per.year
exer.vig = exercise.freq(nlsy,
    "q11.genhlth.1a.1.2006", "q11.genhlth.1a.2.2006",
    "q11.genhlth.1c.1.2006", "q11.genhlth.1c.2.2006")$min.per.year

nlsy.dvs = with(nlsy, list(

    # Fitness
    hard.to.run.mile = list(
        v = isin(h50fl.2.000001.xrnd,
            c("Very difficult/can't do", "Don't do"))),
    noteasy.to.climb.stairs = list(
        v = h50fl.2.000006.xrnd != "Not at all difficult for you"),
    bmi = list(
        v = 703.06958 * q11.9.2006
            / (q11.10.a.2006 * 12 + q11.10.b.2006)^2,
        trans = log),
    exer.light.ever = list(
        v = exer.light > 0),
    exer.light.min.per.year = list(
        v = zero2na(exer.light),
        trans = log,
        min = 0, max = 365*24*60),
    exer.vig.ever = list(
        v = exer.vig > 0),
    exer.vig.min.per.year = list(
        v = zero2na(exer.vig),
        trans = log,
        min = 0, max = 365*24*60),
    exer.str.ever = list(
        v = !(q11.genhlth.3a.1.2006 == 0 |
            q11.genhlth.3a.2.2006 == "Unable to do this activity")),

    # Healthy eating
    checks.nutrition.often = list(
        v = ifelse(q11.genhlth.7a.2006 == "DON'T BUY FOOD", NA,
            isin(q11.genhlth.7a.2006, qw(Always, Often)))),
#         v = ordered(levels = qw(Often, Sometimes, Rarely),
#             ifelse(q11.genhlth.7a.2006 == "DON'T BUY FOOD", NA,
#             ifelse(isin(q11.genhlth.7a.2006, qw(Always, Often)), "Often",
#             ifelse(isin(q11.genhlth.7a.2006, qw(Rarely, Never)), "Rarely",
#             "Sometimes"))))),
    ate.fast.food.pastweek = list(v =
        q11.genhlth.7c.1.2008 != 0),
    drank.caloric.soda.pastweek = list(v =
        q11.genhlth.7f.1.2008 != 0),

    # Sleep
    sleep.min.weekday = list(
        v = h50slp.1.xrnd*60 + h50slp.1b.xrnd,
        min = 0, max = 16*60),
    sleep.min.weekend = list(
        v = pmin(16*60, h50slp.2.xrnd*60 + h50slp.2b.xrnd),
        min = 0, max = 16*60),

    # Medicine
    health.insurance = list(
        v = q11.79.2006),
    flu.shot.20062008 = list(
        v = ifelse(sample.sex.1979 == "FEMALE",
            q11.genhlth.4c.f.000001.2008,
            q11.genhlth.4c.m.000001.2008)),
    saw.dentist.20062008 = list(
        v = ifelse(nlsy$sample.sex.1979 == "FEMALE",
            q11.genhlth.4e.f.000002.2008,
            q11.genhlth.4e.m.000001.2008)),
    brushes.teeth.twiceweekly = list(
       v = q11.genhlth.5a.2.2008 >= 2*7),
    flosses.weekly = list(
        v = q11.genhlth.5a.3.2008 >= 7),  # >= 7 times per usual week, i.e., at least daily
#        v = ordered(levels = c("<weekly", "<daily", ">=daily"),
#         ifelse(q11.genhlth.5a.3.2008 == 0, "<weekly", 
#             ifelse(nlsy$q11.genhlth.5a.3.2008 < 7, "<daily", ">=daily")))),

    # Drug use
    smoked.100 = list(
        v = cig.chk1.2008 | ds.2.2008),
    smoking.in.2008 = list(
        v = (cig.chk1.2008 | ds.2.2008) &
            (cig.chk2.2008 | ds.3a.2008) &
            isin(ds.5.2008, qw(OCCASIONALLY, DAILY))),
    drank.pastmonth = list(
        v = q12.3.2006),
    drank.heavily.pastmonth = list(
        v = ifelse(!q12.3.2006, F,
           q12.4.2006 != "Never in the last 30 days")),
    drinks.last.month = list(
        v = num(ifelse(!q12.3.2006 | q12.5.2006 == 0 | q12.6.2006 == 0,
            NA,
            q12.5.2006 * q12.6.2006)),
        trans = log),
    used.cannabis.by.1998 = list(
        v = ds.8.1998 != "NEVER"),
    used.cocaine.by.1998 = list(
        v = ds.12.1998 != "NEVER"),
    hard.drugs.in.1980 = list(
        v = delin.13.1980 != "NEVER"),
    used.stims.by.1998 = list(
        v = ds.27.1998),

    # Sexual behavior
    age.first.sex = list(
        v = num(ifelse(sample.sex.1979 == "FEMALE",
          ifna(ffer.92.1983, ifna(ffer.92.1984, ffer.92.1985)),
          ifna(mfer.15.1983, ifna(mfer.15.1984, mfer.15.1985)))),
        min = 0),

    # Divorce
    ever.divorced = list(
        v = maprows(
            nlsy[,grep("^marstat\\.key", colnames(nlsy), val = T)],
            function(v)
               {v = tolower(char(v))
                any(grepl("divorce", v) | grepl("separ", v))})),

    # Crime
    stopped.by.police.1980 = list(
        v = police.1.1980),
    convicted.1980 = list(
        v = police.3.1980),

    # Income
    net.family.income = list(
      v = num(tnfi.trunc.2006), 
      trans = sqrt,
      min = 0,
      max = max(tnfi.trunc.2006, na.rm = T)),

    # Saving
    any.savings = list(
        v = q13.121.2000),
    retirement.account = list(
        v = q13.123.2000 | q13.123b.2000),

    # Debt
    missed.bill.payment = list(
        v = ps.1.2008),
    has.cc.debt = list(
        v = ifelse(debt.1.2008, debt.1a.2008 > 0, F)),
    cc.debt.amnt = list(
        v = num(ifelse(debt.1a.2008 > 0, debt.1a.2008, NA)),
        trans = log),
    maxed.out.cc = list(
        v = ps.2.2008 > 0),
    misc.business.debts = list(
        v = debt.3.2008),
    neg.net.worth = list(
        v = q13.141.2008 == "Be in debt")))

nlsy.dv.classes = sapply(nlsy.dvs,
    function(l) paste(collapse = " ", class(l$v)))

Now for some summaries and checks.

table(nlsy.dv.classes)
  count
logical 31
numeric 9
d = do.call(rbind, lapply(names(nlsy.dvs)[nlsy.dv.classes == "logical"],
    function(name) data.frame(row.names = name,
        n = sum(!is.na(nlsy.dvs[[name]]$v)),
        modal.rate = max(prop.table(table(nlsy.dvs[[name]]$v))),
        mode = mean(nlsy.dvs[[name]]$v, na.rm = T) > .5)))
rd(ordf(d, -modal.rate))
  n modal.rate mode
convicted.1980 6910 0.951 FALSE
maxed.out.cc 6775 0.887 FALSE
used.stims.by.1998 6711 0.887 FALSE
neg.net.worth 6759 0.885 FALSE
drank.heavily.pastmonth 7045 0.859 FALSE
hard.drugs.in.1980 6694 0.824 FALSE
stopped.by.police.1980 6907 0.824 FALSE
health.insurance 7123 0.810 TRUE
misc.business.debts 6811 0.796 FALSE
missed.bill.payment 6831 0.787 FALSE
used.cocaine.by.1998 6690 0.767 FALSE
brushes.teeth.twiceweekly 6551 0.741 TRUE
exer.light.ever 6789 0.738 TRUE
smoking.in.2008 6856 0.728 FALSE
any.savings 6618 0.716 TRUE
flu.shot.20062008 6852 0.681 FALSE
exer.vig.ever 6855 0.681 TRUE
saw.dentist.20062008 6856 0.668 TRUE
ate.fast.food.pastweek 6858 0.646 TRUE
exer.str.ever 7097 0.628 FALSE
used.cannabis.by.1998 6662 0.617 TRUE
noteasy.to.climb.stairs 4995 0.604 FALSE
flosses.weekly 6544 0.599 FALSE
drank.caloric.soda.pastweek 6850 0.579 TRUE
smoked.100 6855 0.575 TRUE
ever.divorced 7127 0.536 FALSE
has.cc.debt 6609 0.531 FALSE
drank.pastmonth 7052 0.527 TRUE
hard.to.run.mile 4932 0.524 FALSE
retirement.account 6599 0.523 TRUE
checks.nutrition.often 7053 0.514 FALSE
rd(d = 0, do.call(rbind, lapply(names(nlsy.dvs)[nlsy.dv.classes == "numeric"],
    function(name) data.frame(row.names = name,
        n = sum(!is.na(nlsy.dvs[[name]]$v)),
        t(quantile(nlsy.dvs[[name]]$v, c(0, .025, .25, .5, .75, .975, 1), na.rm = T))))))
  n X0. X2.5. X25. X50. X75. X97.5. X100.
bmi 6923 11 20 24 27 31 44 136
exer.light.min.per.year 5012 30 720 3120 7800 21900 183900 525600
exer.vig.min.per.year 4667 9 480 3650 7800 18720 175200 525600
sleep.min.weekday 4997 0 240 360 420 480 540 960
sleep.min.weekend 4994 0 240 360 420 480 600 960
drinks.last.month 3691 1 1 4 10 25 120 450
age.first.sex 6563 2 11 15 17 18 22 27
net.family.income 6748 0 0 27000 54975 90212 236932 479983
cc.debt.amnt 3100 3 200 1500 4000 10000 35000 139000
rd(d = 1, do.call(rbind, lapply(names(nlsy.dvs)[
        nlsy.dv.classes == "numeric" & sapply(nlsy.dvs, function(l) "trans" %in% names(l))],
    function(name) data.frame(row.names = name,
        n = sum(!is.na(nlsy.dvs[[name]]$v)),
        t(quantile(nlsy.dvs[[name]]$trans(nlsy.dvs[[name]]$v), c(0, .025, .25, .5, .75, .975, 1), na.rm = T))))))
  n X0. X2.5. X25. X50. X75. X97.5. X100.
bmi 6923 2.4 3.0 3.2 3.3 3.4 3.8 4.9
exer.light.min.per.year 5012 3.4 6.6 8.0 9.0 10.0 12.1 13.2
exer.vig.min.per.year 4667 2.2 6.2 8.2 9.0 9.8 12.1 13.2
drinks.last.month 3691 0.0 0.0 1.4 2.3 3.2 4.8 6.1
net.family.income 6748 0.0 0.0 164.3 234.5 300.4 486.8 692.8
cc.debt.amnt 3100 1.1 5.3 7.3 8.3 9.2 10.5 11.8

Prediction

Model terms

My first batch of models had only one term each for the two patience items and an interaction. My second will add two dummy variables indicating whether a subject answered 0 on each impatience item, plus all possible identifiable interactions. Starting with the 2^4 = 16 possible terms, and calling the dummy variables d1, d2 and the continuous variables c1, c2:

combs = unlist(rec = F, lapply(0:4, function(i)
    combn(qw(d1, d2, c1, c2), i, simp = F)))
sapply(combs, function(v) paste(v, collapse = " "))
  x
1  
2 d1
3 d2
4 c1
5 c2
6 d1 d2
7 d1 c1
8 d1 c2
9 d2 c1
10 d2 c2
11 c1 c2
12 d1 d2 c1
13 d1 d2 c2
14 d1 c1 c2
15 d2 c1 c2
16 d1 d2 c1 c2

we remove all those containing both d1 and c1, or both d2 and c2 (any such product is identically zero, since the logged variable is set to 0 exactly where its dummy is 1), to get 9 terms, with the empty combination standing for the intercept; a sketch of assembling these into a model formula follows the listing:

combs = combs[sapply(combs, function(l) !(
    ("d1" %in% l && "c1" %in% l) || 
    ("d2" %in% l && "c2" %in% l)))]
sapply(combs, function(v) paste(v, collapse = " "))
  x
1  
2 d1
3 d2
4 c1
5 c2
6 d1 d2
7 d1 c2
8 d2 c1
9 c1 c2
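
To check how these nine terms translate into the regression formula used further down, here is a minimal sketch that maps d1, d2, c1, c2 onto the iv.log column names defined below; the correspondence d1 = month0, c1 = log.month, d2 = year0, c2 = log.year is my reading of the setup, not something computed above.

# A sketch, not part of the pipeline: assemble the surviving terms
# (the empty one being the intercept) into an R model formula. The
# mapping of d1, c1, d2, c2 onto the iv.log columns is assumed.
vars = c(d1 = "month0", c1 = "log.month", d2 = "year0", c2 = "log.year")
form = reformulate(
    sapply(combs[-1], function(v) paste(vars[v], collapse = ":")),
    response = "y")
# `form` contains the same eight terms (plus intercept) as the formula
# passed to nlsy.predtable.dichotomous and nlsy.predtable.cont below.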

Binary DVs

nlsy.predtable.dichotomous = function(iv, formula, brier = F)
    do.call(rbind, lapply((names(nlsy.dvs)[nlsy.dv.classes == "logical"]), function(name)
       {message(name)
        y = nlsy.dvs[[name]]$v
        s = row.names(iv)[which(!is.na(y))]
        y = y[!is.na(y)]
        X = iv[s,]
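        # Score a vector of predicted probabilities either by Brier score
        # (mean squared error against the 0/1 outcome) or by percent
        # agreement at a .5 cutoff.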
        judge = function(p, y)
          {if (brier) mean((p - y)^2)
           else       mean((p >= .5) == y)}
        data.frame(row.names = name,
            assoc = local(
               {inpred = predict(glm(formula, cbind(X, y),
                    family = binomial(link = "logit")), type = "response")
                judge(inpred, y)}),
            base.rate = judge(mean(y), y),
            logit = local(
               {pred = crossvalid(X, y, nfold = 10, f = function(train.iv, train.dv, test.iv)
                   {model = glm(formula, cbind(train.iv, y = train.dv),
                        family = binomial(link = "logit"))
                    predict(model, type = "response", newdata = test.iv)})
                judge(pred, y)}),
            rf = (if (brier) NA else local(
               {rf = randomForest(x = X, y = factor(y),
                    ntree = 2000,
                    mtry = ncol(X)) # Since there are only a few predictors.
                1 - tail(rf$err.rate, 1)[1]})))}))

nominal.pred.dichotomous = function(train.iv, train.dv, test.iv)
   {p = tapply(train.dv, train.iv, mean)
    ifna(p[char(test.iv)], mean(train.dv))}

nlsy.predtable.dichotomous.nominaliv = function(iv, brier = F)
    do.call(rbind, lapply((names(nlsy.dvs)[nlsy.dv.classes == "logical"]), function(name)
       {y = nlsy.dvs[[name]]$v
        s = row.names(iv)[which(!is.na(y))]
        y = y[!is.na(y)]
        x = paste(iv[s, "month"], iv[s, "year"])
        # Combine all singleton classes.
        singleton = table(x) == 1
        x = factor(ifelse(singleton[x], "singleton", x))
        judge = function(p, y)
          {if (brier) mean((p - y)^2)
           else       mean((p >= .5) == y)}
        data.frame(row.names = name,
            assoc = judge(nominal.pred.dichotomous(x, y, x), y),
            base.rate = judge(mean(y), y),
            pred = local({
                pred = crossvalid(x, y, nfold = 10, f = nominal.pred.dichotomous)
                judge(pred, y)}))}))
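
The singleton-collapsing step in the nominal-IV functions may be easier to see on a toy vector (invented values, not NLSY responses): combinations of answers that occur only once are pooled into one "singleton" level instead of each forming its own one-member class.

# Toy illustration of the singleton-collapsing idiom used above.
toy = c("1 12", "1 12", "0 2", "3 60")
singleton.toy = table(toy) == 1
factor(ifelse(singleton.toy[toy], "singleton", toy))
# Yields a two-level factor: "1 12", "1 12", "singleton", "singleton".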

iv.lin = nlsy[,qw(impatience.1.2006, impatience.2.2006)]
colnames(iv.lin) = qw(month, year)
iv.log = ss(select = -c(month, year), transform(iv.lin,
    month0 = month == 0,
    log.month = ifelse(month == 0, 0, log(month)),
    year0 = year == 0,
    log.year = ifelse(year == 0, 0, log(year))))
rd(d = 2, cached(nlsy.predtable.dichotomous(iv.lin,
    y ~ month * year)))
  assoc base.rate logit rf
hard.to.run.mile 0.52 0.52 0.52 0.50
noteasy.to.climb.stairs 0.60 0.60 0.60 0.58
exer.light.ever 0.74 0.74 0.74 0.72
exer.vig.ever 0.68 0.68 0.68 0.67
exer.str.ever 0.63 0.63 0.63 0.60
checks.nutrition.often 0.51 0.51 0.51 0.52
ate.fast.food.pastweek 0.65 0.65 0.65 0.62
drank.caloric.soda.pastweek 0.58 0.58 0.58 0.57
health.insurance 0.81 0.81 0.81 0.80
flu.shot.20062008 0.68 0.68 0.68 0.67
saw.dentist.20062008 0.67 0.67 0.67 0.64
brushes.teeth.twiceweekly 0.74 0.74 0.74 0.73
flosses.weekly 0.60 0.60 0.60 0.58
smoked.100 0.58 0.58 0.58 0.55
smoking.in.2008 0.73 0.73 0.73 0.72
drank.pastmonth 0.53 0.53 0.53 0.55
drank.heavily.pastmonth 0.86 0.86 0.86 0.85
used.cannabis.by.1998 0.62 0.62 0.62 0.61
used.cocaine.by.1998 0.77 0.77 0.77 0.76
hard.drugs.in.1980 0.82 0.82 0.82 0.81
used.stims.by.1998 0.89 0.89 0.89 0.88
ever.divorced 0.54 0.54 0.54 0.54
stopped.by.police.1980 0.82 0.82 0.82 0.82
convicted.1980 0.95 0.95 0.95 0.95
any.savings 0.72 0.72 0.72 0.70
retirement.account 0.53 0.52 0.53 0.56
missed.bill.payment 0.79 0.79 0.79 0.77
has.cc.debt 0.53 0.53 0.53 0.52
maxed.out.cc 0.89 0.89 0.89 0.88
misc.business.debts 0.80 0.80 0.80 0.78
neg.net.worth 0.89 0.89 0.89 0.88

With logged predictors and dummy variables for 0 responses:

rd(d = 6, cached(nlsy.predtable.dichotomous(iv.log,
    y ~ month0 + log.month + year0 + log.year +
        month0:year0 + month0:log.year + year0:log.month + log.month:log.year)))
  assoc base.rate logit rf
hard.to.run.mile 0.537510 0.523723 0.524736 0.493309
noteasy.to.climb.stairs 0.603003 0.603804 0.601602 0.584985
exer.light.ever 0.738253 0.738253 0.737958 0.719399
exer.vig.ever 0.682130 0.680817 0.681400 0.665500
exer.str.ever 0.626744 0.627730 0.625194 0.599690
checks.nutrition.often 0.539345 0.513682 0.539061 0.528569
ate.fast.food.pastweek 0.646398 0.646398 0.645815 0.623360
drank.caloric.soda.pastweek 0.580146 0.579124 0.581606 0.573723
health.insurance 0.809912 0.810192 0.809912 0.802752
flu.shot.20062008 0.681261 0.681261 0.680969 0.664915
saw.dentist.20062008 0.670216 0.668466 0.668174 0.646879
brushes.teeth.twiceweekly 0.740650 0.740650 0.740650 0.724622
flosses.weekly 0.598869 0.598869 0.598258 0.577323
smoked.100 0.575638 0.575492 0.574034 0.550109
smoking.in.2008 0.727246 0.727684 0.727538 0.717182
drank.pastmonth 0.541123 0.526659 0.537294 0.549773
drank.heavily.pastmonth 0.858907 0.858907 0.858907 0.854081
used.cannabis.by.1998 0.616482 0.617232 0.616031 0.607175
used.cocaine.by.1998 0.767115 0.767115 0.767115 0.755904
hard.drugs.in.1980 0.823723 0.823723 0.823723 0.814909
used.stims.by.1998 0.887200 0.887200 0.887200 0.883922
ever.divorced 0.536972 0.536271 0.535008 0.541602
stopped.by.police.1980 0.823657 0.823657 0.823657 0.815694
convicted.1980 0.951230 0.951230 0.951230 0.948915
any.savings 0.715473 0.715624 0.714264 0.704140
retirement.account 0.571450 0.523413 0.570996 0.553569
missed.bill.payment 0.787440 0.787440 0.787440 0.772069
has.cc.debt 0.541232 0.530943 0.540778 0.526555
maxed.out.cc 0.887380 0.887380 0.887380 0.882952
misc.business.debts 0.796212 0.796212 0.796065 0.775510
neg.net.worth 0.885042 0.885190 0.885042 0.879124

With Brier scores:

rd(d = 5, cached(nlsy.predtable.dichotomous(iv.log, brier = T,
    y ~ month0 + log.month + year0 + log.year +
        month0:year0 + month0:log.year + year0:log.month + log.month:log.year)))

[Needs to be regenerated.]

Generally, the logistic regression achieves cross-validated predictive accuracy only very slightly above the base rate, and even the in-sample association, while a bit higher, still isn't high.

Back to percent agreement, but with the IVs fully discretized.

rd(d = 2, nlsy.predtable.dichotomous.nominaliv(iv.lin))
  assoc base.rate pred
hard.to.run.mile 0.57 0.52 0.49
noteasy.to.climb.stairs 0.63 0.60 0.58
exer.light.ever 0.75 0.74 0.73
exer.vig.ever 0.69 0.68 0.67
exer.str.ever 0.65 0.63 0.61
checks.nutrition.often 0.58 0.51 0.52
ate.fast.food.pastweek 0.66 0.65 0.63
drank.caloric.soda.pastweek 0.61 0.58 0.58
health.insurance 0.81 0.81 0.80
flu.shot.20062008 0.69 0.68 0.66
saw.dentist.20062008 0.68 0.67 0.65
brushes.teeth.twiceweekly 0.75 0.74 0.73
flosses.weekly 0.62 0.60 0.57
smoked.100 0.60 0.58 0.56
smoking.in.2008 0.74 0.73 0.71
drank.pastmonth 0.59 0.53 0.54
drank.heavily.pastmonth 0.86 0.86 0.85
used.cannabis.by.1998 0.64 0.62 0.60
used.cocaine.by.1998 0.77 0.77 0.75
hard.drugs.in.1980 0.83 0.82 0.81
used.stims.by.1998 0.89 0.89 0.88
ever.divorced 0.58 0.54 0.53
stopped.by.police.1980 0.83 0.82 0.81
convicted.1980 0.95 0.95 0.95
any.savings 0.73 0.72 0.71
retirement.account 0.61 0.52 0.56
missed.bill.payment 0.79 0.79 0.77
has.cc.debt 0.59 0.53 0.54
maxed.out.cc 0.89 0.89 0.88
misc.business.debts 0.80 0.80 0.78
neg.net.worth 0.89 0.89 0.88

The model generally overfits: the in-sample association exceeds the base rate by only a small margin, and even that gain mostly disappears or reverses under cross-validation.

Continuous DVs

nlsy.predtable.cont = function(iv, formula)
    do.call(rbind, lapply(names(nlsy.dvs)[nlsy.dv.classes == "numeric"], function(name)
       {message(name)
        l = nlsy.dvs[[name]]
        y = l$v
        s = row.names(iv)[which(!is.na(y))]
        y = y[!is.na(y)]
        X = iv[s,]
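        # Fit on the transformed scale (when a transformation is given),
        # then back-transform predictions and clamp them to the DV's
        # declared range before computing absolute errors.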
        trans = identity
        untrans = identity
        if ("trans" %in% names(l))
           {trans = l$trans
            if (identical(trans, log))
                untrans = exp
            else if (identical(trans, sqrt))
                untrans = function(v) v^2
            else
                stop()}
        y.min = (if ("min" %in% names(l)) l$min
            else -Inf)
        y.max = (if ("max" %in% names(l)) l$max
            else Inf)
        digest.pred = function(y.pred)
            pmin(y.max, pmax(y.min, untrans(y.pred)))
        data.frame(row.names = name,
            cor.m = cor(X[,(if ("log.month" %in% colnames(X)) "log.month" else "month")], y),
            cor.y = cor(X[,(if ("log.year" %in% colnames(X)) "log.year" else "year")], y),
            assoc = local(
               {inpred = digest.pred(predict(
                    rq(formula, cbind(X, y = trans(y)),
                    tau = .5)))
                mean(abs(inpred - y))}),
            mad = mean(abs(y - median(y))),
            qr = local(
               {pred = digest.pred(crossvalid(X, trans(y), nfold = 10, f = function(train.iv, train.dv, test.iv)
                   {model = rq(formula, cbind(train.iv, y = train.dv),
                        tau = .5)
                    predict(model, newdata = test.iv)}))
                mean(abs(pred - y))}),
            rf = local(
               {rf = quantregForest(x = X, y = trans(y),
                    mtry = ncol(X)) # Since there are only a few predictors.
                pred = digest.pred(predict(rf, quantiles = .5))
                mean(abs(pred - y))}))}))

nominal.pred.cont = function(train.iv, train.dv, test.iv)
   {p = tapply(train.dv, train.iv, median)
    ifna(p[char(test.iv)], median(train.dv))}

nlsy.predtable.cont.nominaliv = function(iv)
    do.call(rbind, lapply(names(nlsy.dvs)[nlsy.dv.classes == "numeric"], function(name)
       {l = nlsy.dvs[[name]]
        y = l$v
        s = row.names(iv)[which(!is.na(y))]
        y = y[!is.na(y)]
        x = paste(iv[s, "month"], iv[s, "year"])
        # Combine all singleton classes.
        singleton = table(x) == 1
        x = factor(ifelse(singleton[x], "singleton", x))
        trans = identity
        untrans = identity
        if ("trans" %in% names(l))
           {trans = l$trans
            if (identical(trans, log))
                untrans = exp
            else if (identical(trans, sqrt))
                untrans = function(v) v^2
            else
                stop()}
        y.min = (if ("min" %in% names(l)) l$min
            else -Inf)
        y.max = (if ("max" %in% names(l)) l$max
            else Inf)
        digest.pred = function(y.pred)
            pmin(y.max, pmax(y.min, untrans(y.pred)))
        data.frame(row.names = name,
            assoc = mean(abs(digest.pred(nominal.pred.cont(x, trans(y), x)) - y)),
            mad = mean(abs(y - median(y))),
            pred = local({
                pred = digest.pred(crossvalid(x, trans(y), nfold = 10, f = nominal.pred.cont))
                mean(abs(pred - y))}))}))
rd(d = 2, cached(nlsy.predtable.cont(iv.lin,
    y ~ month * year)))
  cor.m cor.y assoc mad qr rf
bmi 0.00 0.01 4.42 4.43 4.43 4.64
exer.light.min.per.year -0.01 -0.01 25938.93 25940.94 25948.50 26780.61
exer.vig.min.per.year 0.00 0.01 22186.08 22205.79 22238.68 22968.64
sleep.min.weekday 0.01 0.02 63.29 63.32 63.37 66.40
sleep.min.weekend -0.02 0.00 73.62 73.66 73.96 75.88
drinks.last.month 0.00 -0.01 17.92 17.92 18.12 18.83
age.first.sex -0.01 -0.01 1.91 1.91 1.91 1.97
net.family.income -0.01 -0.01 44124.58 44145.32 44206.65 44475.98
cc.debt.amnt -0.02 -0.01 5969.76 5972.24 5973.77 6299.73

The correlations are remarkably close to 0.

With logged predictors and dummy variables for 0 responses:

rd(d = 2, cached(nlsy.predtable.cont(iv.log,
    y ~ month0 + log.month + year0 + log.year +
        month0:year0 + month0:log.year + year0:log.month + log.month:log.year)))
  cor.m cor.y assoc mad qr rf
bmi 0.04 0.03 4.41 4.43 4.41 4.65
exer.light.min.per.year -0.01 -0.03 25921.10 25940.94 25952.80 26637.90
exer.vig.min.per.year -0.01 -0.02 22173.85 22205.79 22196.01 22958.10
sleep.min.weekday 0.01 0.02 63.18 63.32 63.47 66.99
sleep.min.weekend 0.04 0.04 73.18 73.66 73.36 76.47
drinks.last.month -0.04 -0.03 17.91 17.92 17.94 18.85
age.first.sex -0.05 -0.03 1.91 1.91 1.92 1.98
net.family.income -0.07 -0.03 43151.85 44145.32 43242.16 44292.69
cc.debt.amnt -0.04 -0.01 5961.44 5972.24 5984.10 6257.58
rd(d = 2, nlsy.predtable.cont.nominaliv(iv.lin))
  assoc mad pred
bmi 4.26 4.43 4.56
exer.light.min.per.year 25427.31 25940.94 26801.83
exer.vig.min.per.year 21730.30 22205.79 23838.04
sleep.min.weekday 60.37 63.32 66.47
sleep.min.weekend 69.09 73.66 76.68
drinks.last.month 17.34 17.92 18.87
age.first.sex 1.83 1.91 1.99
net.family.income 40976.40 44145.32 44212.97
cc.debt.amnt 5715.25 5972.24 6254.13

The fully discretized model again generally overfits: its in-sample error falls below the MAD of a constant median prediction, but its cross-validated error exceeds it.

p-values

Dichotomous criterion variables:

ps = t(sapply(nlsy.dvs[nlsy.dv.classes == "logical"], function(l) c(
    month = wilcox.test(iv.lin$month ~ l$v)$p.value,
    year = wilcox.test(iv.lin$year ~ l$v)$p.value)))
c(
   "comparisons" = length(ps),
   "p < .05"     = sum(ps < .05),
   "p < .05/comp" = sum(ps < .05/length(ps)))
  value
comparisons 62
p < .05 45
p < .05/comp 30

Continuous criterion variables:

ps = t(sapply(nlsy.dvs[nlsy.dv.classes == "numeric"], function(l) c(
    month = cor.test(iv.lin$month, l$v, method = "kendall")$p.value,
    year = cor.test(iv.lin$year, l$v, method = "kendall")$p.value)))
c(
   "comparisons" = length(ps),
   "p < .05"     = sum(ps < .05),
   "p < .05/comp" = sum(ps < .05/length(ps)))
  value
comparisons 18
p < .05 7
p < .05/comp 7
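
As a cross-check on the per-comparison Bonferroni counts above, R's built-in p.adjust could summarize the same p-values under other corrections; a minimal sketch, applied to whichever ps matrix was computed last, and with the choice of Holm and Benjamini–Hochberg being mine rather than part of the analysis:

# Sketch: counts of p-values in `ps` surviving a Holm familywise
# correction and a Benjamini-Hochberg FDR correction.
sapply(c(familywise = "holm", fdr = "BH"), function(method)
    sum(p.adjust(ps, method = method) < .05))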

References

Beck, R. C., & Triplett, M. F. (2009). Test–retest reliability of a group-administered paper–pencil measure of delay discounting. Experimental and Clinical Psychopharmacology, 17, 345–355. doi:10.1037/a0017078

Chabris, C. F., Laibson, D., Morris, C. L., Schuldt, J. P., & Taubinsky, D. (2008). Individual laboratory-measured discount rates predict field behavior. Journal of Risk and Uncertainty, 37(2, 3), 237–269. doi:10.1007/s11166-008-9053-x

de Gruijter, D. N. M., & van der Kamp, L. J. T. (2008). Statistical test theory for the behavioral sciences. Boca Raton, FL: CRC. ISBN 978-1-58488-958-8.

Harinck, F., Van Dijk, E., Van Beest, I., & Mersmann, P. (2007). When gains loom larger than losses: Reversed loss aversion for small amounts of money. Psychological Science, 18(12), 1099–1105. doi:10.1111/j.1467-9280.2007.02031.x

Horstein, M. (1963). Sequential transmission using noiseless feedback. IEEE Transactions on Information Theory, 9(3), 136–143. doi:10.1109/TIT.1963.1057832

Kirby, K. N. (2009). One-year temporal stability of delay-discount rates. Psychonomic Bulletin and Review, 16(3), 457–462. doi:10.3758/PBR.16.3.457

Kirby, K. N., Petry, N. M., & Bickel, W. K. (1999). Heroin addicts have higher discount rates for delayed rewards than non-drug-using controls. Journal of Experimental Psychology: General, 128(1), 78–87. doi:10.1037/0096-3445.128.1.78

Madden, G. J., Petry, N. M., Badger, G. J., & Bickel, W. K. (1997). Impulsive and self-control choices in opioid-dependent patients and non-drug-using control participants: Drug and monetary rewards. Experimental and Clinical Psychopharmacology, 5(3), 256–262. doi:10.1037/1064-1297.5.3.256

Sayman, S., & Öncüler, A. (2009). An investigation of time inconsistency. Management Science, 55(3), 470–482. doi:10.1287/mnsc.1080.0942

Schuerger, J. M., Tait, E., & Tavernelli, M. (1982). Temporal stability of personality by questionnaire. Journal of Personality and Social Psychology, 43(1), 176–182. doi:10.1037/0022-3514.43.1.176

Smith, C. L., & Hantula, D. A. (2008). Methodological considerations in the study of delay discounting in intertemporal choice: A comparison of tasks and modes. Behavior Research Methods, 40(4), 940–953. doi:10.3758/BRM.40.4.940

Sutter, M., Kocher, M. G., Rützler, D., & Trautmann, S. T. (2010). Impatience and uncertainty: Experimental decisions predict adolescents' field behavior (Discussion Paper No. 5404). Institute for the Study of Labor. Retrieved from http://ftp.iza.org/dp5404.pdf

van der Linden, W. J., & Glas, C. A. W. (Eds.). (2010). Elements of adaptive testing. New York, NY: Springer. ISBN 978-0-387-85459-5.

Waeber, R. (2013). Probabilistic bisection search for stochastic root-finding (PhD thesis). Cornell University. Retrieved from http://people.orie.cornell.edu/shane/theses/ThesisRolfWaeber.pdf

Weber, B. J., & Chapman, G. B. (2005). Playing for peanuts: Why is risk seeking more common for low-stakes gambles? Organizational Behavior and Human Decision Processes, 97(1), 31–46. doi:10.1016/j.obhdp.2005.03.001

Wulfert, E., Block, J. A., Ana, E. S., Rodriguez, M. L., & Colsman, M. (2002). Delay of gratification: Impulsive choices and problem behaviors in early and late adolescence. Journal of Personality, 70(4), 533–552. doi:10.1111/1467-6494.05013