About the projects page

Created 11 Aug 2012 • Last modified 22 Jan 2023

Being a fan of open science, reproducible research, free software, and transparency in government spending, I make a point of freely releasing all the results of my scientific work… to the extent I'm allowed by coauthors and the involved organizations. I include raw data, task code, analysis code, slideshows, manuscripts, and my own notes. Since organizing all this material isn't always easy (or worth doing as well as it could be done; I'm not getting any publications out of this, you know!), there can be long delays between when I make something and when I post it here, and a lot of rough edges in what I actually release. But in many cases, I've arranged for stuff to be uploaded immediately after I produce it, in the spirit of open-notebook science. And I'd be happy to help you find, use, or interpret materials, regardless of whether I'm the official corresponding author of a paper.

On my research-projects page, I've grouped materials by project. A project, as I use the term, is a nebulous entity that may comprise any number of distinct studies; conversely, research themes and actual publications can span multiple projects. In practice, what binds a project together is the analysis code and, to a lesser degree, the task code. I make no excuses for my colorful project codenames. Some of the less colorful ones were chosen by coauthors.

In addition to "source packages", which contain data, code, and documents, projects may have HTML notebooks and papers (usually written with Daylight), PDFs, and GitHub repositories. Source material for the documents can generally be found in the corresponding source package, but code hosted on GitHub is not duplicated in source packages. Some of the analysis code may be found as comments or code blocks in the source documents of manuscripts. Notebooks are updated most frequently, but they're inconsistently intelligible and I don't usually revise them: they are, after all, notebooks.

Mathematical notation in HTML pages is represented with MathML. Browser support for MathML can be spotty. I can at least verify that Firefox implements it well enough for my purposes, so if my equations look funny to you, try Firefox.

In general, I've licensed data under the ODbL, code under the GNU GPL, and everything else under CC-BY-SA. The idea is to copyleft everything as thoroughly as possible.

About the data

All data has been deidentified.

You'll notice that data is often missing for projects for which I didn't collect the data, or for which the data is restricted by law (e.g., HIPAA), organizational rules (e.g., IRB decisions), or ethical concerns. In these cases, there's generally no way for me to get permission to make the data freely accessible to all. But if you're interested in the data, get in touch with me. If there are no ethical issues, or you can convince me that you'll handle such issues appropriately, I'll do my best to help you convince the powers that be to cough up the data for your use.

If you're puzzling over how some data seems not to match the version of the task code that produced it, note that I sometimes, for the sake of convenience, update the structure of data to match later versions of the task code. Hopefully, I don't introduce errors in the process.

(Yes, Virginia, "data" is a mass noun.)