International ReCal Users Take Note:
I recently fixed a bug that had been preventing certain users with non-Latin character sets from executing ReCal. So if your computer uses a non-European language and ReCal has failed you over the past month or so, it might work for you now. As always, please don’t hesitate to let me know about any technical problems you may encounter.
ReCal: The Fine Print
As of Tuesday, November 11, 2008, I have begun collecting all data files uploaded through ReCal in order to improve the application. (Prior to today, all data was discarded as soon as ReCal ran its calculations.) By using ReCal you agree to license your data files to me for this purpose and for no other. Since your data files consist entirely of numbers meaningful only to you, they are completely anonymous; this is one of the reasons ReCal does not require text data headers. Google, Yahoo, Microsoft, and other providers of online services use your personal data in similar ways, but I don’t sell your files (not that there’s a market for CSV files full of unlabeled numbers anyway). If you have any questions about how I use ReCal user data, please leave them in comments below.
Protected: AU Charts set 3 (updated 11/13)
Protected: More AU charts (new chart at bottom)
Protected: MG/AU Charts, set 1 (now with adjustments!)
ReCal error log entry #1: invariant values
The purpose of this post is to explain the following ReCal error:
*Scott’s pi/Cohen’s kappa/Fleiss’ kappa could not be calculated for this variable due to invariant values.
You should only see this error when two conditions apply simultaneously to your data: 1) all of your coders have attained 100% agreement and 2) they have all selected the same variable value for every case. (If you see it under any other circumstances, please let me know, as it means the code is flawed and needs to be fixed.) For example, assume a five-case reliability sample of a binary variable with possible values 1 and 0. If both coders decide that all five cases should be rated 0 or that all five cases should be rated 1, the “invariant values” scenario, or IVS (I’m sure someone’s come up with a better name for it), occurs. Scott’s pi, Cohen’s kappa, and Fleiss’ kappa are all undefined when this happens (Fleiss’ kappa is slightly more robust in that the more coders in the reliability pool, the less likely they all are to choose the same value for every case).
The reason for this is that when the two IVS conditions obtain, the mathematical definition of expected agreement for these coefficients is 1. Let’s take a look at the example specified in the previous paragraph:
| Coder1 | Coder2 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
As you can see, the IVS is in effect because all values for this variable are equal to 1. Percent agreement in this case is obviously 100%; observed agreement is 1. The number of 1s for coders 1 and 2 is 5 for both, for a total of 10 decisions. The first, and only, joint marginal proportion for Scott’s pi is equal to (5 + 5) / 10 = 1. Expected agreement then becomes 1² = 1. The Scott’s pi equation would thus be:
(observed - expected) / (1 - expected) = (1 - 1) / (1 - 1)
But this leads to division by zero, which basic arithmetic tells us is undefined. Thus, Scott’s pi (like Cohen’s kappa, which behaves similarly) is undefined under the IVS. Fleiss’ kappa is similarly nonexistent when all coders assign the same value to all cases.
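For readers who want to check the arithmetic themselves, here is a minimal Python sketch of the two-coder case. It is not ReCal's code: the function names scotts_pi and cohens_kappa are made up for this illustration, and returning None for the undefined case is an assumption about how one might report it.

```python
from collections import Counter


def scotts_pi(coder1, coder2):
    """Scott's pi for two coders, or None when it is undefined."""
    n = len(coder1)
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    # Pool both coders' decisions to get the joint marginal proportions.
    pooled = Counter(coder1) + Counter(coder2)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if len(pooled) == 1:
        # Only one value was ever used, so expected agreement is 1
        # and the denominator (1 - expected) would be zero.
        return None
    return (observed - expected) / (1 - expected)


def cohens_kappa(coder1, coder2):
    """Cohen's kappa for two coders, or None when it is undefined."""
    n = len(coder1)
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    c1, c2 = Counter(coder1), Counter(coder2)
    values = set(c1) | set(c2)
    # Expected agreement multiplies each coder's own marginal proportions.
    expected = sum((c1[v] / n) * (c2[v] / n) for v in values)
    if len(values) == 1:
        # Same invariant-values situation: expected agreement is 1.
        return None
    return (observed - expected) / (1 - expected)


invariant = [1, 1, 1, 1, 1]
print(scotts_pi(invariant, invariant))     # None
print(cohens_kappa(invariant, invariant))  # None
```

As soon as the coders use more than one value, both functions return an ordinary number again, which is why the error only appears for invariant data.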
Krippendorff’s alpha, on the other hand, is immune to this problem. Recall its basic form:
a = 1 - Do/De
When observed disagreement (Do) is 0, Do/De simplifies to 0, and a equals 1. This is one instance in which Krippendorff’s alpha improves upon its predecessors.
In Copenhagen
Whew . . . just made it through my AOIR presentation relatively unscathed. I think the crowd was a bit cool to my subject area: I only got one comment at the end, a compliment, and from the president of AOIR no less. Still, I was hoping folks would grill me a bit more, but then I suppose there’s always a chance someone might have something to say after all the session presentations are done. The panel is good so far, but not thematically unified; maybe the audience wasn’t familiar with the general research area. I really should try and get on a proposed panel for my next conference; they seem to draw folks with some level of topic expertise.
Google charts r teh awesoem
I’m probably the last quantitative person on the net to discover the magic of Google Charts. The API is completely URL-driven, so all you have to do is fill in the blanks to get presentation-quality charts like this (from an ongoing project and completely devoid of context):
The URL for the chart above is here (line breaks inserted for readability):
http://chart.apis.google.com/chart?cht=p&chd=t:28,23,18,8,7,7,7,2
&chs=450x200&chl=DC%20Action%20(28%)|DC%20Knowledge%20(23%)
|DC%20Join%20(18%)|AC%20Expr%20(8%)|AC%20Knowledge%20(7%)
|DC%20Expr%20(7%)|AC%20Join%20(7%)|AC%20Take%20Action%20%20(2%)
&chco=ff0000,00ff00,0000ff
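If you would rather not fill in the blanks by hand, a URL like the one above can be assembled from data. The following is only a sketch: the helper name pie_chart_url is hypothetical, the parameter names (cht, chd, chs, chl, chco) are simply the ones that appear in the URL above, and the endpoint is copied from this post, so treat it as an illustration of the URL format rather than something guaranteed to render today. Unlike the hand-written URL, this version percent-encodes the literal % signs in the labels as %25.

```python
from urllib.parse import quote, urlencode

# Endpoint copied from the URL in this post.
CHART_ENDPOINT = "http://chart.apis.google.com/chart"


def pie_chart_url(slices, size="450x200",
                  colors=("ff0000", "00ff00", "0000ff")):
    """Build a pie-chart URL from (label, percentage) pairs."""
    values = ",".join(str(value) for _, value in slices)
    labels = "|".join(f"{label} ({value}%)" for label, value in slices)
    params = {
        "cht": "p",             # chart type: pie
        "chd": f"t:{values}",   # data series in text format
        "chs": size,            # width x height in pixels
        "chl": labels,          # pipe-separated slice labels
        "chco": ",".join(colors),
    }
    # quote (rather than the default quote_plus) turns spaces into %20;
    # the characters in `safe` are left readable, as in the URL above.
    query = urlencode(params, quote_via=quote, safe=":,|()")
    return f"{CHART_ENDPOINT}?{query}"


print(pie_chart_url([("DC Action", 28), ("DC Knowledge", 23)]))
```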
The developer’s guide is ridiculously easy to follow. For me at least, it really takes a lot of the headache out of whipping up data visualizations. This is definitely my second-favorite new toy of 2008.
How to Talk About Race
My scholarly work rarely touches on questions of race, ethnicity, or identity, but as an African-American male, my interest in these topics is quite personal. When Obama identified a need for more good-faith dialogue on race in this country in his acclaimed “A More Perfect Union” speech last March, I fully endorsed the sentiment but found myself at a loss to conceptualize the nature of such an exchange in light of the off-putting toxicity of much American racial discourse. Recently, however, I found a high-profile online forum that answers Obama’s call about as well as I could have imagined: Ta-Nehisi Coates’ blog on the Atlantic Monthly’s web site.
Coates takes great pains to curate a respectful, honest comment section—a task most of his Atlantic colleagues have abandoned—and his efforts pay off in deliberative spades. The blog attracts a diverse range of readers if the commenters are any indication, and the discussion for the most part achieves a refreshing frankness while avoiding awkward euphemism. As a social scientist, I find myself instinctually drawn to the scholarly frame of cause and effect: what makes Coates’ blog different from other discussion sites in which racial topics are not debated with nearly as much charity? I have my suspicions, but would prefer to answer the question with data. Perhaps a comparative study of a sample of race-blogs categorized along theoretically relevant lines? Might be a good excuse to acquaint myself with a few new literatures . . .
Welcome to dfreelon.org
One day I’ll think of something spiffy to place in this space—perhaps a real blog, or a spectacularly pithy statement of purpose. For now, use the white links above to navigate through the site. If you’re looking for ReCal, you’ve found it. There’s a smattering of other content in the other sections, which I intend to flesh out fully throughout the next few months. So stay tuned . . .
