ReCal FAQ and Troubleshooting Page

Have questions about ReCal, the online intercoder reliability calculator? Hopefully this page contains the answer you’re looking for. If not, feel free to submit new questions in the comments.

ReCal isn’t working with my data. It keeps giving me the following error code:

What applications is ReCal compatible with?

How can I be sure ReCal’s results are accurate?

ReCal reports high percentage agreements but low Scott’s pi/ Cohen’s kappa/ Krippendorff’s alpha for my data. What is going on?

I would prefer to use an Excel spreadsheet to calculate intercoder reliability. Why should I use ReCal?

I need help understanding/interpreting/improving my results; what resources are available to me?

How do I create CSV files?

What are CSV files and why does ReCal require them?

Why doesn’t ReCal work with Excel/SPSS/Word/[insert other proprietary software package here] files?

My data cannot easily be reformatted to conform to ReCal’s specifications. What alternatives are available?

What happens to my data when I submit it to ReCal?

Is there a version of ReCal forthcoming that will perform similar analyses on ordinal and interval data and that will accept missing data?

What are the functional differences between ReCal2 and ReCal3?

Who are you?


 

Error 1—You should never see this error, at all. Ever. If you do, please let me know ASAP because it indicates a dire system error.

Error 2—This error occurs in two cases: first, when your file runs above 100,000 bytes in size; and second, when your file is 0 bytes in size. It helps prevent incorrect and corrupt file types from being processed (CSV files are rarely that large and never that small). Double-check your file and make sure it is indeed a non-corrupt CSV file.
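If you want to rule out Error 2 before uploading, the size limits described above can be checked locally. What follows is only a sketch in Python, not ReCal's own code. The function names are hypothetical, and treating a file of exactly 100,000 bytes as acceptable is an assumption based on the phrase "above 100,000 bytes".

```python
import os

# Limit described for Error 2; files above this size are rejected.
MAX_BYTES = 100_000

def size_is_acceptable(n_bytes):
    """Return True if a file of n_bytes is neither empty nor over the limit."""
    return 0 < n_bytes <= MAX_BYTES

def file_size_is_acceptable(path):
    """Apply the same check to a file on disk (hypothetical helper)."""
    return size_is_acceptable(os.path.getsize(path))
```

Passing this check says nothing about whether the file is really in CSV format; it only mirrors the two size conditions.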

Error 3—This error occurs when your data file contains characters other than numeric digits (with the exception of alphabetic letters on the first row). ReCal’s requirements in this regard are generally quite strict—the digit “1” by itself would pass muster, whereas “1.00” would not due to the decimal point. Similarly, negative numbers won’t work in ReCal due to the minus sign—you’ll need to convert them to positive numbers. The only exception is that header text for each column may be included on the first row (a la SPSS); in this case, the entire first row will be ignored and calculation will begin on the second row. Make sure to scour your entire file for any characters other than numeric digits except on the first row.
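Scouring a large file by eye is tedious, so a short script can list the offending cells instead. This is a minimal sketch, not ReCal's actual validation code. It assumes a comma-delimited file, and the function name and the `has_header` parameter are hypothetical.

```python
import csv
import io
import re

# A cell passes only if it consists entirely of the digits 0-9.
DIGITS_ONLY = re.compile(r"[0-9]+")

def non_digit_cells(text, has_header=False):
    """Return (line, column, value) for every cell that is not purely digits."""
    problems = []
    reader = csv.reader(io.StringIO(text))
    for row in reader:
        if has_header and reader.line_num == 1:
            continue  # header text on the first row is allowed
        for col_no, cell in enumerate(row, start=1):
            if not DIGITS_ONLY.fullmatch(cell):
                problems.append((reader.line_num, col_no, cell))
    return problems
```

With this sketch, a cell such as "1.00" or "-3" is reported along with its line and column. Empty cells are reported too, since they contain no digits at all.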

Some users might see this error even when they are absolutely certain that their file contains only numbers. The problem in these cases may be that the “CSV” file is delimited by a character other than commas or semicolons. (See What are CSV files and why does ReCal require them? if you don’t know what this means.) To determine whether this is the problem, open your file in a basic text editor (not MS Word) such as WordPad on Windows or TextEdit on a Mac. If you see a series of numbers separated by anything other than commas or semicolons, you will need to run a Find/Replace command to convert whatever the separating character is into commas.
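The Find/Replace step can also be scripted. The sketch below uses Python's standard `csv` module to re-write text that is delimited by some other character (a tab is assumed here) as comma-delimited text. The function name is hypothetical, and you would need to pass whatever separator your own file actually uses.

```python
import csv
import io

def to_comma_delimited(text, old_delimiter="\t"):
    """Re-write delimited text so that its fields are separated by commas."""
    rows = csv.reader(io.StringIO(text), delimiter=old_delimiter)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

# Example: two tab-separated rows become two comma-separated rows.
print(to_comma_delimited("1\t2\n3\t4\n"))
```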

Error 4—This error is caused by missing data in your file, commonly seen in spreadsheet software as blank cells surrounded by data. Missing data violates the assumptions of all the coefficients ReCal computes; therefore it is not accepted. Fill in the missing data on the line indicated or delete the line entirely and try again.
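Blank cells can be located before uploading with a few lines of Python. This is an illustrative sketch only, assuming a comma-delimited file; the function name is hypothetical and the code is not taken from ReCal.

```python
import csv
import io

def lines_with_blank_cells(text):
    """Return the line numbers of rows that contain at least one empty cell."""
    reader = csv.reader(io.StringIO(text))
    return [
        reader.line_num
        for row in reader
        if any(cell.strip() == "" for cell in row)
    ]
```

Each line number returned points to a row you would either complete or delete, as described above.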

Error 5—This error occurs for ReCal2 only and indicates an odd number of columns, whereas ReCal2 requires an even number of columns. Recall that ReCal2 assumes that each pair of columns constitutes two coders’ judgments on a single variable. If the number of columns in the data file is odd, the final column has no corresponding column with which reliability can be calculated. Double-check the number of columns in your file and try again.

Error 7—ReCal requires that each file submitted to it feature a “.csv” extension at the end of the filename. It is critical to understand that a file cannot be converted to CSV format simply by changing its extension! See How do I create CSV files? for more details on this point.

Error 8—This error occurs when the rows in your file do not all contain the same number of codes. For example, if rows 1-10 of a hypothetical 20-row file contain three columns of data, and rows 11-20 contain four, the file will trigger error 8, because ReCal would return incorrect results if it attempted to analyze it. There are two ways to solve this problem. The first is to delete the 4th column from rows 11-20, leaving 20 rows with 3 columns each; this solution only works for ReCal3, since ReCal2 requires an even number of columns. The second is to add the missing data to rows 1-10, creating 20 rows of 4 columns each, which could be analyzed in ReCal2 or ReCal3 depending on the nature of the data.
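To find out which rows disagree on the number of columns, the rows can be grouped by their column count. The following Python sketch assumes a comma-delimited file; the function name is hypothetical and the code is not ReCal's own.

```python
import csv
import io
from collections import defaultdict

def lines_by_column_count(text):
    """Map each column count to the line numbers of the rows that have it."""
    groups = defaultdict(list)
    reader = csv.reader(io.StringIO(text))
    for row in reader:
        if row:  # ignore completely blank lines
            groups[len(row)].append(reader.line_num)
    return dict(groups)
```

A consistent file produces a single key. For the hypothetical 20-row file above, the result would have two keys, 3 and 4, each listing ten line numbers.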

What applications is ReCal compatible with? ReCal can read data from any software application that has the ability to “Save As” or “Export” files in CSV format. (More on CSV and why ReCal uses it here.) This includes SPSS, Stata, S-PLUS, SAS, Excel, Google Docs, MS Access, OpenOffice/NeoOffice, Minitab (as of version 15), and more.

How can I be sure ReCal’s results are accurate? Unfortunately, there is no way I can show a priori that ReCal will furnish accurate results for all possible datasets. 100% certainty is only possible in certain branches of pure mathematics; in the real world all sorts of things can go wrong. With ReCal, for example, people will on rare occasions format their data incorrectly yet still receive results, which will of course be incorrect. Barring this scenario, I would certainly encourage you to test ReCal against other reliability calculators, especially if its results appear flawed. A list of alternative reliability calculators can be found here.

I would prefer to use an Excel spreadsheet to calculate intercoder reliability. Why should I use ReCal? Here are several reasons why ReCal is superior to spreadsheets for reliability calculation:

  1. From a programming standpoint, Excel’s basic function language is not very sophisticated in its handling of arrays, which are essential for calculating reliability. This means that VBA (Visual Basic for Applications, a Microsoft proprietary programming language that works inside Excel spreadsheets) would need to be used. Unfortunately, MS has removed VBA from the current Mac version of Office, meaning that an ICRC macro written in VBA would be useless for non-Windows users.
  2. Even if VBA hadn’t been removed from Office for Mac, a VBA macro would still restrict ReCal usage to Excel users. ReCal doesn’t require anything other than a web browser and an internet connection.
  3. PHP generally runs faster than VBA, which you’ll notice if you compare ReCal to PRAM (although technically PRAM is written in VB rather than VBA, the languages are very closely related as their names indicate).
  4. I don’t know of any publicly available spreadsheets that calculate Scott’s Pi, Cohen’s Kappa, or Krippendorff’s Alpha. (In fact, this was the main reason I created ReCal in the first place.) If you know of any, please let me know and I’ll link to them.

I need help understanding/interpreting/improving my results; what resources are available to me? Probably the best intercoder reliability resource on the web is Matthew Lombard’s site, which presents the basics of how to calculate, use, and interpret reliability statistics. Beyond that, you may be interested in the extended discussions found in Content Analysis: An Introduction to Its Methodology by Klaus Krippendorff and/or The Content Analysis Guidebook by Kim Neuendorf. Finally, if you have a question of general interest that isn’t already answered on either this site or Lombard’s, you can ask me and I’ll answer it publicly if I can.

How do I create CSV files? The specific instructions on how to do this differ depending on which application you are using, but in Excel and SPSS I believe you use either the “Export” or “Save As” command and select “CSV” or “Comma-Separated Values” as your file format. It is important to remember that merely changing a file’s extension manually to “.csv” does not convert the file format itself to CSV; you must use your application’s Export or Save As function.

What are CSV files and why does ReCal require them? CSV stands for “Comma-Separated Values” and is a non-proprietary method of representing tabular (spreadsheet) data that can be read and exported by a wide range of applications (Wikipedia entry here). ReCal requires CSV files because doing so maximizes compatibility across software applications and operating systems.

Why doesn’t ReCal work with Excel/SPSS/Word/[insert other proprietary software package here] files? See the answer to the question above.

My data cannot easily be reformatted to conform to ReCal’s specifications. What alternatives are available? Click here to view a list of alternative reliability calculators.

What happens to my data when I submit it to ReCal? Your data file is uploaded to a private folder on my web hosting account for troubleshooting purposes. In lieu of actual user feedback, reviewing user data directly is the only way I can identify and fix bugs. See the fine print for more info.

Is there a version of ReCal forthcoming that will perform similar analyses on ordinal and interval data and that will accept missing data? ReCal can now accept ordinal, interval, and ratio data via ReCal OIR. However, it is still unable to accept datasets with missing data. If your data is incomplete, one strategy is to perform casewise deletion, that is, to delete all cases that were not evaluated by all coders. Of course, you would have to do this manually before submitting your file to ReCal. Casewise deletion is probably best used when the number of incomplete cases is small, but then, content analysis data sets with large amounts of missing reliability data are problematic from a broader validity standpoint.
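Casewise deletion as described above can be sketched in a few lines of Python. This is an illustration only: it assumes each case is a row of string codes in which a missing judgment appears as an empty cell, and the function name is hypothetical.

```python
def casewise_delete(rows):
    """Keep only the cases (rows) in which every coder supplied a code."""
    return [row for row in rows if all(cell.strip() for cell in row)]

cases = [
    ["1", "2", "1"],
    ["2", "", "2"],   # the second coder's judgment is missing
    ["3", "3", "3"],
]
complete_cases = casewise_delete(cases)  # the middle case is dropped
```

Comparing `len(complete_cases)` with `len(cases)` shows how many cases were lost, which helps in judging whether the number of incomplete cases is small enough for this strategy.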

Generally, if you’re interested in certain new features, please let me know via comment or email—knowing that users actively want a particular function makes me more likely to develop it.

What are the functional differences between ReCal2 and ReCal3? There are two main differences between the two:

  • ReCal for 2 Coders (ReCal2) can calculate reliabilities for multiple variables at once, whereas ReCal for 3+ Coders (ReCal3) can only calculate reliability for one variable at a time. If you have several variables all coded by two coders, the former edition might save you some time.
  • Although the two utilities share a formally identical data format, they make very different assumptions about what that data represents. ReCal2 assumes that data columns come in pairs, i.e. that columns 1 and 2 represent two coders’ codes for a single variable, columns 3 and 4 represent two coders’ codes for a different variable, etc. By contrast, ReCal3 assumes that each column in the input file represents a different coder’s work on a single variable. Therefore, the same 6-column CSV file would represent 3 different variables coded by 2 coders each to ReCal2, while ReCal3 would interpret it as one variable coded by 6 different coders. For this reason, the only files for which both ReCal2 and ReCal3 will give accurate results are those containing only 2 columns/coders. Submitting data intended for one edition to the other will generate incorrect results!
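The two readings of the same file can be made concrete with a short Python sketch. It only illustrates the column layouts described above and is not code from either utility; the function names `as_recal2` and `as_recal3` are hypothetical.

```python
def as_recal2(rows):
    """Read columns in pairs: one list of (coder 1, coder 2) codes per variable."""
    n_cols = len(rows[0])
    if n_cols % 2 != 0:
        raise ValueError("the paired layout needs an even number of columns")
    return [
        [(row[i], row[i + 1]) for row in rows]
        for i in range(0, n_cols, 2)
    ]

def as_recal3(rows):
    """Read every column as a different coder's codes for one variable."""
    return [list(column) for column in zip(*rows)]

rows = [
    ["1", "2", "3", "4", "5", "6"],
    ["1", "1", "2", "2", "3", "3"],
]
print(len(as_recal2(rows)))  # number of variables under the paired reading
print(len(as_recal3(rows)))  # number of coders under the single-variable reading
```

For this 6-column example, the paired reading yields 3 variables with 2 coders each, while the single-variable reading yields 6 coders.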

12 comments

  1. I am interested in the difference between analyzing the data for simple “matches” versus “no-matches” and the weighted kappa, which considers how far away the misses are.

    SPSS does not calculate weighted kappa, do you?

    Thanks

    Andrew

  2. Andrew,

    Thanks for letting me know; I’ll take your suggestion under advisement next time I have occasion to add features to ReCal. If anyone else is interested in weighted kappa or other features, please leave a comment so that I can gauge demand.

  3. This is an excellent resource – free and easy to use!

    The only question I have is whether your calculations include the bootstrapping algorithm described by Hayes & Krippendorff (2007). Could you clarify this in your FAQs? This would help me decide whether to actually put in the extra work to run the macro in SPSS.

    Thanks so much!

    1. naush,

      ReCal does not implement Hayes and Krippendorff’s bootstrapping algorithm. If you need to compute confidence intervals for your KA coefficients, you will need to use their SPSS macro.

  4. Hello,

    After computing with ReCal, I found a negative Krippendorff’s alpha… but I think that’s impossible!
    Is it a computing mistake or a problem with my data?
    Thanks

      1. I didn’t see any mention of how Krippendorff’s alpha can be negative or why, but we are getting very similar results and are confused by this. Can you address why and when such results will occur? Are the guidelines about whether we can use the results when Krippendorff’s alpha is below 0.667 still relevant when the value is negative? (I would think so, because in samples that produce the negative result, there is very little agreement.) Thank you! ~C

  5. I was wondering if there is any way of using your program with a file size over 100,000 bytes? I am currently working on a very large data set that is roughly 150,000 bytes.

    Thank you

  6. I have used this site with great success before, but now when I upload my document (which is in CSV format and follows all of the criteria) I am taken to a page that says:

    Not Found
    Apologies, but no results were found for the requested archive. Perhaps searching will help find a related post.

    I am not sure how to proceed because I am not getting an Error message.

    Please help!

  7. Hello Professor Freelon,

    I'm kind of desperate because I really don't know how to turn my codes into a file ReCal is able to work with. I'm not sure if it's possible at all.
    I'm doing a content analysis to examine the awareness of risky elements in sexual scripts.
    My problem is: I have a lot of categories (150) and they are not mutually exclusive, because the material consists of answers to open questions, so often a lot of categories are mentioned.
    So I guess it's a scaling problem?
    I doubt it is possible to turn my data into a working format for ReCal (and this is really sad because it seems to be great), but maybe you can give me some advice on which tool I could use instead.

    Thank you!

    Lena
