ReCal (“Reliability Calculator”) is an online utility that computes intercoder/interrater reliability coefficients for nominal, ordinal, interval, or ratio-level data. It is compatible with Excel, SPSS, STATA, OpenOffice, Google Docs, and any other database, spreadsheet, or statistical application that can export comma-separated (CSV), tab-separated (TSV), or semicolon-delimited data files.

ReCal consists of three independent modules, each specialized for a different type of data. The following table will help you select the module that best fits your data. (If you do not know whether your data are considered nominal, ordinal, interval, or ratio, please consult this Wikipedia article to find out more about these levels of measurement.)

| Level of measurement | N of coders | Use |
|---|---|---|
| Nominal | 2 coders only | ReCal2 (includes percent agreement, Scott’s pi, Cohen’s kappa, and nominal Krippendorff’s alpha) |
| Nominal | 3 or more coders | ReCal3 (includes pairwise percent agreement, Fleiss’ kappa, pairwise Cohen’s kappa, and nominal Krippendorff’s alpha) |
| Ordinal, interval, or ratio | Any N of coders | ReCal OIR (includes ordinal, interval, and ratio Krippendorff’s alpha) |

Please visit the ReCal FAQ/troubleshooting page if you have questions or are having difficulty getting ReCal to work with your data. If you still have questions, please contact me directly rather than leaving a comment.

**Want to support ReCal? The best way is with a citation to one or both of the following articles in your final manuscript**. PDFs of both are linked below.

ReCal’s source code was last updated on **01/15/2013**. To date, ReCal (2, 3, and OIR combined) has been successfully executed a total of times by persons other than the developer.^{1}

^{1}This counter was reset to zero sometime in late 2014 under unknown circumstances. On 2/18/15 I manually reset it to the combined cumulative Google Analytics hit count for ReCal2, ReCal3, and ReCal OIR.

Wow! This simple tool instantly makes content analysis a more desirable and easier method! Thank you!

This was very simple to use, and (I think) it worked beautifully. Thanks for building it and making it available.

Wonderful tool, and I’ll recommend it to others. ReCal is especially helpful for data in an Excel spreadsheet, because Excel has no easy way of calculating intercoder reliability.

I’m using this to create some examples for the research methods class I’m teaching. Thanks for providing such a helpful tool!

It’s quite useful. Thanks a lot!

This is so useful. I was looking everywhere for a decent app, and to have it web-based is just great!

Thanks for helping me to beat deadline on a big (for me, anyway) conference paper.

A very useful product, but I would strongly encourage you to give users a viable option for exporting the results. I have tried every way known to man and I just can’t get the data into a useful (reportable) format.

Thanks. This was extremely helpful. So far I have only used the tool with sample data but will return when the second coder has finished.

very good job!

thank you so much! be sure that i will cite the link in my Phd thesis.

keep on your useful work!

regards,

Chrysi Rapanta

This was so easy to use. Thank you! It saved me a lot of work!

This was absolutely amazing (and absolutely free); so quick and simple and the guidelines were excellent and easy to follow. Thank you SO much!

A really useful tool. Many thanks

Many thanks for providing this service. In fact, this is a great service for those who badly need it. Once again, thank you very much, and please keep trying to help others in the same way.

WISAL

ASS Professor, KUST

Absolutely brilliant!

Outstanding service. Easy to use program with clear and concise output. Many thanks.

Very handy! Thanks for making this available.

It certainly saves me lots of sleepless nights looking for the solution. Timely for my final touches on the thesis. Certainly an excellent invention. And thanks for the earlier reply to my email.

Great tool…this is first time I am using ReCal and I have only words of admiration for it!

So simple yet so great. I will definitely reference it in paper I am ready to publish.

Thanks a lot,

o

This is very easy and quick reliability test.

Thanks for your tool!!

Thank you so much for making this available to frantic students! Wonderfully helpful and easy to use!! Appreciatively, Michele

Thank you so much for a great tool. But, I hope you can help me clear up a discrepancy I’ve noticed in my results for variables that have the same number of agreements/disagreements. For example, variable 1 has 26 agreements and 1 disagreement. So does variable 3. So does variable 5. Yet, the results for variable 1 are: 96.3% agreement and Scott’s pi of 0.924. The results for variable 3 are: 96.3% agreement and Scott’s pi of 0.914. The results for variable 5 are: 96.3% agreement and Scott’s pi of 0.886. Can you please tell me why the Scott’s pi is different for each variable when all the raw data for them is the same (ie same number of agreements and disagreements)? This scenario has occurred on three separate occasions when I’ve submitted my .csv files for analysis.

Best wishes

Dianne
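On why identical agreement counts can produce different Scott’s pi values: pi corrects percent agreement for the agreement expected by chance, and that expectation depends on how often each category was used, not only on how many units matched. The following is a minimal Python sketch of the standard two-coder formula, not ReCal’s own code; the function name and both datasets are invented for illustration.

```python
from collections import Counter


def scotts_pi(coder_a, coder_b):
    """Scott's pi for two coders who each rated the same units (nominal codes)."""
    n = len(coder_a)
    # Observed agreement: share of units on which both coders chose the same code.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement: squared share of each category in the pooled codes.
    pooled = Counter(coder_a) + Counter(coder_b)
    p_e = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (p_o - p_e) / (1 - p_e)


# Hypothetical variable A: 27 units, 26 agreements, both codes used about equally.
balanced_a = [1] * 13 + [0] * 13 + [1]
balanced_b = [1] * 13 + [0] * 13 + [0]

# Hypothetical variable B: also 26 agreements out of 27, but code 1 dominates.
skewed_a = [1] * 24 + [0] * 2 + [1]
skewed_b = [1] * 24 + [0] * 2 + [0]

print(round(scotts_pi(balanced_a, balanced_b), 3))  # 0.926
print(round(scotts_pi(skewed_a, skewed_b), 3))      # 0.78
```

Both variables have 96.3% agreement, but the skewed one has a much higher chance-agreement term, so its pi is lower.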

EDIT: I have answered this question.

Ok, I am stumped. How can I have a percent agreement of .97 and a Scott’s pi of -.015? I have two coders coding either Yes (1) or No (0) for the presence of a variable. What am I doing wrong? I find when calculating by hand I get similar results (off by a decimal or so). When using ReCal or calculating Scott’s pi with more than two categories, I don’t get a negative Scott’s pi when the percent agreement is high.

Thanks so much for sharing your program and answering my question if you have the time.

Happy Holidays!

Sonya
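A percent agreement of .97 alongside a slightly negative Scott’s pi can arise when one of the two codes is almost never used. If nearly every unit is coded 0 by both coders, the agreement expected by chance is already about .97, and observed agreement has to exceed that for the coefficient to be positive. The Python sketch below uses invented data (100 units, two coders) chosen to give figures of this size; it illustrates the standard formula and is not ReCal’s code.

```python
from collections import Counter


def agreement_and_pi(coder_a, coder_b):
    """Return (percent agreement, expected agreement, Scott's pi) for two coders."""
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    pooled = Counter(coder_a) + Counter(coder_b)
    p_e = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return p_o, p_e, (p_o - p_e) / (1 - p_e)


# Hypothetical data: both coders say "No" (0) on 97 units and never agree on a "Yes" (1).
coder_a = [0] * 97 + [1, 1, 0]
coder_b = [0] * 97 + [0, 0, 1]

p_o, p_e, pi = agreement_and_pi(coder_a, coder_b)
print(round(p_o, 3))  # 0.97
print(round(p_e, 3))  # 0.97 (slightly above p_o before rounding)
print(round(pi, 3))   # -0.015
```

Here the coders agree on every absence but on none of the three presences, which is exactly the situation a chance-corrected coefficient is designed to flag.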

EDIT: I have answered this question.

Easy and convenient to use. It is faster than the SAS macro that I am using.

Great job!!!

Bob

Thanks for this great tool. Before I visited this website I used PRAM and the Krippendorff macro in SPSS. But this tool is indeed faster and very handy! My compliments to you! I shall recommend this website to other researchers. Eric

I will use this site for testing my content analysis results during my phd research! The topic of my research is quantitative content analysis of student’s reflective writings in Teacher Education. Eric

Very useful. Thank you for your effort in making content analysis an easier job. I will tell everyone about this tool. Keep up the good work!

Dear Mr. Freelon,

you helped us a lot. We have a huge amount of data from different studies in which we analysed children’s stories about their pain experiences concerning a bicycle accident, headaches, abdominal pain, and getting a vaccination from the doctor. These stories were all analysed by 2 independent coders, and we already knew that calculating only the percentage agreement is not enough. Then we read the paper by Prof. Lombard and learned that you developed a program that does all the calculations necessary to present the intercoder reliability as requested.

We can’t thank you enough for this great work. It helps us and at least about 20 students who are working in our project.

Thanks a lot. Others will write to you as soon as they use your program.

Best regards

Gaby Ostkirchen

Ivana Tolic

Dear Mr. Freelon,

thank you for your helpful tool. I am a member of the study group of Mrs. Dr. Ostkirchen. After calculating the reliabilities we had categories with bad results. That’s why we started an analysis of the mistakes. This helped us to improve our category system.

Best regards

Hildegard Lüdecke

Thank you, Deen! What a fantastic program — made my intercoder reliability calculation easy. Thanks as well for answering my questions via email. You’re the best!

Hi There, Thanks for making this tool available as it provides a quick and easy way to work out reliability.

Well done!

Paul, South Australia

Dear Mr. Freelon,

would it be possible for you to send us your opinion on our problem?

Since we calculate intercoder reliability for different sub-studies of our project with your programme, we easily get the reliability results, including the number of coder differences.

We learnt that our coding system improved over time, and we started to use your programme to help us sharpen our category system by adding examples and by reformulating the rules. Hildegard did a complete analysis of the mistakes (disagreements) found by ReCal2, and up to now 5 mistakes remain. In her thesis she wants to present the first calculation with about 60 disagreements, then a table with all disagreements commented, and then a new reliability analysis, in which of course nearly all categories show an agreement of 100%.

Can we do it like this? Or do you propose another way? We find it very necessary to sharpen our system through the process we explained above.

As an expert in methods, do you have any arguments against this procedure?

Thanking you in anticipation of your prompt reply, we remain with best wishes

Gaby and Hildegard

Thank you! This tool was immensely useful for content analysis research. I was about to resort to calculating Krippendorff’s alpha by hand. You saved me hours of time!

I’m a somewhat cynical person that truly believes, “If it sounds too good to be true, it probably is.” So I was waiting for the catch with this website tool. There wasn’t one. THANK YOU!!! What a great service.

Thank you so much. This was immensely helpful with my research. What a wonderful tool!

This tool is great. I’ll definitely be sharing this with colleagues. It’s an easy to use solution to calculating reliability. Thanks!

Thank you so much for developing this – it is super cool and I have found it incredibly useful over the past few days. To other users – it has a quick learning curve (just a few tries to get used to the data formatting requirements), but it is worth it. Reliability tests that used to take hours are literally done in about 10 minutes.

Thank you! This has helped my research so much and you can see the quality care that you have put into this on the website.

ReCal is simply amazing! Many of the other tools are not so user-friendly and some are not available. ReCal did it for me within seconds. It only requires careful formatting of data and you are there in minutes! I find ReCal very useful and I am going to pass this knowledge on to others. Thanks a lot!

Thank you for creating this program. It worked well and saved hours of time. The frustration with all the methods of checking intercoder reliability that take into account agreement based on chance is that they tend to be very conservative when events are rare. It would be nice to include Perreault and Leigh’s measure which tends to be more liberal. Of course, some critics say it is too liberal. I think it’s a good idea to include multiple measures of reliability, at least one that tends conservative and one that tends liberal.

Very useful. Used it when my SPSS license died and I needed an analysis right away.

Thank you, thank you, thank you! Just found ReCal and it made my life so much easier.

This saved us so much time and energy. Simply awesome! My professor and colleagues are all using this. Thank you for this great gift!

Vivi Xie

I can’t tell you how useful this website has been for my research!!!

I am doing my PhD and this software was just TERRIFIC!!!!!

No more headaches looking for calculators... much better than SPSS, which I am using and which only offers kappa…

Thank you so much for making this utility, I used recal to calculate the results for my thesis experiment and mentioned ReCal and the article.

Two words: Thank. you.

Welcome!

Thank you thank you thank you sooo much. I am absolutely hopeless with numbers and this makes sense even to me!!! :’) really thanks.

Great job guys! Thanks for the enormous help!

What a great tool…thank you

I found ReCal very very useful. Easy to use. Just the tool for quick and efficient work. Unfortunately, the tool does not raise the inter rater reliability itself ;-).

Thanks a million.

Fantastic site.

So glad to find this. We’re pretesting some questionnaires at our partner sites here in Cambodia. Your reliability tool was a great find, and saved me a lot of time! Thank you!

Thank goodness for ReCAL! A real tremendous help! Thanks again!

Fantastic site and great concept! Well done

Thank you very much!!! This is great.

Thanks for making this!

This was so fast! Really amazing and easy to use. The instructions are easy to follow. Thanks!

I couldn’t get the R (CRAN) package irr / LpSolve to load & install successfully, as it complains about a missing dependency (which I thought I loaded). Can’t spend my life on that, so this resource is jolly useful to me!

Thanks.

This is absolutely amazing. It saved me so much trouble, and I also get to triangulate my results. A big thank you

This has been a phenomenal help to my research project. Thank you for providing this great utility!

Has anyone had any issues from journal editors and/or reviewers when using this service to calculate Cohen’s kappa?

Fantastic!

I have used R successfully for statistical analysis in the past…but for whatever reason couldn’t get packages “irr” or “concord” to work. After hours of frustration, your site popped up in a google search and I’m forever grateful!!! If you think about expanding the options in the future, it would be great to see some other kappa options for those of us with bias or prevalence issues in our coder data

A note to mac users – my csv file wouldn’t upload correctly until I used parallels w/internet explorer…I’m not sure why but if you have issues that could solve them.

Cheers!

This tool is very useful for me. Thank you very much!

I spent about 6 hours mucking my way through other calculators/SPSS/Excel trying to get an IRR I could use. 20 minutes here, and I’ve got the scores I need! Wow, I can’t thank you enough!!!

Hello

I found your website for intercoder reliability calculation through your paper in the Internet Science journal. However, I have a problem with it. When I upload my data file it shows a high percent agreement, but the Cohen’s kappa coefficient becomes negative. I would appreciate it if you could help me.

I appreciate your urgent cooperation in advance.

Regards

Hamed

Thanks for this great tool, especially its easy handling. Was very helpful concerning my master thesis.

Do you have a preference for how you want your work cited?

For file names like AB_test.csv, ReCal3 does something to the filename in its report: it becomes _test.csv. Pretty inconvenient if the part before the underscore is identifying different versions.

While I can rename my files to avoid this, it would probably be good if ReCal3 would respect any filenames it is fed (if not too outlandish).

Ok, this has been fixed.

Many thanks for making this terrific program available.

Do you have plans for a version that calculates Krippendorff’s alpha with missing data? K’s method apparently allows for this. Gwet’s Agreestat program supposedly handles missing data, but when I downloaded the trial version of that, the security routines where I work thought it was unsafe to run and refused to allow it.

I do in fact have plans to add support for missing data to ReCal OIR (to which I will also add KA for nominal data). In fact I’ve already done most of the work, but I still need to test the algorithm to eliminate potential bugs. The bad news is I probably won’t be able to release the update until this summer–projects that count for tenure come first!

Thank you very much! I’ve just used ReCal which has successfully calculated inter-rater reliability for a coding scale I’m using for my research.

So great to have this resource publicly available – thank you

Thank you so much. Your site is very helpful and your efforts are much appreciated

I am looking to calculate intercoder reliability for a time variable reported in minutes to a single decimal place. Can ReCal deal with this or can it only use whole numbers? I am concerned with inflating the coefficient if I multiply the values by a factor of 10.
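On the worry about multiplying by 10: with the interval metric, both the observed and the expected disagreement in Krippendorff’s alpha are built from squared differences, so rescaling every value by the same factor multiplies both by the same amount and leaves their ratio unchanged. The Python sketch below checks this on invented timing data. It implements the textbook interval alpha for complete data only (no missing values) and is an illustration, not ReCal OIR’s code; `interval_alpha` and the sample values are assumptions.

```python
from itertools import permutations


def interval_alpha(units):
    """Krippendorff's alpha with the interval metric, for complete data only.

    `units` is a list of rows; each row holds every coder's value for one unit.
    """
    m = len(units[0])                           # coders per unit
    values = [v for row in units for v in row]  # all coded values, pooled
    n = len(values)
    # Observed disagreement: squared differences between coders within each unit.
    d_o = sum(
        (a - b) ** 2 for row in units for a, b in permutations(row, 2)
    ) / (n * (m - 1))
    # Expected disagreement: squared differences between all pairs of pooled values.
    d_e = sum((a - b) ** 2 for a, b in permutations(values, 2)) / (n * (n - 1))
    return 1 - d_o / d_e


# Hypothetical durations in minutes (one decimal place), two coders per unit.
minutes = [(12.5, 12.5), (3.0, 3.2), (7.4, 7.4), (20.1, 19.8), (5.5, 5.5), (9.9, 10.4)]
# The same data expressed as whole numbers of tenths of a minute.
tenths = [tuple(round(v * 10) for v in row) for row in minutes]

alpha_minutes = interval_alpha(minutes)
alpha_tenths = interval_alpha(tenths)
print(abs(alpha_minutes - alpha_tenths) < 1e-9)  # True
```

The check covers the interval metric as computed here; it says nothing about how a particular tool parses decimal input.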

Thank you!

A colleague directed me to this site for calculating Krippendorff’s Alpha. I am having trouble understanding Krippendorff’s explanation of how the # of pairs are calculated.

As a check, I’ve entered the data from two of Krippendorff’s examples (the 3×15 matrix in Wikipedia and the 4×12 matrix in Krippendorff’s 2011.1.25 paper referenced on this web page). In both cases I’m getting different results from the web page and “reference” documents. I am not sure if I’m doing something wrong or if there is a problem with the algorithm on this web page.

Wikipedia alpha = 0.811, ReCal3 = 0.235

Example C alpha = 0.743, ReCal3 = 0.577

Since this site saves examples, the uploaded data files are Alpha_Wikipedia.csv and Alpha_XamplC.csv.

I would greatly appreciate guidance/ suggestions regarding why the discrepancy in alpha values.

Thanks, – Andy -

Hi. One problem you may be experiencing is the fact that ReCal does not currently accept files with missing data (and states as much in the instructions, though I plan on adding support for missing data this summer). So you can’t get accurate results for the Wikipedia example, and I’m not sure which Krippendorff 2011.1.25 paper you’re referring to–it’s not referenced on my site–but the same would be the case if data are missing from that example.

If your files contain missing data I suggest you use either Andrew Hayes’ macro for SPSS/SAS or the R package “irr,” both of which are linked from the Wikipedia page. Alternatively you could use ReCal if you first perform listwise deletion of missing data, as I suggest on the FAQ page. But check back in a few months–I’ve actually already written the code to add missing data support, but I need to test it before I roll it out.
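Listwise deletion here means dropping every unit (row) in which at least one coder’s value is missing before uploading the file. A minimal Python sketch of that step follows; the sample data, the `listwise_delete` helper, and the choice to treat empty cells and `NA` as missing are all assumptions for illustration.

```python
import csv
import io

# Hypothetical export: one row per unit, one column per coder; an empty cell is missing.
raw = "1,1,2\n2,,2\n3,3,3\n1,1,\n2,2,2\n"


def listwise_delete(rows, missing=("", "NA")):
    """Keep only the rows in which every cell holds a value."""
    return [row for row in rows if all(cell.strip() not in missing for cell in row)]


rows = list(csv.reader(io.StringIO(raw)))
complete = listwise_delete(rows)

# Write the surviving rows back out as CSV text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(complete)
print(out.getvalue())
```

The cost is that the values the other coders did provide for a dropped unit are discarded as well, so it is worth reporting how many units were removed.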

I am working on my first piece of research, so I am completely new to testing. I am unsure of how to enter my data, as the example says it uses 6 coders’ info for 1 variable. I have 10 variables/statements, 40 participants, and ordinal data as a response to a statement (a number between 1 and 5). Can you explain how I should set out the data in Excel to then input it here to run Krippendorff’s alpha? Thank you so much.

Have you considered open sourcing the PHP you’re using to do the calculations? Your site is incredibly useful but I’d like to be able to automate some calculations in a way that’s a little more elegant and reliable than screen-scraping your site. If you’re open to sharing in any way, please email me to discuss.

Hi all,

Wondering if anyone can tell me how I can access this software to run the analysis on inter-rater reliability with three coders.

Thank you.

If you let me know which of the instructions here confused you: http://dfreelon.org/utils/recalfront/recal3/ , I can help you individually. It might be easier to contact me directly here: http://dfreelon.org/contact/

Incredible! Thank you so much!

Hi Deen. Meredith from UNC-JOMC here. Thanks for the contribution to my dissertation research! I will definitely provide propers and kudos.

ReCal is a very powerful and easy-to-use tool. Thank you for making this available to the public!

Super useful tool! Thanks for making it.

Thank you so much! This made assessing my intercoder reliability coefficients so much easier than any of the other programs!

Hi Deen,

A few replies above this you mention that you’ll be rolling out support for absent data shortly – any idea when this will be?

Thanks

Mark Zuckerberg: Left school…became successful.

Steve Jobs: Left school… became successful.

Bill Gates: Left school… became successful.

But if I left school… I’d become an unsuccessful hobo.

I have two coders and 200 articles that they have each coded. Do I have to run a reliability test for every pair of articles? If so, that means I will have 100 reliability coefficients – I’m lost – any help would be awesome.

Many thanks for making this tool available. Krippendorff’s stats are not easy to calculate, so this is extremely helpful.

Thanks a lot! Very useful. I’ll definitely cite your references

I spent hours trying to figure out how the calculation works via SPSS and Excel, and I ended up getting all the outcomes I needed nice and quick from ReCal in less than 3 minutes! This is an extremely helpful website: not only is it time-saving, it is also very user-friendly. I wish I had known about this website a lot earlier, and I’m definitely citing your references. Thank you so much for building this service!

Awesome! Thank you!

WOW! What a lifesaver. I have been looking for something easy to use and this was and it worked! THANK YOU

Very easy to use and super fast! Thank you for providing this useful tool.

Can’t believe it works. So efficient. Very many thanks.

Thank you so much for this tool. I can’t believe how efficient and easy to use this is. Much appreciated for students such as me.

It is reliable, I cross checked with SPSS. It is quicker than SPSS.

This tool is simply awesome. Thanks a lot for sharing.

I also compared the values to SPSS. They match!

Thanks so much. Simple tools that do a single thing really well are a delight! Will be citing in a paper.

Contact me if you want the ref.

Really helpful and simple tool to use – many thanks!

Very cool (and fast)! I used this first, and then used the kap command in Stata. Same exact results (the only thing ReCal didn’t report (that Stata did) was the Z-statistic/p-value for the kappa statistic).

Absolutely awesome! Just as accurate as SPSS, but quicker and more efficient. Thank you for this!!!

Hi,

Thanks for this wonderful software. However, I have some concerns. I found high percentage agreements for some of my variables, but a somewhat low Scott’s pi. For instance, 2 categories showed 96% agreement, with Scott’s pi of .79 and .78 respectively. Another showed the same 96% agreement and a Scott’s pi of 0.94. Yet another, 97% agreement and a Scott’s pi of 0.71. I need to know how the software calculated Scott’s pi, and why there are these differences in results. Please help ASAP. I need to know for my defense.

See Deen’s earlier post re: the difference between simple agreement vs. the calculations underlying reliability coefficients (pi, k-alpha, et al.), which account for chance agreement, among other attributes of your data

I have a dataset with nominal data (2 raters using 5 categories to rate 25 forms). Would I convert each ‘match’ between raters to “1” and “1” and each ‘non-match’ to “1” and “0” for the csv file? Does that make sense?
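Recoding the ratings into match/non-match would most likely work against you, because chance-corrected coefficients need to know which category each rater actually chose. The Python sketch below compares the two layouts on invented ratings (25 forms, 5 categories, 3 disagreements), using the standard two-coder Scott’s pi formula rather than ReCal’s code. The one-row-per-form, one-column-per-rater CSV layout is an assumption; check it against the ReCal2 instructions before uploading.

```python
from collections import Counter


def scotts_pi(coder_a, coder_b):
    """Scott's pi for two coders who each rated the same units (nominal codes)."""
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    pooled = Counter(coder_a) + Counter(coder_b)
    p_e = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings: 25 forms, categories 1-5, raters disagree on 3 forms.
rater_a = [1, 2, 3, 4, 5] * 5
rater_b = list(rater_a)
rater_b[0], rater_b[7], rater_b[13] = 2, 1, 5

# The recoding proposed in the question: 1/1 for a match, 1/0 for a non-match.
recoded_a = [1] * len(rater_a)
recoded_b = [1 if a == b else 0 for a, b in zip(rater_a, rater_b)]

pi_raw = scotts_pi(rater_a, rater_b)
pi_recoded = scotts_pi(recoded_a, recoded_b)
print(round(pi_raw, 3))      # 0.85
print(round(pi_recoded, 3))  # -0.064

# Raw category codes as CSV text: one row per form, one column per rater (assumed layout).
csv_text = "\n".join(f"{a},{b}" for a, b in zip(rater_a, rater_b))
```

Both versions have 88% agreement, but the recoded file makes it look as though one rater always answered 1, which inflates the chance-agreement term and drives pi below zero.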