
Cohen's kappa calculator

Enter nonnegative whole-number counts in a square confusion matrix (2–6 categories). Rows are rater A and columns are rater B; the diagonal counts agreements. The calculator shows total N, observed agreement p_o, chance agreement from the margins p_e, and unweighted Cohen's κ with κ = (p_o − p_e) / (1 − p_e). It does not compute weighted κ for ordinal scales, Fleiss' κ for three or more raters, or standard errors and confidence intervals—use statistics software when you need those.

Educational and illustrative only. This is not clinical decision support, not research-ethics or survey-design advice, and not a substitute for qualified statisticians when reporting matters.

When to use this calculator

Quick inter-rater reliability checks for two raters on the same nominal category set—before you paste the same table into Sheets or Excel.

  • Coding studies, content analysis, or labeling tasks with two independent annotators.
  • Teaching κ next to raw percent agreement so learners see why chance matters.
  • Sanity-checking a confusion matrix export from another tool against textbook p_o, p_e, and κ.
  • Need spread or normal-tail readouts instead? The z-score and standard deviation calculators answer those different questions.

How do you calculate Cohen's kappa from a confusion matrix?

Cohen's kappa compares observed agreement p_o to p_e, the agreement expected by chance when each rater's category frequencies are independent. κ = (p_o − p_e) / (1 − p_e) when p_e < 1.

Build the k × k table

Each cell n_ij counts the items that rater A placed in category i and rater B placed in category j. The diagonal cells n_ii count the items on which both raters agreed.

Observed agreement p_o

p_o = (Σ_i n_ii) / N where N is the sum of all cells.

Chance agreement p_e

Under independence, p_e = (Σ_i r_i c_i) / N² where r_i is the ith row total and c_i the ith column total.

Unweighted κ

κ = (p_o − p_e) / (1 − p_e). Negative κ means agreement worse than chance would predict at those margins.
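The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a library API; the function name and the rows-are-rater-A layout follow this page's conventions:

```python
def cohens_kappa(matrix):
    """Unweighted Cohen's kappa from a k x k count matrix.

    Rows are rater A's categories, columns are rater B's.
    Returns None when p_e = 1, where kappa is undefined.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: diagonal counts over N.
    p_o = sum(matrix[i][i] for i in range(k)) / n
    # Chance agreement from the margins: sum of row_i * col_i over N^2.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    if p_e == 1:
        return None
    return (p_o - p_e) / (1 - p_e)
```

For example, the matrix [[20, 5], [10, 15]] gives p_o = 0.7, p_e = 0.5, and κ = 0.4.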

Weighted κ, Fleiss' κ, standard errors, confidence intervals, and sample-size or power planning are outside what this calculator does.

For standardization on a normal model, see the z-score calculator.

For spread from a pasted list, see the standard deviation calculator.

For percent change language separate from κ, see the percentage calculator.

Google Sheets & Excel

There is no built-in KAPPA function in Google Sheets or Excel the way there is a STDEV.S. Build row and column sums with SUM, compute p_o as the diagonal sum divided by N, compute p_e as the sum of row_i × column_i products divided by N², then κ. The cards show compact patterns; replace the ranges with your own layout.

Observed agreement p_o
=(SUM(diagonal_range))/N_cell

Put N in a cell (e.g. =SUM(matrix_range)). p_o is the sum of the main diagonal divided by N.

Expected agreement p_e
=SUMPRODUCT(row_margins, col_margins)/N_cell^2

row_margins and col_margins are length-k vectors of SUM across each row/column. This matches independence of the two marginal distributions.

Cohen's κ from p_o and p_e
=(p_o_cell-p_e_cell)/(1-p_e_cell)

Guard 1 − p_e near 0—if p_e = 1, κ is undefined.
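To sanity-check a spreadsheet implementation of the three cards, here is the same pipeline on a hypothetical 3 × 3 matrix, with the matching spreadsheet formula noted beside each step:

```python
# Hypothetical 3x3 confusion matrix (rows = rater A, columns = rater B).
matrix = [[12, 2, 1],
          [3, 9, 2],
          [1, 2, 8]]
k = len(matrix)

N = sum(sum(row) for row in matrix)                      # =SUM(matrix_range) -> 40
diag = sum(matrix[i][i] for i in range(k))               # =SUM(diagonal_range) -> 29
p_o = diag / N                                           # 0.725

rows = [sum(r) for r in matrix]                          # row margins: [15, 14, 11]
cols = [sum(matrix[i][j] for i in range(k))
        for j in range(k)]                               # column margins: [16, 13, 11]
p_e = sum(r * c for r, c in zip(rows, cols)) / N ** 2    # =SUMPRODUCT(...)/N^2 -> 0.339375

kappa = (p_o - p_e) / (1 - p_e)                          # about 0.584
```

If your sheet's intermediate cells disagree with these values for the same counts, the mismatch is usually in the diagonal range or a margin SUM.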

Frequently asked questions

What is Cohen's kappa?

Cohen's kappa (κ) measures inter-rater agreement on categorical labels for two raters, adjusting for agreement expected by chance from the row and column totals. This page implements the unweighted form on a square count matrix.

What is the formula for Cohen's kappa?

With p_o = proportion on the diagonal and p_e = Σ_i (row_i × col_i) / N², κ = (p_o − p_e) / (1 − p_e) when p_e < 1. Intuitively: compare observed agreement to chance, scaled by how much room there is above chance.

Why does the calculator say κ is undefined?

When p_e rounds to 1, the denominator 1 − p_e is zero—the usual κ formula does not apply. That often happens when all items fall in one category for both raters (or other degenerate margin patterns). Re-check coding and whether κ is the right summary.

Can Cohen's kappa be negative?

Yes. κ < 0 means observed agreement is below what independence of the margins would predict—worse than chance on this definition.
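A hypothetical matrix where the raters systematically swap labels makes this concrete:

```python
# Two raters who almost always pick opposite categories.
m = [[1, 9],
     [9, 1]]
n = sum(map(sum, m))                                     # 20 items
p_o = (m[0][0] + m[1][1]) / n                            # 0.1 observed agreement
rows = [sum(r) for r in m]                               # [10, 10]
cols = [m[0][0] + m[1][0], m[0][1] + m[1][1]]            # [10, 10]
p_e = sum(r * c for r, c in zip(rows, cols)) / n ** 2    # 0.5 chance agreement
kappa = (p_o - p_e) / (1 - p_e)                          # -0.8: well below chance
```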

Is Cohen's kappa the same as percent agreement?

No. Raw percent agreement is simply diagonal / N. κ subtracts a chance baseline p_e built from the two marginal distributions, then rescales by 1 − p_e, so skewed margins cannot inflate the score the way they inflate a simple percentage.
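A hypothetical skewed-margin example shows how far the two can diverge: 90% raw agreement, yet κ comes out slightly negative because the margins alone predict even more agreement.

```python
# Both raters put almost everything in category 1.
m = [[90, 5],
     [5, 0]]
n = sum(map(sum, m))                                     # 100 items
percent_agreement = (m[0][0] + m[1][1]) / n              # 0.90
rows = [sum(r) for r in m]                               # [95, 5]
cols = [m[0][0] + m[1][0], m[0][1] + m[1][1]]            # [95, 5]
p_e = sum(r * c for r, c in zip(rows, cols)) / n ** 2    # 0.905
kappa = (percent_agreement - p_e) / (1 - p_e)            # about -0.053
```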

Do you support weighted kappa for ordinal categories?

Weighted κ uses a weight matrix for how “close” off-diagonal disagreements are. For linear or quadratic weights, use a statistics package or survey platform built for ordinal agreement.

Can I use this for three or more raters (Fleiss' kappa)?

No. Fleiss' κ generalizes agreement to multiple raters per item. This page is Cohen's κ for exactly two raters on a single paired classification table.

My categories are ordered (Likert). Is unweighted κ OK?

Unweighted κ treats any off-diagonal disagreement the same—whether categories are adjacent or far apart. For ordinal scales, researchers often use weighted κ instead. This page stays with unweighted κ from your counts.

How do I match this in Excel or Google Sheets?

Use SUM for row/column margins and N, SUMPRODUCT(row_margins, col_margins)/N^2 for p_e, diagonal SUM over N for p_o, then (p_o−p_e)/(1−p_e). There is no single KAPPA function—the formula cards on this page mirror the usual spreadsheet layout.

Do you show standard errors or a confidence interval for κ?

Interval estimates need extra assumptions and formulas (for example large-N asymptotics). Export your counts to statistics software when you need SE/CI or formal hypothesis tests.

Are the strength labels on the page official?

The Landis–Koch style bands are widely cited conventions, not universal law. Journals and regulators may prefer different wording—treat the line as orientation, not a pass/fail gate.

Is this professional statistics advice?

No. It is a free educational calculator. For regulated work, clinical scoring, or publication-ready analyses, follow qualified statisticians and your organization's methods.