# 人在江湖

Kendall's tau is used to measure the association between two variables.

Let (x1, y1), (x2, y2), …, (xn, yn) be a set of joint observations of two random variables X and Y, such that all the values of xi are unique and all the values of yi are unique. A pair of observations (xi, yi) and (xj, yj) is said to be concordant if the ranks of both elements agree: that is, if both xi > xj and yi > yj, or if both xi < xj and yi < yj. The pair is said to be discordant if xi > xj and yi < yj, or if xi < xj and yi > yj. If xi = xj or yi = yj, the pair is neither concordant nor discordant.

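The Kendall τ coefficient is defined as:

$$\tau = \frac{(\text{number of concordant pairs}) - (\text{number of discordant pairs})}{n(n-1)/2}$$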
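The definition can be checked with a few lines of code. What follows is a minimal Python sketch, not a reference implementation: the function name `kendall_tau_a` and the sample data are illustrative assumptions, and the inputs are assumed to be numeric. It counts concordant and discordant pairs by brute force and divides their difference by the total number of pairs, n(n − 1)/2.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Return (concordant, discordant, tau) for paired numeric observations."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")

    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        # The sign of the product tells whether the two orderings agree.
        s = (xi - xj) * (yi - yj)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 means a tie: neither concordant nor discordant.

    total_pairs = n * (n - 1) // 2
    return concordant, discordant, (concordant - discordant) / total_pairs


# Illustrative data (hypothetical): 7 concordant and 3 discordant pairs.
print(kendall_tau_a([1, 2, 3, 4, 5], [3, 1, 2, 5, 4]))  # (7, 3, 0.4)
```

The pair loop is quadratic in n, which is fine for illustration but slow for large samples.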

A pair {(xi, yi), (xj, yj)} is said to be tied if xi = xj or yi = yj; a tied pair is neither concordant nor discordant. When tied pairs arise in the data, the coefficient may be modified in a number of ways to keep it in the range [-1, 1]:

The tau-b statistic, unlike tau-a, adjusts for ties and is suitable for square tables. Values of tau-b range from −1 (100% negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value of zero indicates the absence of association.

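The Kendall tau-b coefficient is defined as:

$$\tau_B = \frac{n_c - n_d}{\sqrt{(n_0 - n_1)(n_0 - n_2)}}$$

where $n_0 = n(n-1)/2$, $n_1 = \sum_i t_i(t_i-1)/2$, $n_2 = \sum_j u_j(u_j-1)/2$, $n_c$ is the number of concordant pairs, $n_d$ is the number of discordant pairs, $t_i$ is the number of tied values in the i-th group of ties for the first quantity, and $u_j$ is the number of tied values in the j-th group of ties for the second quantity.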
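To see how the tie correction works, here is a minimal Python sketch of the standard tau-b formula: the difference between concordant and discordant pairs, divided by the square root of (pairs not tied on x) times (pairs not tied on y). The function name `kendall_tau_b` and the sample data are illustrative assumptions, and the inputs are assumed to be numeric.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b for paired numeric observations that may contain ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)

    n_c = n_d = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            n_c += 1
        elif s < 0:
            n_d += 1

    n0 = n * (n - 1) // 2                                       # all pairs
    n1 = sum(t * (t - 1) // 2 for t in Counter(x).values())     # pairs tied on x
    n2 = sum(u * (u - 1) // 2 for u in Counter(y).values())     # pairs tied on y

    denominator = sqrt((n0 - n1) * (n0 - n2))
    if denominator == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (n_c - n_d) / denominator


# Illustrative data (hypothetical): 4 concordant pairs, 0 discordant,
# one pair tied on x and one pair tied on y, so tau-b = 4 / sqrt(5 * 5).
print(kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3]))
```

When there are no ties, n1 and n2 are both zero and the result reduces to the untied coefficient.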

SAS example:

/* Eye and hair color counts by geographic region */
data color;
input Region Eyes $ Hair $ Count @@;
label Eyes  ='Eye Color'
Hair  ='Hair Color'
Region='Geographic Region';
datalines;
1 blue  fair   23  1 blue  red     7  1 blue  medium 24
1 blue  dark   11  1 green fair   19  1 green red     7
1 green medium 18  1 green dark   14  1 brown fair   34
1 brown red     5  1 brown medium 41  1 brown dark   40
1 brown black   3  2 blue  fair   46  2 blue  red    21
2 blue  medium 44  2 blue  dark   40  2 blue  black   6
2 green fair   50  2 green red    31  2 green medium 37
2 green dark   23  2 brown fair   56  2 brown red    42
2 brown medium 53  2 brown dark   54  2 brown black  13
;

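/* Compute Kendall's tau-b for Eyes by Hair, weighting each cell by Count. */
/* NOPRINT suppresses displayed output; results go to the OUT= data set.  */
proc freq data = color noprint ;
tables  eyes*hair / measures  noprint ;
weight count;
output out=output KENTB;  /* write the tau-b statistics to data set OUTPUT */
test KENTB;               /* request the asymptotic test of tau-b = 0 */
run;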

Somers' D(C|R) and Somers' D(R|C) are asymmetric modifications of tau-b. Somers' D differs from tau-b in that it uses a correction only for pairs that are tied on the independent variable.
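The relationship between the two measures can be illustrated on a small contingency table. The Python sketch below is a hedged illustration, not PROC FREQ's implementation: the function name `table_association` and the example table are hypothetical, and the rows and columns are assumed to be listed in their ordinal order. It follows the convention in which D(C|R) treats the row variable as independent, so only pairs tied on the row variable are removed from its denominator.

```python
from math import sqrt


def table_association(table):
    """Somers' D(C|R), Somers' D(R|C) and tau-b from a table of cell counts."""
    n_rows, n_cols = len(table), len(table[0])

    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only rows below row i
                for m in range(n_cols):
                    if m > j:                   # lower right: same ordering
                        concordant += table[i][j] * table[k][m]
                    elif m < j:                 # lower left: opposite ordering
                        discordant += table[i][j] * table[k][m]

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    total_pairs = n * (n - 1) // 2
    untied_rows = total_pairs - sum(r * (r - 1) // 2 for r in row_totals)
    untied_cols = total_pairs - sum(c * (c - 1) // 2 for c in col_totals)
    if untied_rows == 0 or untied_cols == 0:
        raise ValueError("measures are undefined when a variable is constant")

    diff = concordant - discordant
    return {
        "somers_d_c_given_r": diff / untied_rows,  # rows independent
        "somers_d_r_given_c": diff / untied_cols,  # columns independent
        "tau_b": diff / sqrt(untied_rows * untied_cols),
    }


# Hypothetical 2 x 3 table of counts, used only for illustration.
result = table_association([[3, 1, 0], [1, 2, 3]])
print(result)
```

For a table with a positive association, tau-b is the geometric mean of the two Somers' D values, which is why they are described as asymmetric versions of it.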

