Goodman and Kruskal (1954) proposed an association measure (tau) for nominal variables based on the variation measure V(Y) = ∑j π+j(1 – π+j) = 1 – ∑j π+j².
a. Show V(Y) is the probability that two independent observations on Y fall in different categories (called the Gini concentration index).
Show that V(Y) = 0 when π+j = 1 for some j and that V(Y) takes its maximum value of (J – 1)/J when π+j = 1/J for all j.
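
The claims in part (a) can be checked numerically before attempting the proof. The sketch below is only an illustration, not a derivation: it assumes V(Y) = 1 – ∑j π+j² (the form consistent with the properties stated above), and the function names and example distributions are hypothetical.

```python
from itertools import product


def gini_variation(p):
    """V(Y) = 1 - sum_j p_j^2 for a marginal distribution p (assumed form)."""
    return 1.0 - sum(pj * pj for pj in p)


def prob_different_categories(p):
    """Probability that two independent draws from p fall in different categories.

    Computed directly as the sum of p_j * p_k over all ordered pairs with j != k.
    """
    idx = range(len(p))
    return sum(p[j] * p[k] for j, k in product(idx, repeat=2) if j != k)


J = 4
degenerate = [1.0, 0.0, 0.0, 0.0]   # all mass on one category
uniform = [1.0 / J] * J             # equal mass on all J categories
skewed = [0.5, 0.3, 0.1, 0.1]       # hypothetical intermediate case

for name, p in [("degenerate", degenerate), ("uniform", uniform), ("skewed", skewed)]:
    # The two columns should agree, illustrating the "different categories" reading.
    print(name, gini_variation(p), prob_different_categories(p))

# Reference value for the claimed maximum (J - 1)/J.
print("claimed maximum:", (J - 1) / J)
```

The degenerate case should give 0, the uniform case should match (J – 1)/J, and any other distribution should fall in between.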
b. For the proportional reduction in variation, show that E[V(Y|X)] = 1 – ∑i ∑j πij²/πi+. [The resulting measure (2.12) is called the concentration coefficient. Like U, τ = 0 is equivalent to independence. Haberman (1982) presented generalized concentration and uncertainty coefficients.]
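
A small numerical sketch may help in checking part (b). It assumes that V(Y) = 1 – ∑j π+j² and that τ is the proportional reduction [V(Y) – E V(Y|X)] / V(Y); neither formula is written out in this exercise, so both are labeled assumptions here. The function names and the example tables are hypothetical.

```python
def marginal_variation(joint):
    """V(Y) = 1 - sum_j pi_{+j}^2, using column totals of the joint table (assumed form)."""
    col_totals = [sum(col) for col in zip(*joint)]
    return 1.0 - sum(c * c for c in col_totals)


def expected_conditional_variation(joint):
    """E[V(Y|X)] = 1 - sum_i sum_j pi_ij^2 / pi_{i+}.

    Rows with zero marginal probability contribute nothing and are skipped.
    """
    total = 0.0
    for row in joint:
        row_total = sum(row)
        if row_total > 0:
            total += sum(pij * pij for pij in row) / row_total
    return 1.0 - total


def concentration_tau(joint):
    """Proportional reduction in variation (assumed definition of tau)."""
    v_y = marginal_variation(joint)
    return (v_y - expected_conditional_variation(joint)) / v_y


# Hypothetical independent table: pi_ij = pi_{i+} * pi_{+j}.
rows, cols = [0.4, 0.6], [0.5, 0.3, 0.2]
independent = [[r * c for c in cols] for r in rows]

# Hypothetical table in which X determines Y exactly.
perfect = [[0.5, 0.0], [0.0, 0.5]]

print("independent:", concentration_tau(independent))
print("perfect:", concentration_tau(perfect))
```

Under these assumptions the independent table should give τ equal to 0 up to rounding, and the table in which X determines Y should give τ = 1.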