First, some background:
Remember that the mean or expected value of a random variable is defined as the integral of that variable with respect to the associated probability measure. This works out great in some cases, e.g. when we're flipping coins, or rolling dice, or in general when we're thinking about random variables with finite variance. Of course, it's defined for a wider class of random variables -- not everything with a mean has a standard deviation -- but there are certainly random variables out there with no mean.
A famous example of a divergent mean is the St. Petersburg game, which goes something like this: I'll let you flip a coin and pay you for how many heads you can get in a row. If you get heads up the first time, but tails after that, I'll give you a dollar. Two heads in a row, $2. For n heads in general, n > 0, the payout will be $2^(n-1). Rather than giving you 50 cents for failing so spectacularly as to hit tails on your first try, let's say you just get nothing.
The probability of getting exactly n heads in a row (n heads, then a tail) is 1/2^(n+1), so the expected value is
(1/4)($1) + (1/8)($2) + (1/16)($4) + (1/32)($8) + ... = $.25 + $.25 + $.25 + ...
Uh oh. If we keep adding on $.25 forever, the right-hand side is going to infinity! Clearly, you should be willing to pay me any amount of money in order to play this game, since your measly $10k pales in comparison to the promised infinite reward. Any takers?
We're not done here, though; first, let's change up the game a little. We'll leave things the same if you flip one head, or three consecutively, or five, and so on, but let's say if you flip an even number of heads, you have to pay me. What was that expectation again?
(1/4)($1) - (1/8)($2) + (1/16)($4) - (1/32)($8) + ... = $.25 - $.25 + $.25 - $.25 + ...
Oh, dear. Depending on how we arrange that series, it can diverge up or down. You can see from here, if you didn't sleep through your calculus classes, that we can even produce a conditionally convergent expectation that can be rearranged to anything!
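To get a feel for how badly behaved these two games are, here is a rough simulation sketch. The function names (count_heads, payout, signed_payout, running_means) and the seed are my own inventions for illustration, and a draw below 0.5 is arbitrarily treated as heads. It just plays each game many times and records the average winnings per play at a few checkpoints.

```python
import random

def count_heads(rng):
    """Flip a fair coin until the first tail; return the number of heads seen."""
    heads = 0
    while rng.random() < 0.5:  # assumption: a draw below 0.5 counts as heads
        heads += 1
    return heads

def payout(n):
    """Original game: 2^(n-1) dollars for n > 0 heads, nothing for n = 0."""
    return 2 ** (n - 1) if n > 0 else 0

def signed_payout(n):
    """Modified game: same amount, but the player pays when n is even."""
    return payout(n) if n % 2 == 1 else -payout(n)

def running_means(pay, plays, seed):
    """Average winnings per play after 10, 100, 1000, ... plays."""
    rng = random.Random(seed)
    total, means, checkpoint = 0, {}, 10
    for i in range(1, plays + 1):
        total += pay(count_heads(rng))
        if i == checkpoint:
            means[i] = total / i
            checkpoint *= 10
    return means

if __name__ == "__main__":
    for name, pay in [("original", payout), ("alternating", signed_payout)]:
        print(name, running_means(pay, 100_000, seed=0))
```

With a well-behaved game the checkpoint averages would settle toward a fixed number. Here a single long streak of heads can yank the average around no matter how many plays came before it, so different seeds tend to tell quite different stories.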
Okay, I've gotten away from myself here, but, long story short, means are weird. (I seem to be in an italic mood today for some reason...) That said, consider the following, which has been bouncing around in my head for a year or better. I finally asked someone whether it was true and quickly received the following proof:
Let X_1, X_2, ..., X_N be i.i.d. random variables, and let Y_N = [ X_1 + X_2 + ... + X_N ] / N. Put f(N) = Median(Y_N). If μ = Mean(X_i), i.e. if that mean exists, then as N → ∞ we have f(N) → μ.
Proof (thanks to Robert Israel): The Weak Law of Large Numbers states that, for any ε > 0, Pr{|Y_N - μ| < ε} → 1 as N → ∞. For N sufficiently large, then, Pr{|Y_N - μ| < ε} > 1/2, or in other words, more than half of the distribution for Y_N lies within the interval (μ - ε, μ + ε). In particular, more than half lies within (-∞, μ + ε), so less than half lies at or above μ + ε and the median f(N) cannot reach μ + ε; symmetrically, f(N) > μ - ε. Combining both sides, we see that |f(N) - μ| < ε for all sufficiently large N, and since ε was arbitrary, f(N) → μ as N → ∞.
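As a sanity check on the statement, here is a small simulation sketch. The choice of distribution is mine, not part of the result: an exponential with mean 1, whose own median is ln 2 (about 0.69), so the mean and median start out visibly different. The helper names (median_of_sample_means, exponential), the trial count, and the seed are likewise just illustrative.

```python
import random
import statistics

def median_of_sample_means(draw, n, trials, seed):
    """Estimate Median(Y_N) by simulating `trials` independent copies of Y_N."""
    rng = random.Random(seed)
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(trials)]
    return statistics.median(means)

def exponential(rng):
    """One draw from an exponential distribution with mean 1 (median ln 2)."""
    return rng.expovariate(1.0)

if __name__ == "__main__":
    for n in (1, 10, 100):
        est = median_of_sample_means(exponential, n, trials=2001, seed=0)
        print(f"N = {n:4d}   estimated median of Y_N = {est:.3f}")
```

The estimate for N = 1 should sit near ln 2, and the later ones should creep up toward the mean of 1, which is what the proof says has to happen eventually.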
Note that this limit can converge in circumstances where there is no mean -- for instance, it clearly always gives the axis of symmetry for a symmetric distribution, and there exist symmetric distributions which have no mean, e.g. the Cauchy distributions. So in broad terms this is a generalization of the concept of the mean.
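The Cauchy case can be poked at numerically too. The sketch below assumes a Cauchy distribution with scale 1 and an arbitrarily chosen center of 3.0, sampled by the usual inverse-CDF trick; the function names are made up for the example. The average of N i.i.d. Cauchy draws has exactly the same Cauchy distribution as a single draw, so Y_N never concentrates, but its median stays put at the center.

```python
import math
import random
import statistics

def cauchy(rng, center):
    """One draw from a Cauchy distribution with the given center and scale 1."""
    # Inverse-CDF sampling: tangent of a uniform angle in [-pi/2, pi/2).
    return center + math.tan(math.pi * (rng.random() - 0.5))

def sample_mean(rng, n, center):
    """One realization of Y_N for Cauchy draws."""
    return sum(cauchy(rng, center) for _ in range(n)) / n

def median_of_cauchy_means(n, trials, seed, center=3.0):
    """Estimate Median(Y_N) from `trials` independent realizations of Y_N."""
    rng = random.Random(seed)
    means = [sample_mean(rng, n, center) for _ in range(trials)]
    return statistics.median(means)

if __name__ == "__main__":
    for n in (1, 10, 100):
        est = median_of_cauchy_means(n, trials=2001, seed=0)
        print(f"N = {n:4d}   estimated median of Y_N = {est:.3f}")
```

The estimates should hover around 3.0 for every N, with about the same amount of wobble each time, since averaging buys no extra concentration here.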
I'm really no expert on statistics or probability, so I don't really know how far the applications go or whether this is an elementary exercise in graduate textbooks in the field. I just thought it was interesting, and I'll probably poke around with it in the days to come.