The goal of this post is to prove the following elementary lower bound for off-diagonal Ramsey numbers $R(k, l)$ (where $k$ is fixed and we are interested in the asymptotic behavior as $l$ gets large):

$$R(k, l) = \Omega\left( \left( \frac{l}{\log l} \right)^{k/2} \right).$$

The proof does not make use of the Lovász local lemma, which improves the bound by a factor of roughly $l^{1/2}$; nevertheless, I think it's a nice exercise in asymptotics and the probabilistic method. (Also, it's never explicitly given in Alon and Spencer.)
The alteration method
The basic probabilistic result that gives the above bound is actually quite easy to prove, and is an example of what Alon and Spencer call the alteration method: construct a random structure, then alter it to get a better one. Recall that the Ramsey number $R(k, l)$ is the smallest positive integer $n$ such that every coloring of the edges of the complete graph $K_n$ by two colors (say blue and yellow) contains either a blue $K_k$ or a yellow $K_l$.

Theorem: For any positive integer $n$ and any real number $p \in [0, 1]$, we have

$$R(k, l) > n - \binom{n}{k} p^{\binom{k}{2}} - \binom{n}{l} (1 - p)^{\binom{l}{2}}.$$
Proof. Consider a random coloring of the edges of $K_n$ in which each edge is independently colored blue with probability $p$ and yellow with probability $1 - p$, and delete a vertex from every blue $K_k$ or yellow $K_l$. How many vertices will be deleted, on average? Since the expected number of blue $K_k$'s is $\binom{n}{k} p^{\binom{k}{2}}$, and the expected number of yellow $K_l$'s is $\binom{n}{l} (1 - p)^{\binom{l}{2}}$, it follows that the expected number of vertices to be deleted is at most their sum (it should be less, since deleting one vertex may mean we do not have to delete others). Here we are using the fundamental fact that a random variable is at most (equivalently, at least) its expected value with positive probability, which is trivial when the sample space is finite.

This means that, with positive probability, we delete at most the expected number of vertices. The result is a coloring of the complete graph on at least

$$n - \binom{n}{k} p^{\binom{k}{2}} - \binom{n}{l} (1 - p)^{\binom{l}{2}}$$

vertices with no blue $K_k$ or yellow $K_l$, hence $R(k, l)$ must be greater than this number. $\Box$
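For example, taking $l = k$ and $p = \frac{1}{2}$ makes both deleted terms equal, and the theorem specializes to

$$R(k, k) > n - \binom{n}{k}\, 2^{1 - \binom{k}{2}},$$

the alteration version of the diagonal bound mentioned in the next section.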
Optimization
Now our goal is to choose close-to-optimal values of $n$ and $p$. When $k = l$ it turns out that this method only gives a small improvement over Erdős's classic bound $R(k, k) \ge 2^{k/2}$ ("the inequality that launched a thousand papers"), but when $k$ is fixed and we are interested in the asymptotics as a function of $l$, we can do quite a bit better than the obvious generalization of Erdős's bound.
We will choose $p$ first. For large $n$ the important term to control is $\binom{n}{k} p^{\binom{k}{2}}$; if this term is too large then the bound above is useless, so we want it to be small. This entails making $p$ small, hence making $1 - p$ large. However, $\binom{n}{l} (1 - p)^{\binom{l}{2}}$ will overwhelm the only positive term $n$ if we choose $1 - p$ too large. Since we are willing to lose constant factors, let's aim to choose $p$ so that

$$\binom{n}{k} p^{\binom{k}{2}} \le \frac{n}{2},$$

since the positive contribution from $n$ will still occur up to a constant factor. Using the inequality $\binom{n}{k} \le \frac{n^k}{k!}$ (which is good enough, since $k$ is fixed) we see that we should choose $p$ so that $\frac{n^k}{k!} p^{\binom{k}{2}} \le \frac{n}{2}$, so let's choose $p = n^{-2/k}$. This gives

$$R(k, l) > \frac{n}{2} - \binom{n}{l} \left( 1 - n^{-2/k} \right)^{\binom{l}{2}}.$$
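To check that this choice of $p$ does what we want: since $\binom{k}{2} = \frac{k(k-1)}{2}$, we have $p^{\binom{k}{2}} = n^{-(k-1)}$, and therefore

$$\binom{n}{k}\, p^{\binom{k}{2}} \le \frac{n^k}{k!} \cdot n^{-(k-1)} = \frac{n}{k!} \le \frac{n}{2},$$

using $k! \ge 2$.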
Now we need to choose $n$. To do this properly we'll need to understand how the second term grows. This requires two estimates. First, the elementary inequality $l! \ge \left( \frac{l}{e} \right)^l$ (which one can prove, for example, by taking the logarithm of both sides and bounding the corresponding Riemann sum by an integral; see also Terence Tao's notes on Stirling's formula) gives

$$\binom{n}{l} \le \frac{n^l}{l!} \le \left( \frac{en}{l} \right)^l.$$

Second, the elementary inequality $1 - p \le e^{-p}$ (by convexity, for example) gives

$$(1 - p)^{\binom{l}{2}} \le e^{-p \binom{l}{2}}.$$
Let me pause for a moment. I was recently not assigned this problem as homework in a graph theory course. Instead, we were assigned to prove a weaker bound, and only for $k = 3$. When I described the general argument to my supervision partner and supervisor, they commented on the "weird" (I forget the exact word) estimates it required, and didn't seem particularly interested in the details.
These estimates are not weird! In order to get any kind of nontrivial lower bound it is necessary that $n \gg l$, and in order to prevent the second term from overwhelming the first it is necessary that $(1 - p)^{\binom{l}{2}}$ be roughly as small as $\binom{n}{l}^{-1}$. In this regime, estimating $\binom{n}{l}$ when both $n$ and $l$ go to infinity requires more detail than the trivial bound $\binom{n}{l} \le 2^n$, and the detail provided by the above estimate (which ignores the small corrective factors coming from the rest of Stirling's formula) is exactly suited to this problem. And in order to estimate $(1 - p)^{\binom{l}{2}}$ it is perfectly natural to use the exponential inequality, since the exponential is much easier to analyze (indeed, bounding expressions like these is in some sense the whole point of the function $e^x$). These are not contrived expressions coming from nowhere. The reader who is not comfortable with these estimates should read something like Steele's The Cauchy-Schwarz Master Class and/or Graham, Knuth, and Patashnik's Concrete Mathematics.
Back to the mathematics. By our estimates, the logarithm of $\binom{n}{l} \left( 1 - n^{-2/k} \right)^{\binom{l}{2}}$ is bounded by

$$l \log \frac{en}{l} - n^{-2/k} \binom{l}{2}.$$

We want to choose $n$ as large as possible subject to the constraint that this logarithm is bounded by $\log \frac{n}{4}$ or so. To get a feel for how the above expression behaves, let's set $n = l^a$ for some $a > 1$. This gives

$$(a - 1)\, l \log l + l - l^{-2a/k} \binom{l}{2}.$$
The first term is $\Theta(l \log l)$ while the second term is $\Theta\left( l^{2 - 2a/k} \right)$, so to get these terms to match as closely as possible we'd like $a$ to be slightly smaller than $\frac{k}{2}$. To get the logarithmic factor to match, we'll set

$$n = c \left( \frac{l}{\log l} \right)^{k/2}$$

for some constant $c$. This gives a bound of

$$\left( \frac{k}{2} - 1 - \frac{c^{-2/k}}{2} \right) l \log l - \frac{k}{2}\, l \log \log l + O(l).$$
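In more detail: this choice of $n$ gives $p = n^{-2/k} = c^{-2/k}\, \frac{\log l}{l}$, and the two pieces of the logarithm expand as

$$l \log \frac{en}{l} = \left( \frac{k}{2} - 1 \right) l \log l - \frac{k}{2}\, l \log \log l + O(l), \qquad n^{-2/k} \binom{l}{2} = \frac{c^{-2/k}}{2}\, l \log l + O(\log l).$$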
The dominant term here is $\left( \frac{k}{2} - 1 - \frac{c^{-2/k}}{2} \right) l \log l$. We'd like this to be less than $\log \frac{n}{4}$, which requires the coefficient of this term to tend to zero. But as long as it does so sufficiently quickly, modifying $c$ beyond that will only lead to a constant change in $n$ (which is the main contribution to our lower bound), so we'll cheat a little: we'll make the coefficient negative so it overwhelms the other terms, which will ensure that its exponential tends to zero, giving an estimate

$$\binom{n}{l} \left( 1 - n^{-2/k} \right)^{\binom{l}{2}} = o(1).$$

This requires that $\frac{k}{2} - 1 - \frac{c^{-2/k}}{2} < 0$, hence that $c < (k - 2)^{-k/2}$ (assuming $k \ge 3$; the case $k = 2$ is trivial since $R(2, l) = l$), so we will take $c = \frac{1}{2} (k - 2)^{-k/2}$.
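With this choice the coefficient is indeed negative: $c^{-2/k} = 2^{2/k} (k - 2)$, so

$$\frac{k}{2} - 1 - \frac{c^{-2/k}}{2} = \frac{k - 2}{2} \left( 1 - 2^{2/k} \right) < 0$$

for every $k \ge 3$.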
This gives the final estimate

$$R(k, l) \ge \left( \frac{c}{2} - o(1) \right) \left( \frac{l}{\log l} \right)^{k/2}$$

as desired, where the implied constant is something like $\frac{1}{4} (k - 2)^{-k/2}$ (although I have made no particular effort to optimize it).
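As a quick numeric sanity check, here is a short Python script (the parameter choices are mine and purely illustrative) that brute-forces the best $n$ in the theorem for $k = 3$, with our choice $p = n^{-2/k}$, and compares it against the predicted $\frac{1}{4} (l/\log l)^{3/2}$. Since the constant in the proof is deliberately lossy, the brute-force numbers should come out somewhat larger.

```python
from math import comb, exp, lgamma, log

def log_comb(n, r):
    # log of binomial(n, r) via log-Gamma, to avoid astronomically large integers
    return lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)

def alteration_bound(n, k, l, p):
    # n - E[# blue K_k] - E[# yellow K_l], the quantity in the theorem above
    blue = comb(n, k) * p ** comb(k, 2)
    log_yellow = log_comb(n, l) + comb(l, 2) * log(1 - p)
    yellow = exp(log_yellow) if log_yellow < 700 else float("inf")
    return n - blue - yellow

k = 3
for l in [50, 100, 200, 400]:
    best = max(alteration_bound(n, k, l, n ** (-2 / k)) for n in range(l, 30 * l))
    predicted = 0.25 * (l / log(l)) ** 1.5  # (1/4)(k-2)^{-k/2} with k = 3
    print(f"l = {l}: best bound ~ {best:.0f}, predicted ~ {predicted:.0f}")
```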
As for the question of what is currently known, see this MO question. Up to logarithmic factors, it seems the best known lower bound grows like $l^{(k+1)/2}$ (which is what the local lemma gives), while the best known upper bound grows like $l^{k-1}$, and the latter is conjectured to be tight. For $k = 3$ a result of Kim gives the exact asymptotic

$$R(3, l) = \Theta\left( \frac{l^2}{\log l} \right).$$