A recent paper of Kra, Moreira, Richter, and Robertson established the following theorem, resolving a question of Erdös. Given a discrete amenable group , and a subset of , we define the Banach density of to be the quantity
where the supremum is over all Følner sequences of . Given a set in , we define the restricted sumset to be the set of all pairs where are distinct elements of .
Theorem 1 Let be a countably infinite abelian group with the index finite. Let be a positive Banach density subset of . Then there exists an infinite set and such that .
Strictly speaking, the main result of Kra et al. only claims this theorem for the case of the integers , but as noted in the recent preprint of Charamaras and Mountakis, the argument in fact applies for all countable abelian in which the subgroup has finite index. This condition is in fact necessary (as observed by forthcoming work of Ethan Acklesberg): if has infinite index, then one can find a subgroup of of index for any that contains (or equivalently, is -torsion). If one lets be an enumeration of , and one can then check that the set
has positive Banach density, but does not contain any set of the form for any (indeed, from the pigeonhole principle and the -torsion nature of one can show that must intersect whenever has cardinality larger than ). It is also necessary to work with restricted sums rather than full sums : a counterexample to the latter is provided for instance by the example with and . Finally, the presence of the shift is also necessary, as can be seen by considering the example of being the odd numbers in , though in the case one can of course delete the shift at the cost of giving up the containment .
Theorem 1 resembles other theorems in density Ramsey theory, such as Szemerédi’s theorem, but with the notable difference that the pattern located in the dense set is infinite rather than merely arbitrarily large but finite. As such, it does not seem that this theorem can be proven by purely finitary means. However, one can view this result as the conjunction of an infinite number of statements, each of which is a finitary density Ramsey theory statement. To see this, we need some more notation. Observe from Tychonoff’s theorem that the collection is a compact topological space (with the topology of pointwise convergence) (it is also metrizable since is countable). Subsets of can be thought of as properties of subsets of ; for instance, the property a subset of of being finite is of this form, as is the complementary property of being infinite. A property of subsets of can then be said to be closed or open if it corresponds to a closed or open subset of . Thus, a property is closed and only if if it is closed under pointwise limits, and a property is open if, whenever a set has this property, then any other set that shares a sufficiently large (but finite) initial segment with will also have this property. Since is compact and Hausdorff, a property is closed if and only if it is compact.
The properties of being finite or infinite are neither closed nor open. Define a smallness property to be a closed (or compact) property of subsets of that is only satisfied by finite sets; the complement to this is a largeness property, which is an open property of subsets of that is satisfied by all infinite sets. (One could also choose to impose other axioms on these properties, for instance requiring a largeness property to be an upper set, but we will not do so here.) Examples of largeness properties for a subset of include:
We will call a set obeying a largeness property an -large set.
Theorem 1 is then equivalent to the following “almost finitary” version (cf. this previous discussion of almost finitary versions of the infinite pigeonhole principle):
Theorem 2 (Almost finitary form of main theorem) Let be a countably infinite abelian group with finite. Let be a Følner sequence in , let , and let be a largeness property for each . Then there exists such that if is such that for all , then there exists a shift and contains a -large set such that .
Proof of Theorem 2 assuming Theorem 1. Let , , be as in Theorem 2. Suppose for contradiction that Theorem 2 failed, then for each we can find with for all , such that there is no and -large such that . By compactness, a subsequence of the converges pointwise to a set , which then has Banach density at least . By Theorem 1, there is an infinite set and a such that . By openness, we conclude that there exists a finite -large set contained in , thus . This implies that for infinitely many , a contradiction.
Proof of Theorem 1 assuming Theorem 2. Let be as in Theorem 1. If the claim failed, then for each , the property of being a set for which would be a smallness property. By Theorem 2, we see that there is a and a obeying the complement of this property such that , a contradiction.
Remark 3 Define a relation between and by declaring if and . The key observation that makes the above equivalences work is that this relation is continuous in the sense that if is an open subset of , then the inverse image
is also open. Indeed, if for some , then contains a finite set such that , and then any that contains both and lies in .
For each specific largeness property, such as the examples listed previously, Theorem 2 can be viewed as a finitary assertion (at least if the property is “computable” in some sense), but if one quantifies over all largeness properties, then the theorem becomes infinitary. In the spirit of the Paris-Harrington theorem, I would in fact expect some cases of Theorem 2 to undecidable statements of Peano arithmetic, although I do not have a rigorous proof of this assertion.
Despite the complicated finitary interpretation of this theorem, I was still interested in trying to write the proof of Theorem 1 in some sort of “pseudo-finitary” manner, in which one can see analogies with finitary arguments in additive combinatorics. The proof of Theorem 1 that I give below the fold is my attempt to achieve this, although to avoid a complete explosion of “epsilon management” I will still use at one juncture an ergodic theory reduction from the original paper of Kra et al. that relies on such infinitary tools as the ergodic decomposition, the ergodic theory, and the spectral theorem. Also some of the steps will be a little sketchy, and assume some familiarity with additive combinatorics tools (such as the arithmetic regularity lemma).
— 1. Proof of theorem —
The proof of Kra et al. proceeds by establishing the following related statement. Define a (length three) combinatorial Erdös progression to be a triple of subsets of such that there exists a sequence in such that converges pointwise to and converges pointwise to . (By , we mean with respect to the cocompact filter; that is, that for any finite (or, equivalently, compact) subset of , for all sufficiently large .)
Theorem 4 (Combinatorial Erdös progression) Let be a countably infinite abelian group with finite. Let be a positive Banach density subset of . Then there exists a combinatorial Erdös progression with and non-empty.
Let us see how Theorem 4 implies Theorem 1. Let be as in Theorem 4. By hypothesis, contains an element of , thus and . Setting to be a sufficiently large element of the sequence , we conclude that and . Setting to be an even larger element of this sequence, we then have and . Setting to be an even larger element, we have and . Continuing in this fashion we obtain the desired infinite set .
It remains to establish Theorem 4. The proof of Kra et al. converts this to a topological dynamics/ergodic theory problem. Define a topological measure-preserving -system to be a compact space equipped with a Borel probability measure as well as a measure-preserving homeomorphism . A point in is said to be generic for with respect to a Følner sequence if one has
for all continuous . Define an (length three) dynamical Erdös progression to be a tuple in with the property that there exists a sequence such that and .
Theorem 4 then follows from
Theorem 5 (Dynamical Erdös progression) Let be a countably infinite abelian group with finite. Let be a topological measure-preserving -system, let be a -generic point of for some Følner sequence , and let be a positive measure open subset of . Then there exists a dynamical Erdös progression with and .
Indeed, we can take to be , to be , to be the shift , , and to be a weak limit of the for a Følner sequence with , at which point Theorem 4 follows from Theorem 5 after chasing definitions. (It is also possible to establish the reverse implication, but we will not need to do so here.)
A remarkable fact about this theorem is that the point need not be in the support of ! (In a related vein, the elements of the Følner sequence are not required to contain the origin.)
Using a certain amount of ergodic theory and spectral theory, Kra et al. were able to reduce this theorem to a special case:
Theorem 6 (Reduction) To prove Theorem 5, it suffices to do so under the additional hypotheses that is ergodic, and there is a continuous factor map to the Kronecker factor. (In particular, the eigenfunctions of can be taken to be continuous.)
We refer the reader to the paper of Kra et al. for the details of this reduction. Now we specialize for simplicity to the case where is a countable vector space over a finite field of size equal to an odd prime , so in particular ; we also specialize to Følner sequences of the form for some and . In this case we can prove a stronger statement:
Theorem 7 (Odd characteristic case) Let for an odd prime . Let be a topological measure-preserving -system with a continuous factor map to the Kronecker factor, and let be open subsets of with . Then if is a -generic point of for some Følner sequence , there exists an Erdös progression with and .
Indeed, in the setting of Theorem 5 with the ergodicity hypothesis, the set has full measure, so the hypothesis of Theorem 7 will be verified in this case. (In the case of more general , this hypothesis ends up being replaced with ; see Theorem 2.1 of this recent preprint of Kousek and Radic for a treatment of the case (but the proof extends without much difficulty to the general case).)
As with Theorem 1, Theorem 7 is still an infinitary statement and does not have a direct finitary analogue (though it can likely be expressed as the conjunction of infinitely many such finitary statements, as we did with Theorem 1). Nevertheless we can formulate the following finitary statement which can be viewed as a “baby” version of the above theorem:
Theorem 8 (Finitary model problem) Let be a compact metric space, let be a finite vector space over a field of odd prime order. Let be an action of on by homeomorphisms, let , and let be the associated -invariant measure . Let be subsets of with for some . Then for any , there exist such that
The important thing here is that the bounds are uniform in the dimension (as well as the initial point and the action ).
Let us now give a finitary proof of Theorem 8. We can cover the compact metric space by a finite collection of open balls of radius . This induces a coloring function that assigns to each point in the index of the first ball that covers that point. This then induces a coloring of by the formula . We also define the pullbacks for . By hypothesis, we have , and it will now suffice by the triangle inequality to show that
Now we apply the arithmetic lemma of Green with some regularity parameter to be chosen later. This allows us to partition into cosets of a subgroup of index , such that on all but of these cosets , all the color classes are -regular in the Fourier () sense. Now we sample uniformly from , and set ; as is odd, is also uniform in . If lies in a coset , then will lie in . By removing an exceptional event of probability , we may assume that neither of these cosetgs , is a bad coset. By removing a further exceptional event of probability , we may also assume that is in a popular color class of in the sense that
since the set of exceptional that fail to achieve this only are hit with probability . Similarly we may assume that
Now we consider the quantity
which we can write as
Both factors here are -uniform in their respective cosets. Thus by standard Fourier calculations, we see that after excluding another exceptional event of probabitiy , this quantity is equal to
By (1), (2), this expression is . By choosing small enough depending on , we can ensure that and , and the claim follows.
Now we can prove the infinitary result in Theorem 7. Let us place a metric on . By sparsifying the Følner sequence , we may assume that the grow as fast as we wish. Once we do so, we claim that for each , we can find such that for each , there exists that lies outside of such that
Passing to a subsequence to make converge to respectively, we obtain the desired Erdös progression.
Fix , and let be a large parameter (much larger than ) to be chosen later. By genericity, we know that the discrete measures converge vaguely to , so any point in the support in can be approximated by some point with . Unfortunately, does not necessarily lie in this support! (Note that need not contain the origin.) However, we are assuming a continuous factor map to the Kronecker factor , which is a compact abelian group, and pushes down to the Haar measure of , which has full support. In particular, thus pushforward contains . As a consequence, we can find such that converges to , even if we cannot ensure that converges to . We are assuming that is a coset of , so now converges vaguely to .
We make the random choice , , where is drawn uniformly at random from . This is not the only possible choice that can be made here, and is in fact not optimal in certain respects (in particular, it creates a fair bit of coupling between , ), but is easy to describe and will suffice for our argument. (A more appropriate choice, closer to the arguments of Kra et al., would be to in the above construction by , where the additional shift is a random variable in independent of that is uniformly drawn from all shifts annihilated by the first characters associated to some enumeration of the (necessarily countable) point spectrum of , but this is harder to describe.)
Since we are in odd characteristic, the map is a permutation on , and so , are both distributed according to the law , though they are coupled to each other. In particular, by vague convergence (and inner regularity) we have
and
where denotes a quantity that goes to zero as (holding all other parameters fixed). By the hypothesis , we thus have
for some independent of .
We will show that for each , one has
outside of an event of probability at most (compare with Theorem 8). If this is the case, then by the union bound we can find (for large enough) a choice of , obeying (3) as well as (4) for all . If the grow fast enough, we can then ensure that for each one can find (again for large enough) in the set in (4) that avoids , and the claim follows.
It remains to show (4) outside of an exceptional event of acceptable probability. Let be the coloring function from the proof of Theorem 8 (with ). Then it suffices to show that
where and . This is a counting problem associated to the patterm ; if we concatenate the and components of the pattern, this is a classic “complexity one” pattern, of the type that would be expected to be amenable to Fourier analysis (especially if one applies Cauchy-Schwarz to eliminate the averaging and absolute value, at which point one is left with the pattern ).
In the finitary setting, we used the arithmetic regularity lemma. Here, we will need to use the Kronecker factor instead. The indicator function of a level set of the coloring function is a bounded measurable function of , and can thus be decomposed into a function that is measurable on the Kronecker factor, plus an error term that is orthogonal to that factor and thus is weakly mixing in the sense that tends to zero on average (or equivalently, that the Host-Kra seminorm vanishes). Meanwhile, for any , the Kronecker-measurable function can be decomposed further as , where is a bounded “trigonometric polynomial” (a finite sum of eigenfunctions) and . The polynomial is continuous by hypothesis. The other two terms in the decomposition are merely meaurable, but can be approximated to arbitrary accuracy by continuous functions. The upshot is that we can arrive at a decomposition
(analogous to the arithmetic regularity lemma) for any , where is a bounded continuous function of norm at most , and is a bounded continuous function of norm at most (in practice we will take much smaller than ). Pulling back to , we then have
Let be chosen later. The trigonometric polynomial is just a sum of characters on , so one can find a subgroup of of index such that these polynomial are constant on each coset of for all . Then lies in some coset and lies in the coset . We then restrict to also lie in , and we will show that
outside of an exceptional event of proability , which will establish our claim because will ultimately be chosen to dependon .
The left-hand side can be written as
The coupling of the constraints and is annoying (as is an “infinite complexity” pattern that cannot be controlled by any uniformity norm), but (perhaps surprisingly) will not end up causing an essential difficulty to the argument, as we shall see when we start eliminating the terms in this sum one at a time starting from the right.
We decompose the term using (5):
By Markov’s inequality, and removing an exceptional event of probabiilty at most , we may assume that the have normalized norm on both of these cosets . As such, the contribution of to (6) become negligible if is small enough (depending on ). From the near weak mixing of the , we know that
for all , if we choose large enough. By genericity of , this implies that
From this and standard Cauchy-Schwarz (or van der Corput) arguments we can then show that the contribution of the to (6) is negligible outside of an exceptional event of probability at most , if is small enough depending on . Finally, the quantity is independent of , and in fact is equal up to negligible error to the density of in the coset . This density will be except for those which would have made a negligible impact on (6) in any event due to the rareness of the event in such cases. As such, to prove (6) it suffices to show that
outside of an event of probability . Now one can sum in to simplify the above estiamte to
If is such that is small compared with , then by genericity (and assuming large enough), the probability that will similarly be small (up to errors), and thus have a negligible influence on the above sum. As such, the above estimate simplifies to
But the left-hand side sums to one, and the claim follows.