Let $A$ be a non-empty finite set. If $X$ is a random variable taking values in $A$, the Shannon entropy ${\bf H}(X)$ of $X$ is defined as
$${\bf H}(X) := \sum_{a \in A} {\bf P}(X = a) \log \frac{1}{{\bf P}(X = a)}.$$
There is a nice variational formula that lets one compute logs of sums of exponentials in terms of this entropy:
Lemma 1 (Gibbs variational formula) Let $F: A \to {\bf R}$ be a function. Then
$$\log \sum_{a \in A} \exp(F(a)) = \sup_X \left( {\bf E} F(X) + {\bf H}(X) \right), \qquad (1)$$
where the supremum ranges over all random variables $X$ taking values in $A$.
Proof: Note that shifting $F$ by a constant affects both sides of (1) the same way, so we may normalize $\sum_{a \in A} \exp(F(a)) = 1$. Then $a \mapsto \exp(F(a))$ is now the probability distribution of some random variable $Y$, and the inequality can be rewritten as
$$\sum_{a \in A} {\bf P}(X = a) \log \frac{{\bf P}(Y = a)}{{\bf P}(X = a)} \leq 0.$$
But this is precisely the Gibbs inequality, with equality when $X$ has the same distribution as $Y$. (The expression inside the supremum can also be written as $-D_{KL}(X \| Y)$, where $D_{KL}$ denotes the Kullback-Leibler divergence. One can also interpret this inequality as a special case of the Fenchel–Young inequality relating the conjugate convex functions $p \mapsto p \log p - p$ and $q \mapsto e^q$.)
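For readers who like to experiment, here is a small numerical sanity check of (1) (not needed for the argument); the finite set, the function $F$, and the random distributions below are arbitrary illustrative choices, and the snippet is only a sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary function F on a small finite set A = {0,...,4} (illustrative choice).
F = rng.normal(size=5)

# Left-hand side of (1): log of the sum of exponentials of F.
lhs = np.log(np.exp(F).sum())

def objective(p, F):
    """E F(X) + H(X) for X distributed according to p (0 log 0 treated as 0)."""
    nz = p > 0
    return np.dot(p[nz], F[nz]) - np.dot(p[nz], np.log(p[nz]))

# Any distribution p on A gives E F(X) + H(X) <= lhs ...
for _ in range(1000):
    p = rng.dirichlet(np.ones(5))
    assert objective(p, F) <= lhs + 1e-9

# ... and the Gibbs distribution p(a) proportional to exp(F(a)) attains equality.
p_star = np.exp(F) / np.exp(F).sum()
assert abs(objective(p_star, F) - lhs) < 1e-9
print("Gibbs variational formula verified numerically:", lhs)
```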
In this note I would like to use this variational formula (which is also known as the Donsker-Varadhan variational formula) to give another proof of the following inequality of Carbery.
Theorem 2 (Generalized Cauchy-Schwarz inequality) Let $n \geq 0$, let $A, B_1, \dots, B_n$ be finite non-empty sets, and let $\pi_i: A \to B_i$ be functions for each $1 \leq i \leq n$. Let $f: A \to (0,+\infty)$ and $g_i: B_i \to (0,+\infty)$ be positive functions for each $1 \leq i \leq n$. Then
$$\sum_{a \in A} f(a) \prod_{i=1}^n g_i(\pi_i(a)) \leq Q^{\frac{1}{n+1}} \prod_{i=1}^n \Big( \sum_{b_i \in B_i} g_i(b_i)^{n+1} \Big)^{\frac{1}{n+1}},$$
where $Q$ is the quantity
$$Q := \sum_{(a_0,\dots,a_n) \in \Omega_n} f(a_0) \cdots f(a_n),$$
where $\Omega_n$ is the set of all tuples $(a_0,\dots,a_n) \in A^{n+1}$ such that $\pi_i(a_{i-1}) = \pi_i(a_i)$ for $1 \leq i \leq n$.
Thus for instance, the inequality is a trivial identity for $n=0$. When $n=1$, the inequality reads
$$\sum_{a \in A} f(a) g_1(\pi_1(a)) \leq \Big( \sum_{(a_0,a_1) \in A^2: \pi_1(a_0)=\pi_1(a_1)} f(a_0) f(a_1) \Big)^{\frac{1}{2}} \Big( \sum_{b_1 \in B_1} g_1(b_1)^2 \Big)^{\frac{1}{2}},$$
which is easily proven by Cauchy-Schwarz, while for $n=2$ the inequality reads
$$\sum_{a \in A} f(a) g_1(\pi_1(a)) g_2(\pi_2(a)) \leq \Big( \sum_{(a_0,a_1,a_2) \in A^3: \pi_1(a_0)=\pi_1(a_1), \pi_2(a_1)=\pi_2(a_2)} f(a_0) f(a_1) f(a_2) \Big)^{\frac{1}{3}} \Big( \sum_{b_1 \in B_1} g_1(b_1)^3 \Big)^{\frac{1}{3}} \Big( \sum_{b_2 \in B_2} g_2(b_2)^3 \Big)^{\frac{1}{3}},$$
which can also be proven by elementary means. However even for $n=2$, the existing proofs require the “tensor power trick” in order to reduce to the case when the functions $f, g_1, \dots, g_n$ are step functions (in which case the inequality can be proven elementarily, as discussed in the above paper of Carbery).
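As a quick brute-force sanity check of the $n=2$ case displayed above, one can test it on randomly generated data; in the sketch below the sets $A, B_1, B_2$, the maps $\pi_1, \pi_2$, and the positive weights are all arbitrary illustrative choices.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Illustrative choices: A = {0,...,7}, B_1 = B_2 = {0,1,2}, with arbitrary maps pi_1, pi_2.
A = range(8)
B1, B2 = range(3), range(3)
pi1 = {a: int(rng.integers(3)) for a in A}
pi2 = {a: int(rng.integers(3)) for a in A}

# Random positive weights f on A and g_1, g_2 on B_1, B_2.
f = {a: rng.random() + 0.1 for a in A}
g1 = {b: rng.random() + 0.1 for b in B1}
g2 = {b: rng.random() + 0.1 for b in B2}

# Left-hand side: sum_a f(a) g_1(pi_1(a)) g_2(pi_2(a)).
lhs = sum(f[a] * g1[pi1[a]] * g2[pi2[a]] for a in A)

# Q sums f(a_0) f(a_1) f(a_2) over tuples with pi_1(a_0) = pi_1(a_1) and pi_2(a_1) = pi_2(a_2).
Q = sum(
    f[a0] * f[a1] * f[a2]
    for a0, a1, a2 in itertools.product(A, repeat=3)
    if pi1[a0] == pi1[a1] and pi2[a1] == pi2[a2]
)

rhs = Q ** (1 / 3) * sum(g1[b] ** 3 for b in B1) ** (1 / 3) * sum(g2[b] ** 3 for b in B2) ** (1 / 3)
print(f"lhs = {lhs:.6f} <= rhs = {rhs:.6f}:", lhs <= rhs)
```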
We now prove this inequality. We write $f = e^{F}$ and $g_i = e^{G_i}$ for some functions $F: A \to {\bf R}$ and $G_i: B_i \to {\bf R}$. If we take logarithms in the inequality to be proven and apply Lemma 1, the inequality becomes
$$\sup_X \Big[ {\bf E} F(X) + \sum_{i=1}^n {\bf E} G_i(\pi_i(X)) + {\bf H}(X) \Big] \leq \frac{1}{n+1} \sup_{(X_0,\dots,X_n)} \Big[ \sum_{j=0}^n {\bf E} F(X_j) + {\bf H}(X_0,\dots,X_n) \Big] + \frac{1}{n+1} \sum_{i=1}^n \sup_{Y_i} \Big[ (n+1) {\bf E} G_i(Y_i) + {\bf H}(Y_i) \Big],$$
where $X$ ranges over random variables taking values in $A$, $(X_0,\dots,X_n)$ ranges over tuples of random variables taking values in $\Omega_n$, and each $Y_i$ ranges over random variables taking values in $B_i$. Comparing the suprema (for any given $X$, one takes $Y_i := \pi_i(X)$, so it suffices to exhibit a suitable tuple $(X_0,\dots,X_n)$), the claim now reduces to
Lemma 3 (Conditional expectation computation) Let $X$ be an $A$-valued random variable. Then there exists an $\Omega_n$-valued random variable $(X_0,\dots,X_n)$, where each $X_j$ has the same distribution as $X$, and
$${\bf H}(X_0,\dots,X_n) \geq (n+1) {\bf H}(X) - \sum_{i=1}^n {\bf H}(\pi_i(X)).$$
Proof: We induct on $n$. When $n=0$ we just take $X_0 := X$. Now suppose that $n \geq 1$, and the claim has already been proven for $n-1$, thus one has already obtained a tuple $(X_0,\dots,X_{n-1}) \in \Omega_{n-1}$ with each $X_j$ having the same distribution as $X$, and
$${\bf H}(X_0,\dots,X_{n-1}) \geq n {\bf H}(X) - \sum_{i=1}^{n-1} {\bf H}(\pi_i(X)).$$
By hypothesis, $X_{n-1}$ has the same distribution as $X$. For each value $b$ attained by $\pi_n(X_{n-1})$, we can take conditionally independent copies of $(X_0,\dots,X_{n-1})$ and $X$ conditioned to the events $\pi_n(X_{n-1}) = b$ and $\pi_n(X) = b$ respectively, and then concatenate them to form a tuple $(X_0,\dots,X_n)$ in $\Omega_n$, with $X_n$ a further copy of $X$ that is conditionally independent of $(X_0,\dots,X_{n-1})$ relative to $\pi_n(X_n) = \pi_n(X_{n-1})$. One can then use the entropy chain rule to compute
$${\bf H}(X_0,\dots,X_n) = {\bf H}(X_0,\dots,X_{n-1}) + {\bf H}(X_n | X_0,\dots,X_{n-1}) = {\bf H}(X_0,\dots,X_{n-1}) + {\bf H}(X_n | \pi_n(X_n)) = {\bf H}(X_0,\dots,X_{n-1}) + {\bf H}(X) - {\bf H}(\pi_n(X)),$$
and the claim now follows from the induction hypothesis.
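To illustrate the construction in the $n=1$ case, here is a short numerical sketch that builds the conditionally independent coupling $(X_0, X_1)$ from the proof and checks the entropy bound of Lemma 3 (which in fact holds with equality when $n=1$); the distribution of $X$ and the map $\pi_1$ below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def H(p):
    """Shannon entropy of a probability vector (0 log 0 treated as 0)."""
    p = p[p > 0]
    return -np.dot(p, np.log(p))

# Illustrative data: X takes values in A = {0,...,5}, pi_1 maps A to B_1 = {0,1,2}.
pX = rng.dirichlet(np.ones(6))          # distribution of X on A
pi1 = np.array([0, 0, 1, 1, 2, 2])      # an arbitrary map pi_1 : A -> B_1

# Distribution of pi_1(X) on B_1.
pB = np.array([pX[pi1 == b].sum() for b in range(3)])

# The n = 1 coupling from the proof: X_0, X_1 are copies of X that are
# conditionally independent given pi_1(X_0) = pi_1(X_1).
joint = np.zeros((6, 6))
for a0 in range(6):
    for a1 in range(6):
        if pi1[a0] == pi1[a1]:
            joint[a0, a1] = pX[a0] * pX[a1] / pB[pi1[a0]]

# Both marginals agree with the distribution of X, and
# H(X_0, X_1) >= 2 H(X) - H(pi_1(X))  (with equality for this coupling).
assert np.allclose(joint.sum(axis=1), pX) and np.allclose(joint.sum(axis=0), pX)
print(H(joint.ravel()), ">=", 2 * H(pX) - H(pB))
```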
With a little more effort, one can replace the finite sets here by more general measure spaces (and use differential entropy in place of Shannon entropy), to recover Carbery’s inequality in full generality; we leave the details to the interested reader.