The Hidden Flow of Information
Chung Chan

Abstract—An information identity is proven that equates the secrecy capacity of the multiterminal secret key agreement problem with the throughput of a certain undirected network. As a consequence, network coding can be used for secret key agreement, while secrecy capacity characterizes the network throughput. A meaningful notion of mutual dependence is established, with a combinatorial interpretation in terms of partition connectivity.
[Fig. 1. Possible flows between 1 and 2 in an undirected network: (a) source s := 1; (b) s := 2; (c) no flow possible.]

I. INTRODUCTION
Let Z1 and Z2 be two independent and uniformly random bits. If their mod-2 sum Z3 := Z1 + Z2 is revealed, both bits remain uniformly random. However, they become completely dependent in the sense that Z1 can be recovered from Z2, and vice versa, with the knowledge of Z3. In other words, Z1 and Z2 share one bit of information in the presence of Z3. We want to capture this notion of mutual dependence in general. Let ZV := (Zi : i ∈ V) be a random vector indexed by a finite set V. Can we measure the amount of information shared among the random variables in ZA with the help of ZA^c for some subset A ⊆ V : |A| ≥ 2? Is there a unifying notion of mutual dependence having different operational meanings?

Consider the framework of secret key agreement [3]. Suppose Z1, Z2 and Z3 defined earlier are random sources observed by users 1, 2 and 3 respectively. If user 3 reveals Z3 publicly to everyone, then user 2 can recover Z1 = Z2 + Z3. Since Z1 remains uniformly random to a wiretapper who observes only the public message Z3, it can be used as a secret key bit shared between users 1 and 2. Where does this secret key bit come from? Since users 1 and 2 agree on 2 bits of information, namely (Z1, Z2), after only 1 bit of public discussion, there must be 1 bit of information not obtained directly from the public message. This bit of secret information has to come from the correlation between Z1 and Z2, enhanced by Z3.

More generally, suppose user i observes some random source Zi for i ∈ V. Allow the users to discuss publicly until the set A of active users can share some common secret key bits. The maximum key rate is called the secrecy capacity. It is the additional information agreed upon by the active users that cannot be obtained directly from the public discussion. In this sense, secrecy capacity naturally captures the desired notion of mutual dependence.

Consider the 3-user network again.
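The two-bit example above can be checked numerically. The following sketch (in Python; the helper `H` and the variable names are ours, not part of the formal development) computes the relevant information quantities from the joint distribution:

```python
import math
from collections import defaultdict

# Joint distribution of (Z1, Z2, Z3) with Z3 := Z1 + Z2 (mod 2):
# the four outcomes of (Z1, Z2) are equally likely.
pmf = {(z1, z2, z1 ^ z2): 0.25 for z1 in (0, 1) for z2 in (0, 1)}

def H(coords):
    """Entropy in bits of the marginal of pmf on the given coordinates (0-indexed)."""
    marg = defaultdict(float)
    for z, p in pmf.items():
        marg[tuple(z[i] for i in coords)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

# Z1 and Z2 are independent: they share no information on their own ...
I_12 = H([0]) + H([1]) - H([0, 1])                             # I(Z1 ∧ Z2)
# ... and Z1 is independent of the public message Z3 ...
I_13 = H([0]) + H([2]) - H([0, 2])                             # I(Z1 ∧ Z3)
# ... yet conditioned on Z3 they become completely dependent.
I_12_given_3 = H([0, 2]) + H([1, 2]) - H([2]) - H([0, 1, 2])   # I(Z1 ∧ Z2 | Z3)
print(I_12, I_13, I_12_given_3)   # 0.0 0.0 1.0
```

The wiretapper's observation Z3 reveals nothing about Z1, yet I(Z1 ∧ Z2 | Z3) = 1 bit, which is exactly the secret key bit discussed above.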
Suppose users 1 and 3 reveal the public messages F1 := X̃1 + Z1 and F3 := X̃3 + Z3 respectively, for some arbitrary inputs (X̃1, X̃3) independent of (Z1, Z2, Z3). Since (F1, F3) remains uniformly random regardless of the realization of (X̃1, X̃3), one cannot learn anything about the inputs from the public messages. However, user 2 can recover the sum Z̃2 := X̃1 + X̃3 as F1 + F3 + Z2. There is effectively a private transmission link as shown in Fig. 1(a). If user 1 makes X̃1 uniformly random while user 3 sets X̃3 := 0, user 2 can recover X̃1 from Z̃2 = X̃1 + 0, i.e. there is one bit of secret information flow from the source node s := 1 to the sink node 2, and so users 1 and 2 agree on a secret key bit. Alternatively, by symmetry, one can obtain a transmission link with inputs from users 2 and 3 (Fig. 1(b)) or from users 1 and 2 (Fig. 1(c)). In the first case, one secret key bit is still achievable with user 2 acting as the source. This cannot be done in the second case because neither user 1 nor user 2 observes the output of the link.

In general, we have a network coding problem over an undirected network: 1) turn some given private source ZV into an effective private channel with a certain orientation, and 2) have each user i ∈ V multicast independent uniformly random bits to all active users in A at some rate Ri. Can the total information flow ∑_{i∈V} Ri into A reach the secrecy capacity? Is it enough to have just one source s ∈ A, i.e. Ri = 0 for i ≠ s? We will show that the answers are affirmative due to the combinatorial structure of information flow. Consequently, there are different characterizations and interpretations of the same notion of mutual dependence.

(Chung Chan ([email protected], [email protected]) is with the Institute of Network Coding, the Chinese University of Hong Kong. The manuscript is available online in [1], covering the related work in [2].)

II. MAIN RESULT

Given random variables Z1 and Z2 with joint distribution PZ1Z2, we write H(Z1) = E[− log PZ1(Z1)] for the entropy of Z1 and H(Z1|Z2) = H(Z1Z2) − H(Z2) for the conditional entropy given Z2. It is well known in information theory [4] that Z1 and Z2 share a mutual information I(Z1 ∧ Z2) defined as the divergence D(PZ1Z2 ∥ PZ1 PZ2) = H(Z1) + H(Z2) − H(Z1Z2). The main result is the following identity, which equates different interpretations of the same notion of mutual dependence in the multivariate case involving two or more random variables.

Theorem 1 Consider a random vector ZV := (Zi : i ∈ V) indexed by a finite ground set V. For all A ⊆ V : |A| ≥ 2
and s ∈ A, we have

  H(ZV) − min_{zV} z(V)                                                    (1a)
    = max_{xV} min_{B ⊆ V : s ∈ B ⊉ A} [H(ZB^c) − x(B^c)]                  (1b)
    = max_{xV} min_P ∑_{C∈P} [H(ZC) − x(C)] / (|P| − 1)                    (1c)
    = min_P D(PZV ∥ ∏_{C∈P} PZC) / (|P| − 1)    if A = V                   (1d)

where zV, xV ∈ R^V (with the notation z(B) := ∑_{i∈B} zi, and similarly for x(B), B ⊆ V) satisfy

  z(B) ≥ H(ZB | ZB^c)   for all B ⊆ V : B ⊉ A                              (2)
  x(B) ≤ H(ZB)   for all B ⊆ V,   and   x(V) = H(ZV)                       (3)

and P ⊆ 2^V satisfies |P| ≥ 2 and (C ∩ A : C ∈ P) is a partition of A into non-empty disjoint sets. The divergence D(·∥·) in (1d) equals ∑_{C∈P} H(ZC) − H(ZV). □

(1a) is the secrecy capacity, with min_{zV} z(V) being the smallest rate of communication for omniscience [3]. (1b) is the maximum rate at which a source node s can multicast information to all nodes in A \ {s} over a certain undirected network [5, Chapter 3]. (1c) is a generalized notion of partition connectivity in combinatorics [6], [7]. When A = V, (1c) simplifies to (1d), which is the information divergence from the joint distribution to the product of the marginal distributions, just like the mutual information for two random variables.

(1a) ≤ (1c) was pointed out in [3] and shown to be tight in [8]. The integer version of (1b) = (1c) was shown in [5, Theorem B.2] when ZV is a finite linear source [5, Definition 3.1], extending some results in [6], [7]. (1a) = (1b) = (1c) for a general ZV or A ⊊ V was not known before. It gives as a special case the rate region of the point-to-point undirected network in [9], [10], and more generally of any undirected finite linear network. Let us return to the previous 3-user network as a concrete example.

Example 1 Consider V := {1, 2, 3}, A := {1, 2}, s := 1 and the mod-2 sum Z3 := Z1 + Z2 of two independent uniformly random bits Z1 and Z2. The feasible zV to (2) is plotted in Fig. 2(a) as the upward polyhedron with the vertices (0, 0, 1) and (1, 1, 0). The point zV = (0, 0, 1) attains the optimal value of 1 bit for (1a). The triangular plane corresponds to the feasible xV to (3). Consider xV = (1, 0, 1) in particular. Fig. 2(b) is the corresponding network with inputs located at X1 := Z1 and X3 := Z3. (X2 is set to a constant implicitly.) More precisely, for any C ⊆ V, we have H(ZC) − x(C) = H(ZC | XC), which can be interpreted as the amount of information flow into C. The possible choices of B for (1b) are {1} and {1, 3}, while the choices of P for (1c) are {{1}, {2}}, {{1, 3}, {2}} and {{1}, {2, 3}}. All of them lead to the same optimal value of 1 bit, verifying the identity (1). □

[Fig. 2. Example 1 with s = 1 and A = {1, 2}: (a) the feasible xV to (3) and zV to (2); (b) the cut values.]

In the next section, we will introduce the combinatorial structure underlying the main result. A more general identity for submodular functions will be given in Section IV and proven in Section V. Theorem 1 will follow from Theorem 2 with f chosen as the entropy function (12).

III. PRELIMINARIES

Given a finite ground set V, we denote by 2^V the power set {B ⊆ V} of V, and by B^c the complement V \ B of a subset B ⊆ V. For a family F ⊆ 2^V of subsets of V, F̄ denotes the complement family {B^c : B ∈ F}. F is called a lattice family if

  B1, B2 ∈ F  =⇒  B1 ∩ B2, B1 ∪ B2 ∈ F                                     (4)

For A ⊆ V, F is called A-co-intersecting if for all B1, B2 ∈ F,

  B1 ∪ B2 ⊉ A  =⇒  B1 ∩ B2, B1 ∪ B2 ∈ F                                    (5)

and B ⊉ A for all B ∈ F. F is called a downset if

  B′ ⊆ B ∈ F  =⇒  B′ ∈ F                                                   (6)
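These definitions can be sanity-checked on a small ground set. The following Python sketch (helper names are ours) verifies that the family F = {B ⊆ V : B ⊉ A}, which appears in Proposition 1 below, is a downset and A-co-intersecting:

```python
from itertools import combinations

V = {1, 2, 3}
A = {1, 2}

def powerset(S):
    """All subsets of S as Python sets."""
    return [set(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

# The family F = {B ⊆ V : B does not contain A}.
F = [B for B in powerset(V) if not B >= A]

# Downset (6): every subset of a member is again a member.
is_downset = all(Bsub in F for B in F for Bsub in powerset(B))

# A-co-intersecting (5): B1 ∪ B2 ⊉ A implies intersection and union stay in F,
# and no member contains A.
is_co_intersecting = (all(B1 & B2 in F and B1 | B2 in F
                          for B1 in F for B2 in F if not (B1 | B2) >= A)
                      and all(not B >= A for B in F))

print(is_downset, is_co_intersecting)   # True True
```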
Proposition 1 F = {B ⊆ V : B ⊉ A} is the unique A-co-intersecting downset with ∪F := ∪_{B∈F} B = V. □

PROOF F is a downset because B′ ⊆ B ⊉ A implies B′ ⊉ A. It is A-co-intersecting because B1 ∪ B2 ⊉ A implies B1 ∩ B2 ⊉ A. To prove uniqueness, consider any A-co-intersecting downset G with ∪G = V. For any s ∈ A and B ∈ G, we have B \ {s} ∈ G since G is a downset. ∪_{B∈G} (B \ {s}) = V \ {s} because ∪G = V. V \ {s} ∈ G because G is A-co-intersecting. Since G is a downset, it must also contain all subsets of V \ {s} for any s ∈ A. In other words, G ⊇ {B ⊆ V : B ⊉ A}. The reverse inclusion holds because G is A-co-intersecting.

For A ⊆ V, define Λ(F, A) as the set of λ : F → R+ with

  ∑_{B ∈ F : i ∈ B} λ(B) = 1   for all i ∈ A                               (7)
With |A| ≥ 2, define Π(F, A) as the set of P ⊆ F̄ such that |P| ≥ 2 and (C ∩ A : C ∈ P) is a partition of A, i.e.

  ∀C ∈ P, C ∩ A ≠ ∅   and   ∀s ∈ A, ∃!C ∈ P, s ∈ C                         (8)
Given any P ∈ Π(F, A), we can construct a λ ∈ Λ(F, A) by setting λ(B) to 1/(|P| − 1) if B^c ∈ P, and to 0 otherwise. For a set function f : F → R and B1, B1 ∪ B2 ∈ F, we write f(B2|B1) := f(B1 ∪ B2) − f(B1). This gives the chain rule

  ∑_{j=1}^{k} f(Bj | B1 ∪ · · · ∪ B_{j−1}) = f(B1 ∪ · · · ∪ Bk)             (9)

for all B1, . . . , Bk ⊆ V such that B1 ∪ · · · ∪ Bj ∈ F for j ≤ k.
f is said to be submodular if

  f(B1) + f(B2) ≥ f(B1 ∩ B2) + f(B1 ∪ B2)                                  (10)

for all B1, B2 ∈ F : B1 ∩ B2, B1 ∪ B2 ∈ F. Equivalently, successive conditioning reduces f:

  f(B1 | B1 ∩ B2) ≥ f(B1 | B2)                                             (11)
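As a quick check of (10) for a concrete set function, take f(B) = H(ZB) for the three-bit source from the introduction and test all pairs of subsets (a Python sketch; helper names are ours):

```python
import math
from itertools import combinations
from collections import defaultdict

# Source of Example 1: Z3 := Z1 + Z2 (mod 2) for independent uniform bits.
pmf = {(z1, z2, z1 ^ z2): 0.25 for z1 in (0, 1) for z2 in (0, 1)}
V = {1, 2, 3}

def f(B):
    """f(B) = H(Z_B) in bits for B ⊆ {1, 2, 3}."""
    marg = defaultdict(float)
    for z, p in pmf.items():
        marg[tuple(z[i - 1] for i in sorted(B))] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

subsets = [set(c) for r in range(4) for c in combinations(sorted(V), r)]
# Submodularity (10), exhaustively over all pairs (with a small tolerance).
ok = all(f(B1) + f(B2) >= f(B1 & B2) + f(B1 | B2) - 1e-9
         for B1 in subsets for B2 in subsets)
print(ok)   # True
```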
For example, the entropy function h : 2^V → R of a random vector ZV, given by

  h(B) := H(ZB)   for B ⊆ V                                                (12)

is known [11] to be submodular (10) because of the fact that conditioning reduces entropy (11). For a set function f : 2^V → R with f(∅) ≥ 0, and a family F ⊆ 2^V, define the following polyhedra

  P(f) := {zV ∈ R^V : z(B) ≤ f(B), B ⊆ V}                                  (13)
  B(f) := {xV ∈ P(f) : x(V) = f(V)}                                        (14)
  Q(f, F) := {zV ∈ R^V : z(B) ≥ f(B|B^c), B ∈ F}                           (15)

with the convention that z(∅) = 0.¹ It follows that

  B(f) ⊆ Q(f, F) ∩ P(f)                                                    (16)
To argue this, suppose xV ∈ B(f). Then xV ∈ P(f) by definition. Since x(B^c) ≤ f(B^c) and x(V) = f(V), we have x(B) = f(V) − x(B^c) ≥ f(B|B^c) for every B ∈ F. This implies xV ∈ Q(f, F) as desired.

If f is submodular (10), P(f) is called a submodular polyhedron, while B(f) is a non-empty set called the base polyhedron of P(f). In particular, zV is a vertex of B(f) iff for some strict total order ≺ on V we have zi = f(i | {≺ i}) for i ∈ V, where {≺ i} := {j ∈ V : j ≺ i}. See [12, Part IV] for more details. A simple property of P(f) is as follows.

Proposition 2 Define for zV ∈ P(f)

  T(f, zV) := {B ⊆ V : z(B) = f(B)}                                        (17)
Then T(f, zV) is a lattice family (4) if f is submodular (10). □

PROOF For B1, B2 ∈ T(f, zV) and zV ∈ P(f),

  f(B1) + f(B2) = z(B1) + z(B2)                       by (17)
                = z(B1 ∩ B2) + z(B1 ∪ B2)
                ≤ f(B1 ∩ B2) + f(B1 ∪ B2)             by (13)

The reverse inequality holds, as desired, if f is submodular.

For any subsets Q1, Q2 ⊆ R^V, we write

  (Q1 − Q2) := {x − y : x ∈ Q1, y ∈ Q2}
  (Q1)+ := {z ≥ 0 : z ∈ Q1}

where z ≥ 0 means that every element of z is non-negative. For xV ∈ B(f), define the polyhedra

  G(f, F, xV) := ({xV} − Q(f, F))+
               = {yV ≥ 0 : y(B) ≤ f(B^c) − x(B^c), B ∈ F}                  (18)
  G(f, F) := ∪_{xV ∈ B(f)} ({xV} − Q(f, F))+
           = (B(f) − Q(f, F) ∩ P(f))+                                      (19)
¹ Since f(∅) ≥ 0 and f(∅|V) = 0, the constraint with B = ∅ for P(f) or Q(f, F) is trivially satisfied.
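The greedy vertex characterization of B(f) above can be illustrated for f = h and the source of Example 1 (a Python sketch; helper names are ours):

```python
import math
from collections import defaultdict

# Source of Example 1: Z3 := Z1 + Z2 (mod 2) for independent uniform bits.
pmf = {(z1, z2, z1 ^ z2): 0.25 for z1 in (0, 1) for z2 in (0, 1)}

def h(B):
    """Entropy function h(B) = H(Z_B) in bits, cf. (12)."""
    marg = defaultdict(float)
    for z, p in pmf.items():
        marg[tuple(z[i - 1] for i in sorted(B))] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def vertex(order):
    """Greedy vertex of the base polyhedron B(h): z_i = h(i | {≺ i})."""
    z, prev = {}, set()
    for i in order:
        z[i] = h(prev | {i}) - h(prev)
        prev = prev | {i}
    return z

v1 = vertex([1, 2, 3])
v2 = vertex([3, 1, 2])
print(v1, v2)   # each base sums to h(V) = 2 bits
```

Different total orders ≺ give different vertices, but every one of them satisfies z(V) = h(V), as required by (14).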
To explain (18), it suffices to show that for any yV = xV − zV, we have y(B) ≤ f(B^c) − x(B^c) iff z(B) ≥ f(B|B^c). This holds because for all B ∈ F,

  y(B) = x(B) − z(B) = [f(V) − z(B)] − x(B^c)

since x(V) = f(V). To prove ⊆ for (19), n.b. any yV ∈ G(f, F) by definition has an xV ∈ B(f) and zV ∈ Q(f, F) such that yV = xV − zV ≥ 0. Thus, zV = xV − yV ∈ P(f) because xV ∈ P(f) by (14) and reducing xV by yV ≥ 0 does not violate any constraint of P(f) in (13).

IV. A GENERAL IDENTITY FOR SUBMODULAR FUNCTIONS
Because Theorem 1 can be rewritten in terms of the entropy function h in (12), a natural question to ask is whether it holds for more general set functions. What is the fundamental property of h that gives rise to the result? It turns out that the identity (1) applies equally well to the differential entropy [4] of continuous random variables and, more generally, to any submodular function f with f(∅) ≥ 0.²

Theorem 2 If we have for some A ⊆ V : |A| ≥ 2
1) F = {B ⊆ V : B ⊉ A}, and
2) f : 2^V → R is submodular (10) with f(∅) ≥ 0,
then for all s ∈ A

  f(V) − min_{zV ∈ Q(f,F)} z(V)                                            (20a)
    = max_{xV ∈ B(f)} min_{B ⊆ V : s ∈ B ⊉ A} [f(B^c) − x(B^c)]            (20b)
    = max_{xV ∈ B(f)} min_{P ∈ Π(F,A)} ∑_{C∈P} [f(C) − x(C)] / (|P| − 1)   (20c)
    = min_{P ∈ Π(F,V)} [∑_{C∈P} f(C) − f(V)] / (|P| − 1)    if A = V       (20d)

where Q, B, Π are defined in (15), (14), (8) respectively. Given an optimal zV to (20a), we can obtain an optimal xV to (20b) and (20c) with xs = zs + f(V) − z(V) and xi = zi for i ≠ s.³ □

An optimal zV, B, P to the minimizations in (20a), (20b), (20c) respectively can be computed in polynomial time by the ellipsoid method, given that f can be evaluated efficiently.⁴ If f is integer, [5, Theorem B.2] implies the integer version of the identity, namely (20b) = ⌊(20c)⌋ when the maximizations are over the set of integer bases xV ∈ B(f) ∩ Z^V instead.⁵ Stronger statements can be made when the structure of f can be captured by a hypergraph [5, Appendix B.3]. Instead of proving Theorem 2 in one lot, we first give three lemmas which build up different parts of the desired identity.

² In particular, non-Shannon-type inequalities [4] are not necessary.
³ It can also be shown that an optimal xV and zV exist with xs maximized to f({s}) and z({s}^c) minimized to f({s}^c | s). More precisely, the minimization in (20b) is not reduced by increasing xs and decreasing xj for some j ≠ s. Maximizing xs this way, we eventually have for all j ≠ s and some Bj ∋ s : j ∉ Bj that x(Bj) = f(Bj), i.e. Bj ∈ T(f, xV). By Proposition 2, {s} = ∩_{j≠s} Bj ∈ T(f, xV), implying xs = f({s}).
⁴ See [12, Chapter 5.11]. It can be shown that (20b) is a submodular function minimization (SFM), that (20a) has a separation oracle that corresponds to some SFM, and that (20c) can also be solved using a linear program whose dual program has a separation oracle that corresponds to some SFM.
⁵ If f and (20a) are integer and A = V, it follows that an integer optimal zV exists. See also [12, Corollary 49.3a] for a related result.
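As an illustration of (20d), take f = h for the mod-2 sum source of Example 1 but with A = V = {1, 2, 3}. A brute-force enumeration over set partitions (a Python sketch; the partition generator is our helper) evaluates the right-hand side of (20d) to half a bit:

```python
import math
from collections import defaultdict

# Source of Example 1: Z3 := Z1 + Z2 (mod 2) for independent uniform bits.
pmf = {(z1, z2, z1 ^ z2): 0.25 for z1 in (0, 1) for z2 in (0, 1)}
V = {1, 2, 3}

def h(B):
    """Entropy function h(B) = H(Z_B) in bits, cf. (12)."""
    marg = defaultdict(float)
    for z, p in pmf.items():
        marg[tuple(z[i - 1] for i in sorted(B))] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def partitions(S):
    """All set partitions of S, built recursively."""
    S = list(S)
    if not S:
        yield []
        return
    first, rest = S[0], S[1:]
    for P in partitions(rest):
        for k in range(len(P)):
            yield P[:k] + [P[k] | {first}] + P[k + 1:]
        yield P + [{first}]

# (20d): minimize (sum_C h(C) - h(V)) / (|P| - 1) over partitions with |P| >= 2.
vals = [(sum(h(C) for C in P) - h(V)) / (len(P) - 1)
        for P in partitions(V) if len(P) >= 2]
print(min(vals))   # 0.5
```

The minimizing partition is {{1}, {2}, {3}}, so when all three users are active the secrecy capacity drops to 1/2 bit, in contrast with the 1 bit achievable for A = {1, 2} in Example 1.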
Lemma 1 If for some S ⊆ V : S ≠ ∅
1) F ⊇ 2^{V\S}, and
2) |S| = 1 or f is submodular with f(∅) ≥ 0,
then

  f(V) − min_{zV ∈ Q(f,F) ∩ P(f)} z(V) = max_{yV ∈ G(f,F)} y(V) = max_{yV ∈ G(f,F)} y(S)

Every optimal yV to the last maximization has y(S^c) = 0. □

Lemma 2 If we have
1) F ⊆ 2^V is a downset (6), and
2) f is submodular with f(∅) ≥ 0,
then

  min_{zV ∈ Q(f,F) ∩ P(f)} z(V) = min_{zV ∈ Q(f,F)} z(V)

Any finite optimal zV to the right is also optimal to the left. □

Lemma 3 (See [5, Theorem B.1]) If for A ⊆ V : |A| ≥ 2
1) F is A-co-intersecting (5), and
2) g : F → R is submodular,
then

  min_{λ ∈ Λ(F,A)} ∑_{B ∈ F} λ(B) g(B) = min_{P ∈ Π(F,A)} ∑_{C ∈ P} g(C^c) / (|P| − 1)

where Λ, Π are defined in (7), (8) respectively.⁶ □

Lemmas 2 and 3 together require that F is an A-co-intersecting downset. By Proposition 1, the F in Theorem 2 is indeed the only such choice that covers all elements of V. This choice also satisfies Lemma 1 for any S ⊆ A. With the help of the strong duality theorem [13] for linear programs, Lemmas 1 and 2 will establish (20a) = (20b) with S = {s}, while Lemma 3 will give (20c). The details are as follows.

V. PROOFS

PROOF (LEMMA 1) As yV ≥ 0 for yV ∈ G(f, F),

  max_{yV ∈ G(f,F)} y(S) ≤ max_{yV ∈ G(f,F)} y(V)
    ≤(a) max_{xV ∈ B(f), zV ∈ Q(f,F) ∩ P(f)} [x(V) − z(V)]
    =(b) f(V) − min_{zV ∈ Q(f,F) ∩ P(f)} z(V)

(a) and (b) follow from (19) and x(V) = f(V) by (14). It remains to prove the reverse inequality, i.e. for every zV ∈ Q(f, F) ∩ P(f) there exists some yV ∈ G(f, F) such that y(S) = y(V) = f(V) − z(V). It suffices to find some yV ≥ 0 satisfying

  y(S^c) = 0   and   zV + yV ∈ B(f)

because the former implies y(S) = y(V), and the latter implies y(V) = f(V) − z(V) and yV ∈ G(f, F) as desired.

Consider the case |S| = 1 first. Define yV with yi := f(V) − z(V) for i ∈ S and 0 otherwise. Since yV ≥ 0, y(S^c) = 0 and y(V) + z(V) = f(V), it remains to prove that z(B) + y(B) ≤ f(B) for B ⊆ V. By the definition of yV, z(B) + y(B) equals f(V) − z(B^c) for B ⊇ S, and it equals z(B) otherwise. Since zV ∈ P(f) implies z(B) ≤ f(B), we need only focus on the case B ⊇ S and prove that f(V) − z(B^c) ≤ f(B). By the assumption that F ⊇ 2^{V\S}, we have B^c ∈ F. zV ∈ Q(f, F) then implies z(B^c) ≥ f(B^c|B) = f(V) − f(B) as desired.

Consider the alternative case when f is submodular but S can be any non-empty subset of V. Consider any yV ≥ 0 : y(S^c) = 0 satisfying

  xV := zV + yV ∈ P(f)   and   T := T(f, xV) ⊇ S

where T is defined in (17). Such a yV can be constructed greedily as follows.⁷ Starting with yV = 0, we have xV = zV ∈ P(f). If there exists i ∈ S \ T, i.e. x(B) < f(B) for all B ∋ i, then increase yi until x(B) = f(B) for some B ∋ i. For this new yV, we still have xV ∈ P(f), but T increases strictly to contain i. Repeating this procedure, we eventually obtain the desired yV with S \ T = ∅. n.b.

  x(V) = x(T) + x(T^c) =(a) f(T) + x(T^c) ≥(b) f(T) + f(T^c|T) = f(V)

(a) is because x(T) = f(T) by Proposition 2. (b) is because zV ∈ Q(f, F) implies xV ∈ Q(f, F), and T ⊇ S implies T^c ∈ F by the assumption that F ⊇ 2^{V\S}; thus x(T^c) ≥ f(T^c|T). Finally, xV ∈ P(f) implies that the inequality is satisfied with equality, and so xV = zV + yV ∈ B(f).

PROOF (LEMMA 2) Consider the non-trivial case when the minimum is finite.⁸ We want to prove that any optimal zV ∈ Q(f, F) that minimizes z(V) is also in P(f). Suppose to the contrary that an optimal zV has for some U ⊆ V that z(U^c) > f(U^c). To reach the desired contradiction, we will construct z′V ∈ Q(f, F) with z′(V) < z(V). Define z′V with

  z′i := f(i | {≺ i} \ U)   for i ∈ U^c

for some strict total order ≺ on V, and z′U chosen as an optimal solution that minimizes z′(U) subject to

  z′(B) ≥ f(B|B^c)   for all B ∈ F : B ⊆ U

n.b. z′(U^c) = f(U^c) by (9), and z′(U) ≤ z(U) by optimality. Thus,

  z′(V) = f(U^c) + z′(U) ≤ f(U^c) + z(U) < z(V)

by the assumption that z(U^c) > f(U^c). It remains to show that z′V ∈ Q(f, F), i.e. z′(B) ≥ f(B|B^c) for all B ∈ F.

  z′(B \ U) = ∑_{i ∈ B\U} f(i | {≺ i} \ U)
            ≥ ∑_{i ∈ B\U} f(i | ({≺ i} ∩ (B \ U)) ∪ B^c)   by (11)

which equals f(B \ U | B^c) by (9). Since B ∩ U ∈ F by the assumption that F is a downset, we have z′(B ∩ U) ≥ f(B ∩ U | B^c ∪ (B \ U)) by the definition of z′U. Hence,

  z′(B) = z′(B \ U) + z′(B ∩ U) ≥ f(B \ U | B^c) + f(B ∩ U | B^c ∪ (B \ U))

which equals f(B|B^c) by (9) as desired.

⁶ See also [12, Theorem 49.3] for a related result.
⁷ Indeed, the set of all possible yS is the base polyhedron B(g) ⊆ R^S_+ with ground set S and submodular function g : 2^S → R defined as g(B′) := min_{B′ ⊆ B ⊆ V} [f(B) − z(B)] for B′ ⊆ S.
⁸ Consider the degenerate case when min_{zV ∈ Q(f,F)} z(V) = −∞. Since V is finite, there exists j ∈ V such that j ∉ B for all B ∈ F. Pick any yV ≥ 0 : y(V \ {j}) = 0 and xV ∈ B(f). This is possible because B(f) ≠ ∅, for f is submodular with f(∅) ≥ 0. By (16), zV := xV − yV is in Q(f, F) ∩ P(f), but z(V) = f(V) − yj, which can be arbitrarily small by choosing yj > 0. Hence, min_{zV ∈ Q(f,F) ∩ P(f)} z(V) = −∞ as desired.
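The greedy construction in the proof of Lemma 1 can be traced on Example 1 with f = h, S = {s} = {1} and the optimal zV = (0, 0, 1) of (1a) (a Python sketch; helper names are ours):

```python
import math
from itertools import combinations
from collections import defaultdict

# Source of Example 1: Z3 := Z1 + Z2 (mod 2) for independent uniform bits.
pmf = {(z1, z2, z1 ^ z2): 0.25 for z1 in (0, 1) for z2 in (0, 1)}
V = {1, 2, 3}
subsets = [set(c) for r in range(1, 4) for c in combinations(sorted(V), r)]

def h(B):
    """Entropy function h(B) = H(Z_B) in bits, cf. (12)."""
    marg = defaultdict(float)
    for z, p in pmf.items():
        marg[tuple(z[i - 1] for i in sorted(B))] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def greedy_y(z, S):
    """Raise each y_i, i in S, until some constraint x(B) <= h(B), B ∋ i, is tight."""
    y = {i: 0.0 for i in V}
    for i in S:
        x = {j: z[j] + y[j] for j in V}
        y[i] += min(h(B) - sum(x[j] for j in B) for B in subsets if i in B)
    return y

z = {1: 0.0, 2: 0.0, 3: 1.0}      # an optimal point of (1a) in Example 1
y = greedy_y(z, S={1})
print(y[1])                       # 1.0 = h(V) - z(V), the secrecy capacity
```

The procedure raises y1 until the constraint for B = {1} (equivalently {1, 3} or V) becomes tight, recovering ys = f(V) − z(V) = 1 bit as in (21).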
PROOF (THEOREM 2) By (8), every P ∈ Π(F, V) is a set partition of V, and so ∑_{C∈P} x(C) = x(V) = f(V). This gives (20d) when A = V. For any s ∈ A, we have F ⊇ 2^{V\{s}}, which satisfies the premise of Lemma 1 with S = {s}. The premises of Lemma 2 are also satisfied by Proposition 1. By Lemmas 1, 2 and (19),

  f(V) − min_{zV ∈ Q(f,F)} z(V) = max_{yV ∈ G(f,F)} ys = max_{xV ∈ B(f)} max_{yV ∈ G(f,F,xV)} ys    (21)

which is finite since zV ∈ Q(f, F) implies for all i ∈ V that zi ≥ f({i} | {i}^c) ≥ 0. Consider some optimal zV and yV. By optimality, ys = f(V) − z(V) and y(V \ {s}) = 0. Then zV + yV gives an optimal xV as desired, because it is in B(f) by the proof of Lemma 1. (20b) follows directly from (21) as

  ys =(a) min_{B ∈ F : s ∈ B} [f(B^c) − x(B^c)] =(b) min_{B ⊆ V : s ∈ B ⊉ A} [f(B^c) − x(B^c)]

(b) is by the definition of F. ≤ for (a) follows from (18). Since xV ∈ B(f) implies that f(B^c) − x(B^c) ≥ 0, equality for (a) can be achieved by setting yi = 0 for i ≠ s.

To prove (20c), n.b. by the strong duality theorem [13],

  ys = min_λ ∑_{B ∈ F} λ(B) [f(B^c) − x(B^c)]

where λ : F → R+ satisfies ∑_{B ∋ s} λ(B) ≥ 1. By restricting λ to satisfy ∑_{B ∋ i} λ(B) = 1 for all i in some U ⊆ V : s ∈ U,

  ys ≤ b(U) := min_{λ ∈ Λ(F,U)} ∑_{B ∈ F} λ(B) [f(B^c) − x(B^c)]

where Λ(F, U) is defined in (7). As b(U) is non-decreasing, we have ys ≤ b(A) ≤ b(V). We will show that b(A) and b(V) simplify to (20c) and (20a) respectively, and so (21) implies that ys = b(A) = b(V) as desired. b(A) equals (20c) by Lemma 3 because F and g(B) := f(B^c) − x(B^c) for B ∈ F satisfy the premises by Proposition 1 and the submodularity of f. Rewriting g(B) as x(B) − f(B|B^c) and ∑_{B} λ(B) x(B) as ∑_{i ∈ V} xi ∑_{B ∋ i} λ(B),

  b(V) = min_{λ ∈ Λ(F,V)} [ ∑_{i ∈ V} xi ∑_{B ∋ i} λ(B) − ∑_{B ∈ F} λ(B) f(B|B^c) ]
       = f(V) − max_{λ ∈ Λ(F,V)} ∑_{B ∈ F} λ(B) f(B|B^c)

because ∑_{B ∋ i} λ(B) = 1 and x(V) = f(V). This equals (20a) by the strong duality theorem as desired.

VI. CONCLUSION

In sports, players in one team often signal for an attack without their opponents knowing it. Underpinning this hidden flow of information is the tacit understanding among the players, developed over a long period of time. In a way, this intuitive notion of mutual dependence is captured in a mathematical setting by relating secret key agreement to information flow in an undirected network. A better understanding of the underlying combinatorial structure is likely to result from further studies of the secret key agreement problem under different agreement and secrecy criteria, and different models for the public and private observations.
ACKNOWLEDGMENT

The author would like to thank his colleagues at INC, Professors Sidharth Jaggi, Robert Li, Anthony So, Raymond Yeung and Angela Zhang, for their valuable comments.

REFERENCES

[1] C. Chan, publications. http://chungc.net63.net/pub, http://goo.gl/4YZLT.
[2] ——, "The hidden flow of information," in 2011 IEEE International Symposium on Information Theory Proceedings (ISIT 2011), St. Petersburg, Russia, Jul. 2011, see [1].
[3] I. Csiszár and P. Narayan, "Secrecy capacities for multiple terminals," IEEE Transactions on Information Theory, vol. 50, no. 12, Dec. 2004.
[4] R. W. Yeung, Information Theory and Network Coding. Springer, 2008.
[5] C. Chan, "Generating secret in a network," Ph.D. dissertation, Massachusetts Institute of Technology, 2010, see [1].
[6] A. Frank, T. Király, and M. Kriesell, "On decomposing a hypergraph into k-connected sub-hypergraphs," Discrete Applied Mathematics, vol. 131, no. 2, pp. 373–383, Sep. 2003.
[7] J. Bang-Jensen and S. Thomassé, "Decompositions and orientations of hypergraphs," Preprint no. 10, Department of Mathematics and Computer Science, University of Southern Denmark, May 2001.
[8] C. Chan and L. Zheng, "Mutual dependence for secret key agreement," in Proceedings of the 44th Annual Conference on Information Sciences and Systems (CISS), 2010, see [1].
[9] Z. Li and B. Li, "Network coding in undirected networks," in Proceedings of the 38th Annual Conference on Information Sciences and Systems (CISS), 2004.
[10] J. Goseling, C. Fragouli, and S. N. Diggavi, "Network coding for undirected information exchange," IEEE Communications Letters, vol. 13, no. 1, Jan. 2009.
[11] S. Fujishige, "Polymatroidal dependence structure of a set of random variables," Information and Control, vol. 39, no. 1, pp. 55–72, 1978.
[12] A. Schrijver, Combinatorial Optimization: Polyhedra and Efficiency. Springer, 2002.
[13] G. B. Dantzig and M. N. Thapa, Linear Programming 1: Introduction. Springer-Verlag New York, 1997.