Compressing Encrypted Data ECE 559RB Cryptography Siva Theja Maguluri

Outline
• Introduction
• Distributed Source Coding
  - Lossless Compression: Slepian-Wolf
  - Compression with Fidelity Criterion: Wyner-Ziv
• Information Theoretic Security
• Compression of Encrypted Data
• Computer Simulations
  - Lossless compression of binary data
  - Lossy compression of real-valued data
• Conclusions

5/5/2009


Introduction
• To transmit redundant data over an insecure, bandwidth-constrained channel, the conventional approach is to compress first and then encrypt.
• This talk considers reversing the order: encrypt first, then compress.


Introduction
• The compressor does not have access to the key.
• At first glance, it appears that not much gain can be obtained, because encrypted data looks quite random.
• But decompression and decryption are performed jointly, so the decoder has access to the key.
• It turns out that significant compression gain can be obtained, using results from distributed source coding theory.
• In some cases, the gain is the same as in conventional compression followed by encryption.
• Application: a scenario where data is being distributed over a network.


Distributed Source Coding: Lossless
• Goal: compress sources Y and K that are correlated but whose encoders cannot communicate with each other.
• Lossless case: discrete sources.
• Special case: K is available at the decoder and is correlated with Y.
• Slepian-Wolf result: Y can be compressed to a rate of H(Y|K) whether K is available at both the encoder and the decoder or at the decoder only.


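Lossless Source Coding Example
• K is known at both the encoder and the decoder.
• Y and K are uniformly distributed binary words of length 3.
• Y and K differ in at most one position, i.e. their Hamming distance is at most 1.
• The encoder transmits the index of the error pattern e = Y + K (bitwise addition modulo 2), which lies in {000, 001, 010, 100} and therefore takes 2 bits.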
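The 2-bit figure can be checked against the Slepian-Wolf rate H(Y|K). The sketch below is illustrative only: it assumes the four error patterns are equally likely (the slide does not state their distribution) and computes the entropies by direct enumeration.

```python
from collections import Counter
from itertools import product
from math import log2

# Assumption: the four admissible error patterns are equally likely.
ERRORS = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]

def entropy(counts):
    """Entropy in bits of the distribution given by a Counter of occurrences."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

# Enumerate every equally likely (K, Y) pair with Y = K + e (mod 2).
joint = Counter()
for k in product((0, 1), repeat=3):
    for e in ERRORS:
        y = tuple(a ^ b for a, b in zip(k, e))
        joint[(k, y)] += 1

h_joint = entropy(joint)
h_k = entropy(Counter(k for k, _ in joint.elements()))
h_y = entropy(Counter(y for _, y in joint.elements()))
h_y_given_k = h_joint - h_k  # chain rule: H(Y|K) = H(K, Y) - H(K)

print(f"H(Y) = {h_y:.1f} bits, H(Y|K) = {h_y_given_k:.1f} bits")
```

Under this assumption Y alone needs 3 bits, while knowledge of K reduces the requirement to 2 bits, matching the index of the error pattern.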


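Example Continued…
• K is known only at the decoder.
• The encoder cannot compute e, but that is not necessary.
• The encoder does not distinguish between a word and its complement (000 and 111, 001 and 110, etc.).
• These pairs are the cosets of the repetition code {000, 111}; together they cover the entire space.
• The encoder uses the index of the coset as the encoding: 2 bits again.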
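A minimal sketch of this coset scheme follows. The function names and the ordering of the coset indices are illustrative choices, not taken from the slides. The encoder never sees K; the decoder resolves the ambiguity within the coset by picking the member closest to K.

```python
from itertools import product

WORDS = list(product((0, 1), repeat=3))

def hamming(a, b):
    """Number of positions in which two words differ."""
    return sum(x != y for x, y in zip(a, b))

def complement(w):
    return tuple(1 - b for b in w)

# The four cosets of the repetition code {000, 111}:
# each coset holds a word together with its bitwise complement.
COSETS = sorted({tuple(sorted((w, complement(w)))) for w in WORDS})

def encode(y):
    """Return the coset index of y (0..3, i.e. 2 bits). K is not needed."""
    return next(i for i, coset in enumerate(COSETS) if y in coset)

def decode(index, k):
    """Pick the coset member closest to the side information k."""
    return min(COSETS[index], key=lambda w: hamming(w, k))

# Exhaustive check over every (Y, K) pair at Hamming distance at most 1.
all_decoded = all(decode(encode(y), k) == y
                  for y in WORDS for k in WORDS if hamming(y, k) <= 1)
print(len(COSETS), all_decoded)
```

Decoding is unambiguous because the two members of a coset are at Hamming distance 3 from each other, so at most one of them can be within distance 1 of K.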


Example Continued…
• Suppose X is a random variable taking values in {000, 001, 010, 100}, K is a one-time pad, and Y = X + K.
• The Hamming distance between Y and K is at most 1.
• The same construction compresses Y to 2 bits: since the decoder has access to K, it can decode Y and recover X = Y + K.
• In the general case, partition the space into cosets associated with the syndromes of the principal underlying channel code (the repetition code here).
• Encoding: compute the syndrome of Y with respect to the channel code.
• Channel code: chosen according to the correlation between Y and K.
• Decoding: identify the word closest to K in the coset corresponding to the transmitted syndrome.
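The encrypt-then-compress pipeline for this toy case can be sketched end to end as follows. The particular parity-check matrix H is one valid choice for the length-3 repetition code, and the helper names are hypothetical; the slides specify neither.

```python
from itertools import product

# One parity-check matrix of the repetition code {000, 111} (an assumed choice).
H = [(1, 1, 0), (0, 1, 1)]
WORDS = list(product((0, 1), repeat=3))
MESSAGES = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]  # values of X

def xor(a, b):
    return tuple(x ^ y for x, y in zip(a, b))

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def syndrome(word):
    """Compute H * word over GF(2); the result is 2 bits."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def compress(y):
    """Compressor: sees only the ciphertext y, never the key."""
    return syndrome(y)

def decompress(s, k):
    """Decoder: word closest to k within the coset that has syndrome s."""
    coset = [w for w in WORDS if syndrome(w) == s]
    return min(coset, key=lambda w: hamming(w, k))

def round_trip(x, k):
    y = xor(x, k)              # encrypt with the one-time pad
    s = compress(y)            # 3 bits -> 2 bits, without the key
    y_hat = decompress(s, k)   # joint decompression using the key
    return xor(y_hat, k)       # decrypt

all_recovered = all(round_trip(x, k) == x for x in MESSAGES for k in WORDS)
print(all_recovered)
```

The check is exhaustive over all 4 messages and all 8 keys, so in this toy setting the recovery is exact rather than probabilistic.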


Distributed Source Coding: Lossy
• Wyner-Ziv extends Slepian-Wolf to lossy coding under a distortion measure.
• The source may be discrete or continuous.
• We focus on the real line with mean squared error.


Compression with Fidelity Criterion - Example
• Y is uniformly distributed on [−9δ/2, 9δ/2].
• The side information K satisfies |Y − K| < δ.
• The encoder quantizes Y to Y′ with step size δ, so |Y − Y′| ≤ δ/2.
• The quantizer can be viewed as three interleaved quantizers (cosets), each with step size 3δ.
• The encoder transmits the label of the coset: log₂ 3 bits.
• By the triangle inequality, |Y′ − K| ≤ |Y′ − Y| + |Y − K| < δ/2 + δ = 3δ/2.


Example Continued…
• The decoder finds the reconstruction level closest to K that carries the received label, and thereby recovers Y′.
• log₂ 3 ≈ 1.58 bits suffice for reconstruction to within δ/2. In the absence of K, log₂ 9 ≈ 3.17 bits would have been needed.
• Performance can be improved using more complex alternatives.
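The scalar example can be simulated in a few lines. This is a sketch under stated assumptions: δ is set to 1, the nine reconstruction levels are placed at the bin centres −4δ, …, 4δ, and the label is the level index modulo 3. The slides do not fix these details.

```python
import random

DELTA = 1.0  # assumed step size; any positive value works

def quantize(y):
    """Index (-4..4) of the nearest reconstruction level i * DELTA."""
    return max(-4, min(4, round(y / DELTA)))

def encode(y):
    """Coset label in {0, 1, 2}: same-label levels are 3 * DELTA apart."""
    return (quantize(y) + 4) % 3

def decode(label, k):
    """Reconstruction level with the given label that is closest to k."""
    candidates = [i for i in range(-4, 5) if (i + 4) % 3 == label]
    best = min(candidates, key=lambda i: abs(i * DELTA - k))
    return best * DELTA

def max_error(trials=10_000, seed=0):
    """Largest |Y - decoded value| over random (Y, K) pairs with |Y - K| < DELTA."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        y = rng.uniform(-4.5 * DELTA, 4.5 * DELTA)
        k = y + rng.uniform(-0.999, 0.999) * DELTA
        worst = max(worst, abs(decode(encode(y), k) - y))
    return worst

print(max_error() <= DELTA / 2)
```

Because |Y′ − K| < 3δ/2 and same-label levels are 3δ apart, Y′ is always the unique same-label level nearest to K, so the decoder's error never exceeds the quantization error δ/2.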


Information Theoretic Security
• General secret-key cryptosystem
• WLOG, discrete i.i.d. source
• Block length n
• Key independent of the source, uniformly distributed
• Noiseless, insecure public channel
• Rate R: bits per source symbol


Performance Measures
• Measure of secrecy against an eavesdropper
  - Shannon-sense perfect secrecy: I(X; B) = 0
  - Wyner-sense perfect secrecy: lim_{n→∞} I(X; B)/n = 0
  - Maurer-sense perfect secrecy: lim_{n→∞} I(X; B) = 0
• Measure of fidelity of the decoder, i.e. expected distortion
• Number of bits per source symbol, R
• Number of bits per source symbol of the secret key, determined by the cardinality of the key space
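As a small illustration of the Shannon-sense criterion, the sketch below computes I(X; B) exactly for a single symbol. It assumes that B denotes what the eavesdropper observes and that encryption is addition modulo q with a key independent of X; both the setup and the example distribution are illustrative choices rather than details from the slides.

```python
from math import log2

def mutual_information(px, pk):
    """I(X; B) in bits for B = (X + K) mod q, with X ~ px and K ~ pk independent."""
    q = len(px)
    joint = [[0.0] * q for _ in range(q)]  # joint[x][b] = P(X = x, B = b)
    for x in range(q):
        for k in range(q):
            joint[x][(x + k) % q] += px[x] * pk[k]
    pb = [sum(joint[x][b] for x in range(q)) for b in range(q)]
    total = 0.0
    for x in range(q):
        for b in range(q):
            p = joint[x][b]
            if p > 0:
                total += p * log2(p / (px[x] * pb[b]))
    return total

px = [0.7, 0.1, 0.1, 0.1]        # hypothetical, highly redundant source
uniform_key = [0.25] * 4
fixed_key = [1.0, 0.0, 0.0, 0.0]  # degenerate key: no secrecy at all

print(mutual_information(px, uniform_key))  # essentially zero
print(mutual_information(px, fixed_key))    # equals H(X)
```

With a uniform key the ciphertext is independent of the source, so the mutual information vanishes up to rounding; with a fixed key the ciphertext reveals X completely.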


Tradeoff between performance parameters

• R_X(D) is the rate-distortion function of the source X
• Shannon cryptosystem: achieves these bounds


Compression of Encrypted Data
• Define XOR (⊕) on a general alphabet as an operation satisfying
  - x ⊕ y = y ⊕ x (commutativity)
  - x ⊕ z = y ⊕ z ⇒ x = y (cancellation)
• Reversed Cryptosystem
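One operation with both required properties is addition modulo q on the alphabet {0, …, q − 1}. The slides do not say which operation is intended, so the following is only an assumed instance, with q = 5 chosen arbitrarily.

```python
Q = 5  # assumed alphabet size for illustration

def gxor(x, y):
    """Generalized XOR: addition modulo Q."""
    return (x + y) % Q

def gxor_inverse(c, k):
    """Undo gxor with the same key: subtraction modulo Q."""
    return (c - k) % Q

ALPHABET = range(Q)

commutative = all(gxor(x, y) == gxor(y, x)
                  for x in ALPHABET for y in ALPHABET)

cancellative = all(x == y
                   for x in ALPHABET for y in ALPHABET for z in ALPHABET
                   if gxor(x, z) == gxor(y, z))

decryptable = all(gxor_inverse(gxor(x, k), k) == x
                  for x in ALPHABET for k in ALPHABET)

print(commutative, cancellative, decryptable)
```

The cancellation property is what makes decryption well defined: for a fixed key, distinct plaintext symbols always map to distinct ciphertext symbols. For Q = 2 the operation reduces to the ordinary binary XOR.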


Performance Limit

• It can also be shown that this is the best possible performance for a system having this kind of structure


Performance limit
• For finite alphabets, the stronger notion of Shannon-sense perfect secrecy can be guaranteed by sacrificing key efficiency (R′): let K be distributed uniformly over the alphabet of X.
• How much compression can be achieved if the encryption scheme is prespecified?
• When the source must be reproduced losslessly at the decoder, the Slepian-Wolf theorem implies that the encrypted data can be compressed down to the entropy rate of the unencrypted source without compromising security.


Special Cases


Simulations – lossless compression of binary data
• Binary source with empirical entropy 0.37 bits per pixel
• Encrypted using a pseudorandom Bernoulli(1/2) string
• Encrypted data has an empirical entropy of 1 bit per pixel
• Incompressible without side information
• Compressed by computing the syndrome with respect to a rate-½ LDPC code
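The effect of encryption on the empirical entropy can be reproduced with a toy source. The sketch below is not the simulation from the slides: it assumes an i.i.d. binary source with P(1) = 0.07 (chosen because its entropy is close to 0.37 bits) in place of the actual image, and uses Python's `random` module as the pseudorandom pad.

```python
import random
from math import log2

def empirical_entropy(bits):
    """First-order empirical entropy in bits per symbol of a 0/1 sequence."""
    p1 = sum(bits) / len(bits)
    if p1 in (0.0, 1.0):
        return 0.0
    return -(p1 * log2(p1) + (1 - p1) * log2(1 - p1))

def simulate(n=100_000, p=0.07, seed=0):
    """Return (source entropy, encrypted entropy) for an assumed i.i.d. source."""
    rng = random.Random(seed)
    source = [1 if rng.random() < p else 0 for _ in range(n)]
    pad = [rng.getrandbits(1) for _ in range(n)]       # Bernoulli(1/2) key stream
    encrypted = [s ^ k for s, k in zip(source, pad)]   # one-time-pad style XOR
    return empirical_entropy(source), empirical_entropy(encrypted)

h_source, h_encrypted = simulate()
print(f"source: {h_source:.2f} bits/symbol, encrypted: {h_encrypted:.2f} bits/symbol")
```

On its own the encrypted sequence looks like fair coin flips, which is why a compressor without side information gains nothing; the gain comes only from the decoder's knowledge of the pad.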


Simulations

• Modify the iterative decoding algorithm:
  - at the check nodes, to take the syndrome into account
  - at initialization, to use knowledge of the key and its correlation with the encrypted string
• Decryption is trivial after decoding


Simulation – Lossy compression of real-valued data
• i.i.d. Gaussian sequence with variance 1
• Encrypted using a stream cipher
• Key: i.i.d. Gaussian, independent of the data
• Each sample is quantized, and the quantization levels are labeled with 4 labels (2 bits). This gives a binary sequence twice the original length.
• The binary sequence is compressed by computing its syndrome with respect to a rate-½ trellis code: effectively 1 bit/sample.
• The decoder finds the real-valued sequence closest to the key that gives the same syndrome.
• It then combines this sequence with the key sequence to obtain an optimal estimate of the encrypted data.


Conclusions
• Encrypted data can be compressed without knowledge of the key.
• The approach is inspired by distributed source coding principles.
• In some cases, the encrypted data can be compressed to the same extent as the original unencrypted data.


References
• M. Johnson, P. Ishwar, V. Prabhakaran, D. Schonberg, and K. Ramchandran, “On compressing encrypted data,” IEEE Trans. Signal Processing, vol. 52, no. 10, pp. 2992–3006, Oct. 2004.
• S. S. Pradhan and K. Ramchandran, “Distributed source coding using syndromes (DISCUS): Design and construction,” IEEE Trans. Inform. Theory, vol. 49, pp. 626–643, Mar. 2003.
• D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 471–480, July 1973.
• A. Wyner and J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Trans. Inform. Theory, vol. IT-22, pp. 1–10, Jan. 1976.


Questions?


Linear Error Correcting Codes
• A code C is a linear subspace of a vector space over a finite field.
• G, the generator matrix of the code, has the form [I_k | A] and size k × n.
• It maps any length-k vector x to a length-n codeword in C, namely (xᵀG)ᵀ = Gᵀx.
• H, the parity-check matrix, has the form [−Aᵀ | I_{n−k}].
• Hx = 0 for every x in C.
• If z = x + e, then Hz = He, which is called the syndrome of z.
• d denotes the minimum Hamming distance between codewords.
• Syndrome decoding of linear codes is efficient.
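To make these definitions concrete, the sketch below builds G and H from a matrix A and uses the syndrome to compress a 7-bit word to 3 bits when the decoder holds side information within Hamming distance 1. It uses the (7,4) Hamming code as a small stand-in; the simulations in these slides use an LDPC code instead, and the specific A is an assumed choice.

```python
from itertools import product

K_DIM, N = 4, 7
A = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]  # assumed 4 x 3 matrix

# G = [I_k | A]; H = [A^T | I_(n-k)] (over GF(2), -A equals A).
G = [tuple(int(i == j) for j in range(K_DIM)) + A[i] for i in range(K_DIM)]
H = [tuple(A[i][j] for i in range(K_DIM)) +
     tuple(int(j == m) for m in range(N - K_DIM))
     for j in range(N - K_DIM)]
COLUMNS = [tuple(row[i] for row in H) for i in range(N)]

def codeword(msg):
    """Map a length-k message to a length-n codeword: msg^T G over GF(2)."""
    return tuple(sum(msg[i] * G[i][j] for i in range(K_DIM)) % 2
                 for j in range(N))

def syndrome(z):
    """H z over GF(2); 3 bits."""
    return tuple(sum(h * b for h, b in zip(row, z)) % 2 for row in H)

def decompress(s, k):
    """Recover y from its syndrome s and side information k (distance <= 1)."""
    diff = tuple(a ^ b for a, b in zip(s, syndrome(k)))  # = H (y + k) = H e
    if not any(diff):
        return tuple(k)
    pos = COLUMNS.index(diff)  # a single-bit error matches exactly one column
    return tuple(b ^ int(i == pos) for i, b in enumerate(k))

WORDS = list(product((0, 1), repeat=N))
codewords_ok = all(syndrome(codeword(m)) == (0, 0, 0)
                   for m in product((0, 1), repeat=K_DIM))
neighbours = lambda y: [y] + [tuple(b ^ int(i == p) for i, b in enumerate(y))
                              for p in range(N)]
round_trip_ok = all(decompress(syndrome(y), k) == y
                    for y in WORDS for k in neighbours(y))
print(codewords_ok, round_trip_ok)
```

The decoder relies on Hz = He: the difference between the received syndrome and the syndrome of K depends only on the error pattern, which a single column of H identifies.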

