Steganography: Data Hiding in Images


Brijesh Pillai, Prashanth Govindaraju
Department of Electrical & Computer Engineering, Clemson University

Abstract


Steganography is the practice of writing hidden messages in such a way that no one apart from the intended recipient even knows that a message has been sent. It is one of the oldest applications of image processing in the area of secure communication, dating back to ancient Greek warfare. We look into various methods adopted to hide data within images. The paper mainly discusses substitution-based and transform-domain algorithms for data hiding and their implementation. We also extend the discussion to a channel-based approach that yields higher information hiding capacity. Finally, we compare the robustness and scope of these techniques.

Keywords: Steganography, Image hiding, LSB substitution

3. Steganographic Techniques


Pure steganographic systems require no prior exchange of information. Secret-key techniques assume a stego-key shared between sender and receiver, whereas public-key steganography is analogous to public-key cryptography. Substitution, transform-domain, statistical, spread-spectrum and distortion-based techniques are some of the approaches available for developing steganographic systems. The most common approaches are substitution-based, where redundant bits within the cover image are used to embed the secret information. However, these methods fail when the image is scaled or compressed. This leads to transform-domain and channel-based techniques that are robust to compression and various other image transformations.


Steganography is the art and science of communicating in a way that hides the existence of the communication. Its history dates back to 440 BC, when the Greeks used it for espionage in Persia. Trithemius’ Steganographia (Greek for ‘covered writing’), published in 1606, is one of the earliest documented works on this subject. Steganography is often confused with digital watermarking and cryptography. Cryptography concentrates on the security or encryption of data, whereas steganography aims to hide the fact that a message exists at all. Watermarking is about protecting the content in images; steganography is all about concealing the very existence of hidden information. The paper looks into more recent work in this field: we discuss a couple of substitution-based algorithms and a frequency-domain approach. Finally, we present our technique for extending an existing channel-based approach for increased robustness against detection.

centered in espionage and warfare strategies [7]. More recently, however, projects such as DICOM have extended its applications to medical imaging, where it is used to protect medical records.


1. Introduction

2. Steganography

Steganography studies ways to make communication invisible by hiding secrets in innocuous messages, mainly images. Images are a classic instance of a covert channel, because hidden data in them is very difficult to detect. Data is hidden within an image that is often referred to as a cover image, as it acts as a cover for the real information. There is no significant change in the appearance of the cover image even after the information is embedded. Applications of steganography are mainly

A. Substitution-based Techniques

Substitution-based steganographic techniques exploit redundancy in an image: redundant parts of a cover image are replaced with a secret message. Least Significant Bit (LSB) substitution replaces the least significant bit of each selected cover pixel with a message bit. The embedding process consists of choosing a subset of pixels Cj in the cover image and replacing the LSB of that


During the encoding process, the sender splits the cover image into 8×8 pixel blocks; each block encodes exactly one message bit. Before the communication starts, both sender and receiver have to agree on the location of two DCT coefficients that will be used in the embedding process. Let us denote these two indices by (u1, v1) and (u2, v2). The two coefficients should correspond to cosine functions with middle frequencies; this ensures that information is stored in significant parts of the image.


In the simplest method for selecting the set of pixels in which to hide the message, the sender uses the cover pixels in order, starting at the first pixel in the image and continuing until the entire message has been hidden. A predefined special character indicates the end of the message. The receiver can then access the pixels sequentially and extract the LSB of each one, repeating this process until the special character is encountered. This type of technique has a serious weakness: the first part of the image, where the message is hidden, has different statistical characteristics from the second part, where no changes have been made. This makes the existence of hidden data easy to detect. One possible solution is to use a pseudo-random sequence generator, seeded with a secret key shared between the sender and the receiver, to generate a pseudo-random sequence of indices of the pixels to be modified to store the secret message.
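The sequential scheme can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the cover is assumed to be a flat list of 8-bit pixel values, and `END_MARKER` is a hypothetical choice of end-of-message character, which the text does not name.

```python
END_MARKER = 0x00  # hypothetical end-of-message byte; the message must not contain it


def embed_sequential(pixels, message):
    """Hide `message` (bytes) in the LSBs of `pixels`, starting at the first pixel."""
    payload = message + bytes([END_MARKER])
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover too small for message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the message bit
    return stego


def extract_sequential(stego):
    """Read LSBs eight at a time until the end marker is found."""
    out = bytearray()
    for start in range(0, len(stego) - 7, 8):
        byte = 0
        for value in stego[start:start + 8]:
            byte = (byte << 1) | (value & 1)
        if byte == END_MARKER:
            return bytes(out)
        out.append(byte)
    raise ValueError("end marker not found")
```

Only the first `8 * (len(message) + 1)` pixels are touched, which is exactly the statistical asymmetry the paragraph above warns about.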

destroy the hidden data. Embedding information in the frequency domain can be much more robust than embedding it in the time or spatial domain. It is generally preferable to hide information in noisy regions and edges of images rather than in smoother regions. The benefit is two-fold: degradation in smoother regions of an image is more noticeable to the human eye, and smooth regions are a prime target for lossy compression schemes. The Discrete Cosine Transform allows an image to be broken up into different frequency bands, making it much easier to embed watermarking information into the middle frequency bands of an image. The middle frequency bands are chosen because they avoid the most visually important parts of the image (low frequencies) without over-exposing the embedded data to removal through compression or noise attacks (high frequencies) [9].


pixel value with the message bit mi. To decode the secret message, the receiver must have access to the sequence of element indices used in the embedding process. Note that more than one LSB can be used for substitution: for example, the two least significant bits of each pixel can be replaced by two message bits, and so on.
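As a small illustration of substituting more than one bit, the sketch below replaces the k least significant bits of a single 8-bit pixel value. The function names are hypothetical and chosen only for this example.

```python
def embed_k_lsbs(pixel, bits, k):
    """Replace the k least significant bits of an 8-bit pixel with `bits` (0 <= bits < 2**k)."""
    if not 1 <= k <= 8 or not 0 <= bits < (1 << k):
        raise ValueError("bits must fit in k bits, with 1 <= k <= 8")
    mask = (1 << k) - 1
    return (pixel & ~mask & 0xFF) | bits


def extract_k_lsbs(pixel, k):
    """Return the k least significant bits of a pixel value."""
    return pixel & ((1 << k) - 1)
```

With k bits per pixel the capacity grows k-fold, but a pixel value can change by up to 2**k - 1, so larger k is more visible.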


Image downgrading is a special case of an LSB substitution system in which images act both as secret messages and as covers. Kurak and McHugh [6] pointed out in 1992 the security threat posed by image downgrading: it can be used to exchange images covertly. Consider a case where the cover image and the secret image are of the same size. The sender replaces the four least significant bit planes of the cover's pixel values with the four most significant bit planes of the pixel values of the secret image. The receiver then extracts the four least significant bits from the stego-image, thereby recovering most of the information content of the secret image. A careful choice of cover image to suit the secret image makes the substitution visually imperceptible.

B. Transform Domain Techniques

LSB modification techniques are easy ways to embed information, but they are highly vulnerable to even small modifications of the cover image. Any image processing transformation can easily

******** Stego-Image Algorithm **********
For each message bit do
    Choose one cover block
    Compute the DCT of the block, B
    If message bit is zero
        If B(u1, v1) > B(u2, v2) then swap B(u1, v1) and B(u2, v2)
    Else
        If B(u1, v1) < B(u2, v2) then swap B(u1, v1) and B(u2, v2)
    Adjust both values so that |B(u1, v1) - B(u2, v2)| > X
Create the stego-image out of the inverse DCT of all blocks

******* Image Extraction Algorithm *******
For each block
    Compute its DCT
    If B(u1, v1) <= B(u2, v2) then the extracted bit is 0, else the extracted bit is 1

This method is robust against compression, especially JPEG, as it is based on the same

Data hiding in grey scale images is based on the fact that any legal character can be considered as a number $N$ from 0 to 255. This means we can expand it in base 4 as:

$N = R_1 \cdot 4^3 + G_1 \cdot 4^2 + R_2 \cdot 4 + G_2.$


Data hiding in color images is based on the fact that we can expand $N$ in base 3 as:

$N = R_1 \cdot 3^5 + G_1 \cdot 3^4 + B_1 \cdot 3^3 + R_2 \cdot 3^2 + G_2 \cdot 3^1 + B_2.$

The values of $R_1$, $G_1$, $B_1$, $R_2$, $G_2$, $B_2$ range from 0 to 2. Any pixel in the cover image is a (red, green, blue) triple. Suppose that pixel is $(rr, gg, bb)$. Then we can replace the pixel's original remainder modulo 3 with the information data, overwriting the pixel with $(rr - (rr \bmod 3) + R_1,\ gg - (gg \bmod 3) + G_1,\ bb - (bb \bmod 3) + B_1)$. That is, we overwrite the pixel with $(3\lfloor rr/3 \rfloor + R_1,\ 3\lfloor gg/3 \rfloor + G_1,\ 3\lfloor bb/3 \rfloor + B_1)$, where $3\lfloor m/3 \rfloor$ denotes the largest multiple of 3 that does not exceed $m$. Next, we do the same with the next pixel $(rrr, ggg, bbb)$, overwriting it with $(3\lfloor rrr/3 \rfloor + R_2,\ 3\lfloor ggg/3 \rfloor + G_2,\ 3\lfloor bbb/3 \rfloor + B_2)$. To extract the information, all we need to know is that for each of the color channels,


$m = a_1 \cdot 10^{n-1} + a_2 \cdot 10^{n-2} + \dots + a_n \cdot 10^{0}.$

The second segment is thus stored over positions 2 to $n+1$ of the index sequence. The sender then stores the $k$-th character of the message in the element with index $j_{k+n+1}$. Since the receiver has the same key and knowledge of the pseudo-random generator, he can use the key as a seed and replicate the set of indices. The distance between adjacent characters of the message is therefore random, and it becomes more difficult to detect the existence of hidden information through steganalysis. Note that one index can occur more than once in the sequence. Such a case is called a collision. If a collision occurs, the sender will try to insert more than one message character into one cover pixel, thereby corrupting them. To overcome the problem of collisions, the sender can keep track of the indices of all cover pixels already used in a list L. If, during the hiding process, a randomly obtained cover pixel index has not been used before, he adds the index to L and uses it. If, however, the index is already contained in L, he discards it and chooses another cover pixel pseudo-randomly. At the other end, the receiver applies the same technique to retrieve the information hidden in the image.
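The collision-avoidance idea can be sketched as below. This is a simplified illustration under stated assumptions: Python's `random.Random` stands in for the shared pseudo-random generator (it is not cryptographically secure), message bits rather than characters are stored in pixel LSBs, and the receiver is told the message length directly instead of reading it from the size header described above.

```python
import random


def index_sequence(key, count, cover_size):
    """Draw `count` distinct pixel indices from a generator seeded with the shared key."""
    if count > cover_size:
        raise ValueError("cover too small")
    rng = random.Random(key)   # stand-in PRNG, an assumption of this sketch
    used = set()               # plays the role of the list L in the text
    order = []
    while len(order) < count:
        j = rng.randrange(cover_size)
        if j in used:          # collision: discard the index and draw again
            continue
        used.add(j)
        order.append(j)
    return order


def embed_scattered(pixels, bits, key):
    """Write each message bit into the LSB of a pseudo-randomly chosen pixel."""
    stego = list(pixels)
    for j, bit in zip(index_sequence(key, len(bits), len(pixels)), bits):
        stego[j] = (stego[j] & ~1) | bit
    return stego


def extract_scattered(stego, n_bits, key):
    """Regenerate the same index sequence from the key and read the LSBs back."""
    return [stego[j] & 1 for j in index_sequence(key, n_bits, len(stego))]
```

Because both sides run the same deterministic draw-and-discard loop, they arrive at identical index sequences without exchanging anything beyond the key.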


The values of $R_1$, $R_2$, $G_1$, $G_2$ range from 0 to 3. We can overwrite the pixel $(A, A, A)$ in the grey scale image with $(A - R_1, A - G_1, A)$, and we can overwrite the next pixel $(B, B, B)$ with $(B - R_2, B - G_2, B)$. Since each red or green channel is changed by no more than 3, the change in color is nearly imperceptible. Since the value of the blue channel of each pixel is unchanged, it is trivial to invert this operation and obtain the original message. To make the information harder to detect, we can add some random noise to the grey scale image prior to embedding the data.
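A minimal sketch of the grey scale scheme for a single character follows. The function names are hypothetical, and the check that the grey level is at least 3 is an assumption added here so that the subtraction cannot go below zero; the text does not say how dark pixels are handled.

```python
def hide_char_grey(pixel_a, pixel_b, ch):
    """Hide one character (code 0..255) in two grey pixels (A, A, A) and (B, B, B)."""
    n = ord(ch)
    if n > 255:
        raise ValueError("character code must be in 0..255")
    # Base-4 digits of n, most significant first: R1, G1, R2, G2.
    r1, g1, r2, g2 = (n >> 6) & 3, (n >> 4) & 3, (n >> 2) & 3, n & 3
    a, b = pixel_a[2], pixel_b[2]
    if a < 3 or b < 3:
        raise ValueError("grey level too low")  # assumption: skip such pixels
    return (a - r1, a - g1, a), (b - r2, b - g2, b)


def recover_char_grey(pixel_a, pixel_b):
    """Invert the embedding using the unchanged blue channel as the reference."""
    r1, g1 = pixel_a[2] - pixel_a[0], pixel_a[2] - pixel_a[1]
    r2, g2 = pixel_b[2] - pixel_b[0], pixel_b[2] - pixel_b[1]
    return chr(r1 * 64 + g1 * 16 + r2 * 4 + g2)
```

Extraction needs no copy of the original image, since each digit is simply the difference between the blue channel and the modified channel.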


This method hides a message in the color channels of a BMP image by slightly modifying the color or grey scale value of pixels in the image. We implemented two related techniques, one for hiding information in grey scale images and another for hiding information in color images.
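For the color variant, the base-3 embedding of one character into two pixels can be sketched as follows. The names are hypothetical, and the handling of channel values near 255 (stepping down one multiple of 3 instead of overflowing) is an assumption of this sketch that the text does not address.

```python
def to_base3(n):
    """Six base-3 digits of n (0 <= n < 729), most significant first."""
    digits = []
    for _ in range(6):
        n, d = divmod(n, 3)
        digits.append(d)
    return digits[::-1]


def put_digit(channel, digit):
    """Replace the channel's remainder modulo 3 with `digit`."""
    value = channel - channel % 3 + digit
    # Assumption: drop to the previous multiple of 3 rather than exceed 255.
    return value - 3 if value > 255 else value


def hide_char_colour(p1, p2, ch):
    """Store digits R1, G1, B1 in pixel p1 and R2, G2, B2 in pixel p2."""
    d = to_base3(ord(ch))
    first = tuple(put_digit(c, x) for c, x in zip(p1, d[:3]))
    second = tuple(put_digit(c, x) for c, x in zip(p2, d[3:]))
    return first, second


def recover_char_colour(p1, p2):
    """Rebuild the character code from the six channel remainders modulo 3."""
    n = 0
    for c in (*p1, *p2):
        n = n * 3 + c % 3
    return chr(n)
```

Each channel moves by at most 2, and extraction only needs the remainder of every channel modulo 3.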


C. Channel based Steganography

the channel value modulo 3 equals the embedded digit. For instance, suppose a pixel has value $(r, g, b)$ in an image containing embedded information. Then $r \bmod 3 = R_1$ or $r \bmod 3 = R_2$. This process can be applied to the red, green, and blue channels of all the pixels to extract the information. Because any red, green, or blue channel value is changed by no more than 2, the information is extremely hard to see. Furthermore, this technique does not require conversion of the image into grey scale, so the images can be even more innocuous.

We have extended the basic algorithms mentioned above by incorporating a pseudo-random sequence generator to spread the secret message over the cover in a seemingly random manner. The idea is that both communication partners share a secret key $k$ usable as a seed for a random number generator. The sender creates a sequence of element indices $j_1, \dots, j_{m+n}$. He then stores the size of the second segment, $n$, at index $j_1$. The second segment consists of the size of the hidden message, $m$, stored in base 10 as $a_1 a_2 a_3 \dots a_n$, i.e.


principle. The quality factor is not affected; however, the secret-message capacity is much lower than that of substitution-based methods.
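The DCT-based embedding and extraction for a single 8×8 block can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the coefficient positions (4, 1) and (3, 2) and the threshold `gap` (the X of the algorithm) are assumed values, and the transform is written in plain Python for self-containment rather than speed.

```python
import math

N = 8  # block size used in the text


def _scale(k):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Orthonormal 2-D DCT-II of an 8x8 block."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def embed_bit(block, bit, p1=(4, 1), p2=(3, 2), gap=10.0):
    """Encode one bit in the ordering of two mid-frequency coefficients."""
    c = dct2(block)
    a, b = c[p1[0]][p1[1]], c[p2[0]][p2[1]]
    if (bit == 0 and a > b) or (bit == 1 and a < b):
        a, b = b, a                       # swap so the ordering matches the bit
    if abs(a - b) <= gap:                 # widen the difference beyond the threshold
        mid, half = (a + b) / 2.0, gap / 2.0 + 1.0
        a, b = (mid - half, mid + half) if bit == 0 else (mid + half, mid - half)
    c[p1[0]][p1[1]], c[p2[0]][p2[1]] = a, b
    # Round back to 8-bit pixel values, as an image file would require.
    return [[min(255, max(0, round(v))) for v in row] for row in idct2(c)]


def extract_bit(block, p1=(4, 1), p2=(3, 2)):
    """Return 0 if the first coefficient is <= the second, else 1."""
    c = dct2(block)
    return 0 if c[p1[0]][p1[1]] <= c[p2[0]][p2[1]] else 1
```

The enforced gap is what lets the bit survive the rounding of pixel values, and a larger gap trades more visible distortion for more tolerance to recompression.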

4. Results

This section presents the results of the steganographic algorithms discussed above. The results are analyzed on the basis of information hiding capacity and robustness against modification. We do not analyze the algorithms on the basis of the visual quality of the stego-image, because we assume that all the images obtained are of sufficient quality to avoid visual detection.

Figure 4 shows the results obtained through the process of image downgrading. The objective here was to hide the map shown in Figure 3 in the tree image shown in Figure 1. The quality of the retrieved secret image can be improved by using more than just the least significant bit plane of the cover image to store the secret image. Figure 4 shows the results obtained when we replace one, two, ..., seven least significant bit planes of the cover image with the corresponding number of most significant bit planes of the secret image.
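The bit-plane substitution behind Figure 4 can be sketched as follows, assuming both images are given as equally sized flat lists of 8-bit grey values. The function names are hypothetical.

```python
def downgrade(cover, secret, n=4):
    """Replace the n LSB planes of each cover pixel with the n MSB planes of the secret pixel."""
    if not 1 <= n <= 7:
        raise ValueError("n must be between 1 and 7")
    keep = 0xFF & ~((1 << n) - 1)  # mask that preserves the top 8 - n cover bits
    return [(c & keep) | (s >> (8 - n)) for c, s in zip(cover, secret)]


def recover(stego, n=4):
    """Approximate the secret image from the n LSB planes of the stego-image."""
    mask = (1 << n) - 1
    return [(p & mask) << (8 - n) for p in stego]
```

A larger n improves the recovered secret image, whose error per pixel is below 2**(8 - n), at the cost of a more visibly altered cover.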


Figure 1 shows the cover images that are used for hiding information through various algorithms.

Figure 1: Cover Images


Figure 2: LSB Substitution


LSB substitution algorithms can be used to embed a large amount of information in an image. If care is taken in selecting the pixels that store message bits, the hidden information may survive transformations such as cropping. Any addition of noise or lossy compression, however, is likely to lead to loss of information. An even simpler attack is to set the LSB of each pixel to one, fully defeating the watermark with negligible impact on the cover image. Furthermore, once the algorithm is discovered, the embedded watermark can easily be modified by an intermediate party.


Figure 2 shows the results obtained through the LSB substitution algorithm.


Figure 3: Secret Image

Figure 4: Image Downgrading with Substitution of N Bit Planes (panels N = 1 to N = 7)

Figure 5 shows the results obtained through the DCT-based transform-domain algorithm. An 8 kb text file was successfully hidden in the image shown and recovered without any errors.

as BMP images are not as common as compressed image formats and consequently may be more closely scrutinized.

5. Conclusion


The focus of our paper has been to look at various techniques available for Steganography in images. We have analyzed different algorithms on the basis of the amount of information that can be hidden, while retaining sufficient visual quality of the cover image. We also looked at the susceptibility of each steganographic system to detection via steganalysis. We found that the channel based steganographic algorithms were the best overall among the algorithms we reviewed.


Steganographic systems embedding information in the frequency domain of the image are usually much more robust than systems embedding information in the spatial domain. However, a tradeoff exists between the amount of information hidden and the robustness obtained. Significantly less information can be hidden using the DCT-based algorithm than with the LSB substitution algorithms. For example, in a 480 kb image we can store up to 60 kb of information using the LSB algorithm. In contrast, the DCT-based algorithm can store approximately 8 kb of information.
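The LSB figure follows from simple arithmetic, sketched below. The sketch assumes that "kb" means kilobytes, that one message bit is stored per cover byte, and that file headers are ignored. The second function applies the same reasoning to the channel-based scheme, assuming 3 bytes per pixel and two pixels per hidden character, which reproduces the 80 kb reported for that technique. The helper names are hypothetical.

```python
def lsb_capacity_bytes(cover_bytes, bits_per_byte=1):
    """Message bytes that fit when each cover byte carries `bits_per_byte` message bits."""
    return cover_bytes * bits_per_byte // 8


def channel_capacity_chars(cover_bytes, bytes_per_pixel=3, pixels_per_char=2):
    """Characters that fit when each character occupies `pixels_per_char` pixels."""
    return cover_bytes // bytes_per_pixel // pixels_per_char


cover = 480 * 1024
print(lsb_capacity_bytes(cover) // 1024)      # kilobytes available to LSB substitution
print(channel_capacity_chars(cover) // 1024)  # kilo-characters available to the channel scheme
```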


In the future we intend to look at other steganographic techniques, such as algorithms based on stochastic modulation and algorithms based on the spread spectrum model.

References


[1] Stefan Katzenbeisser and Fabien Petitcolas, “Information Hiding: Techniques for Steganography and Digital Watermarking”, Artech House, 2002

[2] Lisa M. Marvel, Charles G. Boncelet and Charles T. Retter, “Spread Spectrum Image Steganography”, IEEE Transactions on Image Processing, Vol. 8, No. 8, August 1999


Figure 5: Steg image Obtained after DCT based Steganography


Figure 6 shows the results obtained through channel based algorithms for grey scale images and color images respectively.

Figure 6: Steg images Obtained after channel based Steganography for grayscale and color images

The channel-based techniques offer a high information hiding capacity. For example, using the channel-based algorithm for grey scale and color images we can store up to 80 kb of information. Steganographic systems based on this algorithm are also highly resistant to attacks involving standard image processing techniques, because the algorithm stores the message by slightly modifying the color of each pixel. However, this algorithm can be used only for BMP images. This can be a significant drawback

[3] Bender W., D. Gruhl, and N. Morimoto, “Techniques for Data Hiding”, IBM Systems Journal, Vol. 35, no. 3/4

[4] R. Chandramouli, Nasir Memon, “Analysis of LSB based Image Steganography techniques”, Proceedings IEEE ICIP, 2001

[5] Faisal Alturki, Russell Mersereau, “Secure Blind Image Steganographic technique using Discrete Fourier Transformation”, in IEEE CSAC 1990

[6] Kurak C., and J. McHugh, “A cautionary note on image downgrading”, in IEEE CSAC 1992

[7] Newman B., Secrets of German Espionage, 1940

[8] Johnson, N.F., and S. Jajodia, “Exploring Steganography: Seeing the Unseen”, IEEE Computer, Vol. 31, no. 2, 1998

[9] G. Langelaar, I. Setyawan, R.L. Lagendijk, “Watermarking Digital Image and Video Data”, in IEEE Signal Processing Magazine, September 2000

[10] http://www7.informatik.tu-muenchen.de/~katzenbe/

[11] http://wind.in.tum.de/~katzenbe/

[12] http://www.forensics.nl/steganography
