
Optimal Cover Estimation Methods and Steganographic Payload Location

Tu-Thach Quach

Abstract—Cover estimation is an important part of steganalysis and has many applications. One such application is steganographic payload location using residuals, which is effective when a large number of stego images are available. In the ideal case, when the cover images are available, we show that the expected number of stego images needed to perfectly locate all load-carrying pixels is approximately the logarithm of the payload size. In more practical settings, when the cover images are not available, the accuracy of payload location depends primarily on the chosen cover estimation method. We present optimal, linear run-time algorithms for finding the most likely cover estimate given the stego image and experimentally demonstrate that they can be used to locate payload on both LSB replacement and LSB matching stego images. The algorithms can be extended to higher-order statistical models of cover images.

Index Terms—Steganalysis, Payload Location, Cover Estimation, Viterbi Decoding

EDICS: WAT-STEG

(The author is with the Electrical and Computer Engineering Department, University of New Mexico, Albuquerque, NM 87131 USA; e-mail: [email protected]; phone: 505-284-6280; fax: 505-284-6231.)

I. INTRODUCTION

Digital image steganography embeds messages into cover images to produce stego images, which appear innocuous to an unintended observer. A widely used method in digital image steganography is least-significant bit (LSB) replacement, where a portion of the LSBs of the cover image is replaced with the message bits. By manipulating only the LSBs, the stego image looks similar to the cover image, making it difficult to detect by steganalysis detectors. Due to embedding, however, the stego image is different from the cover image, and these differences may reveal the presence of a hidden message. Once these differences are found, they can be used as features to train classifiers to detect stego images. Several detectors can be extended to estimate the size of the payload. This can be achieved by training classifiers using stego images carrying different payload sizes. The output of such a classifier is no longer binary, but consists of an estimate of the payload size. More interesting are the Weighted Stego-image (WS) detectors [1], [2], [3]. These detectors involve estimating the cover image given the stego image. The estimate is then used in combination with the stego image to estimate the payload size. These payload-size estimators are highly accurate, with a mean absolute error on the order of 10^−4. The next logical step in steganalysis is to extract the hidden message from the stego image. There are two approaches to this problem. The first involves searching for the correct key used in the embedding process [4]. This method is applicable when the key space is small. The advantage is

that once the key is found, the hidden message can be extracted readily. The alternative is to locate the payload, based on residuals, using a number of stego images where each stego image has the payload at the same locations [3], [5]. This could happen if the steganographer uses a simple embedding algorithm, reuses the key, and the stego images are the same size. This method only locates the payload; it cannot extract the hidden message, since no logical ordering can be inferred from the identified payload pixels. However, it is a crucial step toward extracting the hidden message. In the ideal case, when the cover images are available, we show that the expected number of stego images needed to perfectly locate all load-carrying pixels is approximately the logarithm of the payload size. With an embedding rate of 0.5 bits per pixel (bpp) on 512 × 512 images, or 131072 bits, on average only 19 stego images are needed. In more practical settings, when the cover images are not available, the success of payload location depends primarily on the chosen cover estimation method. We would like the estimate to be as close to the cover image as possible, since the fidelity of the estimate directly influences the accuracy of payload location. Our approach is to find the most likely estimate, in a statistical sense, given the stego image. The problem of finding the most likely cover estimate has previously been studied [6]. In the original work, a second-order statistical model of cover images is used with a tree-pruning algorithm to find the most likely cover estimate. Due to pruning, however, the presented algorithm is suboptimal. Despite being suboptimal, experimental results showed that it is effective at locating payload and outperforms state-of-the-art cover estimators. Our current work expands on the original work in several areas. We first present an optimal, linear-time algorithm for finding the most likely estimate using a first-order statistical model of cover images.
Then we adapt the tree-pruning algorithm so that only suboptimal paths are pruned, resulting in an optimal, linear-time algorithm for the second-order cover model. Both algorithms are based on Viterbi decoding and can be extended to higher-order cover models. Experiments are performed on a larger, more standardized set of images from the BOSSbase 0.92 database [7]. In Section II, we briefly review WS residuals and provide bounds on the number of stego images needed to locate payload. Section III studies the cover estimation problem in depth and presents optimal algorithms for finding the most likely cover estimate. We experimentally demonstrate that our algorithms can locate payload on both LSB replacement and LSB matching stego images in Section IV. Concluding thoughts are provided in Section V.


II. LOCATING PAYLOAD

We briefly summarize the WS method for locating steganographic payload using residuals as presented in [3], which also improved the WS detectors proposed in [1], [2] by adaptively determining the linear filter that best matches the stego image. The notation and (1)–(4) are taken from [3]. We denote a stego image as a vector s = (s_1, . . . , s_n) and the corresponding cover image as c = (c_1, . . . , c_n). A stego image is generated by embedding a payload of q bits per pixel (bpp) into the cover image using LSB replacement steganography (we use q instead of p to avoid confusion with probability, which is used extensively here). The total number of bits embedded is therefore nq. The WS method first estimates the cover image using the stego image. We use the notation ĉ for the cover estimate. Once ĉ is obtained, the residuals

r_i = (s_i − s̃_i)(s_i − ĉ_i)    (1)

are computed, where s̃_i indicates s_i with the LSB flipped. The residuals quantify the difference between the stego image and the cover estimate. If ĉ_i is an unbiased estimator for c_i, the estimation error is independent of the parity of c_i, and the payload is independent of the cover, the expected values of the residuals r_i satisfy

E[r_i] = 0, if s_i = c_i,
         1, if s_i = c̃_i.    (2)

Given N stego images of the same size, the residual of pixel i in image j is

r_{ij} = (s_{ij} − s̃_{ij})(s_{ij} − ĉ_{ij}).    (3)

The mean residual, or the proportion of images in which pixel i is flipped, is

r_{i·} = (1/N) Σ_{j=1}^{N} r_{ij}.    (4)

Fig. 1. Correctly identified payload pixels as a function of N stego images for the ideal scenario where the cover images are available. The payload is 0.5 bpp for a total of 131072 bits. The experimental result matches our expectation.

If each stego image has the payload at the same locations, we have the following expected values for the mean residuals. Using (2), if pixel i is not a load-carrying pixel, then E[r_{i·}] = 0. Otherwise, if pixel i is a load-carrying pixel, then E[r_{i·}] = 0.5. In practice, these expectations are observed for large N. This implies that if a large number of stego images are available, we can locate the payload pixels by computing the mean residuals. What is the best we can do in terms of the number of required stego images to locate some fraction of the payload? Without the embedding key, the best scenario is ĉ_i = c_i for all i. In other words, the cover images are available. For simplicity, we assume that the total number of bits is nq = 2^k for some non-negative integer k. Since the payload is independent of the cover, with one stego image, we can locate about 2^{k−1} bits. With a second stego image, the expected number of unlocatable bits reduces to 2^{k−2}. In general, given N stego images, the expected number of unlocatable bits is 2^{k−N}. Therefore, on average, we need k + 2 stego images to locate all bits. This is a promising result, as we only need relatively few stego images (the logarithm of the payload size) to locate the payload.
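The ideal-case analysis above is easy to check numerically. The sketch below is our own toy simulation (the image size, seed, and variable names are ours, not the paper's experiment): it embeds random bits at a fixed set of locations into random covers, computes the residuals (1)/(3) against the true covers, and flags every pixel with a non-zero mean residual (4) as payload.

```python
import numpy as np

rng = np.random.default_rng(0)

def flip_lsb(x):
    return x ^ 1  # s̃: s with its LSB flipped

n = 4096                     # pixels per (hypothetical) image
q = 0.5                      # payload rate in bpp
payload = rng.choice(n, size=int(n * q), replace=False)  # fixed key/locations

def make_stego(cover):
    s = cover.copy()
    bits = rng.integers(0, 2, size=payload.size)
    # LSB replacement: overwrite the LSB at each payload location
    s[payload] = (s[payload] & ~1) | bits
    return s

N = 15
covers = rng.integers(0, 256, size=(N, n))
stegos = np.array([make_stego(c) for c in covers])

# Residuals with the true cover, r = (s - s̃)(s - c), as in (1)/(3)
r = (stegos - flip_lsb(stegos)) * (stegos - covers)
mean_r = r.mean(axis=0)                  # mean residual (4)
located = np.flatnonzero(mean_r != 0)    # non-zero mean residual => payload pixel
print(len(located), "of", payload.size, "payload pixels located")
```

Each payload pixel stays unlocated only if its embedded bit matched the cover LSB in all N images, which happens with probability 2^{−N}, matching the 2^{k−N} count above.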

To further support the above analysis, we perform the following experiment using a set of 1000 randomly chosen cover images from the BOSSbase 0.92 database, which consists of 9074 grayscale cover images of size 512 × 512 in the raw PGM format. A fixed payload of 0.5 bpp, or 131072 bits, is used to embed random bits using LSB replacement with the same key to generate 1000 stego images. For 10 iterations, as N increases from 1 to 20 (we expect to locate all payload pixels with log2(131072) + 2 = 19 stego images), we randomly choose N stego images and compute the mean residuals between the stego images and their corresponding cover images. The payload pixels are those with non-zero mean residuals. We average the number of located payload pixels as a function of N over the 10 iterations and plot the result in Fig. 1. For comparison, we also plot the expected curve. It is clear that the experimental result matches our finding.

III. COVER ESTIMATION

In practice, we are unlikely to have access to the cover images. Instead, we estimate the cover images using several heuristics. A simple method estimates a cover pixel by averaging its four neighboring pixels [1]. More sophisticated methods involving weighted averages can also be used [2]. It is important that we use a good estimator since it directly influences the accuracy of payload location. In a statistical sense, the best estimate is the most likely given the stego image:

ĉ = arg max_c p(c|s)          (5)
  = arg max_c p(s|c) p(c).    (6)

We have slightly abused the notation c, which now indicates a candidate cover image, not the original cover image. Finding the most likely cover estimate is then a maximum a posteriori (MAP) estimation problem. The model presented here is a Hidden Markov Model (HMM), where the observations and hidden states are s_i and c_i, respectively. For convenience, we refer to the probability p(s|c)p(c) = p(c, s) as the score of c, since s is unambiguous in the current context. For practical reasons, we make the following assumptions:
1) p(s|c) = ∏_i p(s_i | c_i),


2) p(c) = ∏_i p(c_i | c_{i−1}, c_{i−2}, . . . , c_{i−k}), for some finite positive integer k.
The first assumption states that a stego pixel depends only on the current cover pixel. This is a valid assumption for many existing steganographic algorithms. In particular, the popular LSB replacement and LSB matching algorithms satisfy this assumption since the embedding process considers only the current cover pixel. The second assumption is a finite k-order Markov model of c. How to best model cover images is an open question. However, using a finite Markov model has the advantage of computational tractability. In other words, it would be difficult to compute prior probabilities for infinite or non-stationary models. The following discussion on computing these probabilities will further reinforce the assumptions made here. The likelihood probabilities p(s_i | c_i) are approximated from the estimated payload q:

p(s_i | c_i) = 1 − q/2,  if s_i = c_i,
               q/2,      if s_i = c̃_i,
               0,        otherwise.    (7)
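As a concrete reading of (7), a minimal helper (the function name is ours) for the LSB replacement likelihood:

```python
def likelihood(s_i, c_i, q=0.5):
    """p(s_i | c_i) from (7) for LSB replacement with payload rate q bpp."""
    if s_i == c_i:
        return 1.0 - q / 2        # pixel unchanged by embedding
    if s_i == c_i ^ 1:            # s_i equals c_i with its LSB flipped
        return q / 2
    return 0.0                    # LSB replacement changes at most the LSB
```

For q = 0.5, likelihood(1, 1) is 0.75, likelihood(1, 0) is 0.25, and any other pixel pair has zero likelihood.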

The prior probabilities p(c_i | c_{i−1}, c_{i−2}, . . . , c_{i−k}) are obtained by learning from a set of known cover images. As an example, for the second-order cover model, the prior probabilities p(c_i | c_{i−1}, c_{i−2}) are obtained as follows. For each cover image, we scan in four different directions: →, ←, ↓, ↑. In each direction, we collect counts of consecutive pixel triples. The counts are collected row by row and column by column for the horizontal and vertical directions, respectively. The totals of the counts from all four directions of all images are subsequently used to compute the prior probabilities. Once the probabilities are obtained, we proceed to find ĉ using a search or decoding algorithm that is appropriate for the chosen cover model. In the next two subsections, we describe decoding algorithms for the first- and second-order cover models. Then we extend our methods to estimate cover images for LSB matching stego images and provide implementation details.

A. First-Order Decoding

The first-order cover model assumes that the probability of c_i depends only on c_{i−1}, i.e., the prior probabilities are p(c_i | c_{i−1}). Finding the most likely estimate according to this model is a well-known and well-studied problem; an optimal solution is obtained using the Viterbi algorithm. The Viterbi algorithm is a general technique for solving problems involving HMMs. The actual algorithm is problem dependent. Here, we describe the Viterbi-based algorithm for finding the most likely estimate as formulated in (6) using the first-order cover model. The Viterbi algorithm consists of two phases. The first phase has n steps, where n is the length of the stego sequence. In step 1, we compute the score v(c_1) for each candidate cover pixel c_1:

v(c_1) = p(s_1 | c_1) p(c_1).    (8)

Fig. 2. First-order example with s = (1, 3, 3, 2) and payload q = 0.5 bpp. Candidate cover pixels at step i are shown as nodes in column i. The number outside of each node shows the logarithm (base e) of its score. A solid line denotes a winning path and a dotted line denotes a non-winning path. The most likely cover estimate ĉ = (1, 3, 2, 2) is highlighted with gray.

Under LSB replacement steganography, each s_i has two candidate cover pixels: s_i and s̃_i. These candidates correspond to the hidden states of the HMM for observation s_i. Then in every subsequent step i, we compute v(c_i) for each candidate cover pixel c_i:

v(c_i) = max_{c_{i−1}} v(c_{i−1}) p(c_i | c_{i−1}) p(s_i | c_i).    (9)

In case of a draw, we randomly choose one among the c_{i−1}'s. By construction, at the end of step n, the candidate cover pixel c_n with the maximum score can be used to identify the most likely cover estimate. If we keep track of the c_{i−1} used in (9), we can readily trace back to find this sequence. The trace-back procedure is the second phase of the Viterbi algorithm. We demonstrate the key steps of the first-order decoder with a simple example. Let the probabilities be as shown in Table IV in the Appendix.¹ The stego sequence is s = (1, 3, 3, 2) and the payload is q = 0.5 bpp. The decoding process forms a trellis as shown in Fig. 2. There are four columns corresponding to the four steps in the algorithm. Column i shows the candidate nodes for s_i. The number outside of each node shows the logarithm (base e) of its score. At step 1, two candidate nodes 0 and 1 are created for s_1 = 1. The score of candidate node 0 is computed according to (8): log((1/4)(1/4)) = −2.77. Similarly, the score of candidate node 1 is log((3/4)(1/4)) = −1.67. In step 2, two candidate nodes 2 and 3 are created for s_2 = 3. Their scores are computed using (9). For candidate node 2, the maximum score is obtained using the path from candidate node 1 in the previous step. This winning path is denoted with a solid line and the non-winning path is denoted with a dotted line. The same is done for candidate node 3. This process continues for the remaining two steps. At the end of step 4, candidate node 2 has the best score. The path along the solid lines from candidate node 1 in step 1 to candidate node 2 in step 4 (highlighted with gray), therefore, forms the most likely cover estimate ĉ = (1, 3, 2, 2). The number of candidate nodes at each step is constant and independent of n, the length of the stego sequence. The entire algorithm uses n steps. Therefore, the run-time of the algorithm is O(n). We emphasize that the algorithm finds an optimal solution to (6) for the first-order cover model.

¹ For the first-order model, the second-order conditional probabilities are not needed, but they are included in the table for convenience since the same probability table will be used in later examples.
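The two phases of the decoder can be sketched as follows. This is our own minimal implementation of (8) and (9), not the paper's released software; the uniform prior in the demo is hypothetical (it is not Table IV), so the decoded estimate simply follows the likelihoods, and ties are broken deterministically rather than randomly.

```python
import math

def viterbi_first_order(s, log_p1, log_trans, q=0.5):
    """MAP cover estimate under the first-order model, eqs. (8)-(9).

    s         : stego pixel sequence
    log_p1    : dict, log p(c_1)
    log_trans : dict of dicts, log p(c_i | c_{i-1})
    """
    def log_lik(si, ci):  # eq. (7) in the log domain
        return math.log(1 - q / 2) if si == ci else math.log(q / 2)

    cands = lambda si: (si, si ^ 1)  # si and si with its LSB flipped
    # Phase 1, step 1: v(c1) = p(s1|c1) p(c1)                 -- eq. (8)
    v = {c: log_lik(s[0], c) + log_p1[c] for c in cands(s[0])}
    back = []
    # Phase 1, steps 2..n: maximize over the previous pixel    -- eq. (9)
    for si in s[1:]:
        nv, bp = {}, {}
        for ci in cands(si):
            prev = max(v, key=lambda cp: v[cp] + log_trans[cp][ci])
            nv[ci] = v[prev] + log_trans[prev][ci] + log_lik(si, ci)
            bp[ci] = prev
        v = nv
        back.append(bp)
    # Phase 2: trace back from the best final candidate
    cur = max(v, key=v.get)
    est = [cur]
    for bp in reversed(back):
        cur = bp[cur]
        est.append(cur)
    return est[::-1]

# Demo with a made-up uniform prior over pixel values 0..3:
vals = range(4)
log_p1 = {c: math.log(0.25) for c in vals}
log_trans = {a: {b: math.log(0.25) for b in vals} for a in vals}
est = viterbi_first_order((1, 3, 3, 2), log_p1, log_trans)
```

With a uniform prior the estimate equals the stego sequence itself, since keeping each pixel has likelihood 0.75 versus 0.25 for flipping; a learned, non-uniform prior is what allows a smoother path such as ĉ = (1, 3, 2, 2) in Fig. 2 to win instead.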


B. Second-Order Decoding

For higher-order cover models, different decoding algorithms must be used. We describe two decoding algorithms for the second-order model. The first algorithm was presented in the original work and is suboptimal [6]. It, however, motivates the development of the second, optimal algorithm presented here. Neither algorithm is limited to the second-order model; both can be extended to higher-order cover models straightforwardly. We first describe the suboptimal algorithm based on tree pruning. Its analysis yields insights for the construction of the optimal algorithm.

1) Pruning Decoding: A straightforward algorithm for finding the most likely estimate is to enumerate all possible estimates and choose the one with the best score. This process can be seen as constructing a tree of candidate pixels. To illustrate this process, we use the same example where s = (1, 3, 3, 2) and the payload is q = 0.5 bpp. The probabilities are shown in Table IV in the Appendix. The tree is shown in Fig. 3. At level 0, we have the root node. Level 1 is constructed by extending the root node with candidate nodes. In the example, since s_1 = 1, two candidate nodes 0 and 1 are added to the root node. At level 2, candidate nodes 2 and 3 are added to each node in level 1 to form four possible paths. The process continues through level n, at which point the full tree is formed. The path from the root node (ignoring the root node) to each leaf node corresponds to a candidate cover sequence. The path with the best score is then the most likely estimate ĉ = (1, 3, 2, 2), highlighted with gray. It is clear that by enumerating all possible candidates, the number of paths is exponential in n. This is intractable for any reasonably sized stego sequence. A solution to this problem is pruning. At each level, only the best k paths are kept and the remaining paths are pruned from the tree.
The drawback of pruning is the risk of eliminating optimal sequences; any solution obtained subsequently is therefore suboptimal. We refer to this strategy as best-k pruning. Since the number of paths at any level is upper bounded by k, which is independent of n, and there are n steps in the algorithm, the run-time of the algorithm is O(n). The multiplicative constant in O(n), however, may be large due to the overhead of keeping track of the best k paths.

2) Optimal Decoding: Is it possible to prune the tree in such a way that only suboptimal paths are removed? If this can be done in linear time, then we are guaranteed to find an optimal solution efficiently. In Fig. 3, we observe the following. At level 3, when we compute the scores for paths (0, 2, 2) and (1, 2, 2), only one of these two paths should continue and the other path must be pruned. To see why, let us assume that path (1, 2, 2) has a higher score than path (0, 2, 2). Then for any candidate pixel c_4, the score of path (1, 2, 2, c_4) is at least as high as that of path (0, 2, 2, c_4). This is because the score of path (1, 2, 2, c_4) is the score of path (1, 2, 2) times the product p(c_4 | 2, 2) p(s_4 | c_4), and the score of path (0, 2, 2, c_4) is the score of path (0, 2, 2) times the same product p(c_4 | 2, 2) p(s_4 | c_4). Since path (1, 2, 2) has a higher score than path (0, 2, 2), the result follows. Similar arguments apply to other pairs of paths that have identical candidate pixels c_2 and c_3, such as (0, 2, 3) and


Fig. 4. Second-order optimal decoding example with s = (1, 3, 3, 2) and payload q = 0.5 bpp. States formed by tuples (c_{i−1}, c_i) at step i > 1 are shown as nodes in column i. The number outside of each node shows the logarithm (base e) of its score. A solid line denotes a winning path and a dotted line denotes a non-winning path. The most likely cover estimate ĉ = (1, 3, 2, 2) is highlighted with gray.

(1, 2, 3). Furthermore, this pruning procedure applies at every subsequent level i. That is, for any two paths ending with (. . . , c_{i−2}, c_{i−1}, c_i) and (. . . , c′_{i−2}, c_{i−1}, c_i), where c_{i−2} ≠ c′_{i−2}, the higher-score path is kept and the lower-score path is pruned, because the scores of future paths depend only on c_{i−1} and c_i, which are identical in both paths. With these observations, we have the following optimal second-order algorithm, which consists of n steps. In step 1, we compute the score v(c_1) for each candidate cover pixel c_1 using (8). In step 2, we compute the score v(c_1, c_2) for each candidate pair c_1 and c_2:

v(c_1, c_2) = v(c_1) p(c_2 | c_1) p(s_2 | c_2).    (10)

Then in every subsequent step i, we compute

v(c_{i−1}, c_i) = max_{c_{i−2}} v(c_{i−2}, c_{i−1}) p(c_i | c_{i−1}, c_{i−2}) p(s_i | c_i).    (11)

We note the similarity between (9) and (11). In the latter, rather than maximizing over c_{i−1}, we maximize over c_{i−2} to accommodate the second-order model. Also, the hidden states of the HMM are no longer the c_i's, but the tuples formed by (c_{i−1}, c_i). It should be clear that the algorithm presented here applies to higher-order models as well. For the k-order model, for example, the hidden states are the vectors formed by (c_{i−k+1}, . . . , c_{i−1}, c_i) and the maximization in (11) is taken over c_{i−k}. With this generalization, the first-order decoder can be seen as a special case where k = 1. We illustrate the key steps of the algorithm using the same example where s = (1, 3, 3, 2) and the payload is q = 0.5 bpp. The decoding process forms a trellis as shown in Fig. 4. In the first column, we have two states corresponding to the two candidate pixels 0 and 1. The score (logarithm base e) of each state is shown as the number next to it. In column 2, we have four nodes corresponding to four hidden states formed


Fig. 3. Tree construction example with s = (1, 3, 3, 2) and q = 0.5 bpp. The tree is formed by enumerating all possible candidate cover sequences. Each path from the root node to a leaf node corresponds to a candidate cover sequence. The logarithm (base e) of its score is shown as the number below it. The most likely cover estimate ĉ = (1, 3, 2, 2) is highlighted with gray.

by (c_1, c_2). Similarly, column 3 has four hidden states. The winning path is denoted with a solid line and the non-winning (pruned) path is denoted with a dotted line. At the end of step 4, the most likely candidate sequence is readily identified by tracing back along the solid lines from the best-scoring node to obtain ĉ = (1, 3, 2, 2), highlighted with gray. The number of states at any step i > 1 is the number of tuples formed by candidate pixels (c_{i−1}, c_i) (four for LSB replacement), which is independent of n. There are n steps in the algorithm. Therefore, in linear time, the algorithm finds an optimal solution for the second-order model. As noted earlier, the optimality and linear-time properties of the decoder hold for higher-order models.

C. Extension to LSB Matching

To find the most likely estimate for LSB matching stego images, we only need to use a different likelihood probability function:

p(s_i | c_i) = 1 − q/2,  if s_i = c_i,
               q/4,      if s_i = c_i ± 1 and 1 ≤ c_i ≤ 254,
               q/2,      if (c_i = 0 and s_i = 1) or (c_i = 255 and s_i = 254),
               0,        otherwise.    (12)

For payload location, the residuals in (1) apply only to LSB replacement steganography due to the asymmetry of LSB replacement. In our current scheme, the residual r_i indicates whether the cover estimate and the stego image are different at location i. To extend it to LSB matching steganography, we adapt r_i to

r_i = I(s_i, ĉ_i),    (13)

where I(x, y) is an indicator function:

I(x, y) = 1,  if x ≠ y,
          0,  otherwise.    (14)
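A direct transcription of (12)–(14) for 8-bit pixels (the function names are ours):

```python
def likelihood_lsbm(s_i, c_i, q=0.5):
    """p(s_i | c_i) from (12) for LSB matching on 8-bit pixels."""
    if s_i == c_i:
        return 1 - q / 2
    if abs(s_i - c_i) == 1 and 1 <= c_i <= 254:
        return q / 4                 # +1 and -1 are equally likely
    if (c_i == 0 and s_i == 1) or (c_i == 255 and s_i == 254):
        return q / 2                 # boundary pixels can only move inward
    return 0.0

def residual(s_i, c_hat_i):
    """r_i = I(s_i, c^_i) from (13)-(14): 1 iff stego and estimate differ."""
    return int(s_i != c_hat_i)
```

For q = 0.5, an interior pixel stays put with probability 0.75 and moves up or down with probability 0.125 each, while a boundary pixel moves inward with probability 0.25.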

This new form of r_i is more general, as it can be used in place of (1) to compute residuals for both LSB replacement and LSB matching stego images.

D. Implementation

We point out two details regarding the implementation of the above algorithms. First, throughout our examples, we have hinted at using log-probabilities instead of probabilities. This avoids numerical underflow, as products of probabilities converge to zero quickly. Second, during the learning phase, where we compute prior probabilities from known cover images, it is likely that we do not encounter all possible cover sequences. In such an event, we end up with probabilities equal to zero, which may be problematic in the decoding phase. This is a missing-data problem and is part of the set of problems collectively referred to as the curse of dimensionality in the machine learning literature. Higher-order models suffer from this problem more than lower-order models. To minimize its impact in the decoding step, we smooth the probabilities so that all probabilities are non-zero. In Fig. 5, we plot the conditional probabilities p(c_i | c_{i−1} = 128) computed using a set of images from the BOSSbase database. We observe that zero probabilities occur toward the tail ends of the distribution, away from the peak. Other conditional probabilities exhibit similar characteristics. The symmetry, of course, breaks down as c_{i−1} approaches 0 or 255. While several techniques can be used to estimate distributions, such as maximum-likelihood estimation, a detailed investigation is beyond the scope of this work. Rather, we observe that at the tail ends, the shape is almost straight. Therefore, to fill in the zero probabilities with non-zero values, we use linear interpolation. To be clear, we apply linear interpolation to the counts prior to computing the probabilities. This way, the computed probabilities are true probabilities, i.e., they sum to one.
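The count-level interpolation can be sketched as follows. This is our own minimal version: the paper applies smoothing per conditional distribution, and how the extreme tails beyond the outermost non-zero counts are handled is our choice here (np.interp clamps them to the nearest non-zero count, which keeps every probability positive).

```python
import numpy as np

def smooth_counts(counts):
    """Fill zero bins of a count histogram by linear interpolation over
    the non-zero bins, then normalize so the probabilities sum to one."""
    counts = np.asarray(counts, dtype=float)
    nz = np.flatnonzero(counts)              # indices of observed bins
    filled = np.interp(np.arange(counts.size), nz, counts[nz])
    return filled / filled.sum()

p = smooth_counts([0, 2, 8, 0, 4, 0])  # zeros in a gap and at the tails
```

Every entry of p is now strictly positive and p sums to one, so the decoder never multiplies by a zero prior.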



Fig. 5. Conditional probabilities p(c_i | c_{i−1} = 128). Zero probabilities occur near the tail ends of the distribution.

We have implemented all three decoders in Java. On an Intel Xeon 2.66 GHz CPU, on average, the first-order decoder takes slightly more than half a second to decode a 512 × 512 image in all four directions. The second-order best-20 pruning decoder takes less than 6 seconds and the second-order optimal decoder takes less than 1 second. The software and documentation are available for download [8]. For portability, the software only supports raw, grayscale, uncommented PGM images.

IV. EXPERIMENTS

We now apply our decoders to locate payload using the method of residuals. We randomly select 8074 images out of the 9074 cover images in the BOSSbase database to learn the probabilities. The remaining 1000 cover images are used for testing. The random selection process ensures that the images are unbiased, i.e., there is a fair mixture of scenes and textures. The important point is that the test set must not be used in the training step. For the test set, a fixed payload of 0.5 bpp (131072 bits) is used to embed random bits using LSB replacement with the same key to generate 1000 LSB replacement stego images. Using the same procedure, we generate another 1000 stego images using LSB matching. We present the results for each embedding scheme separately and contrast our results with existing cover estimators. Then we show the results without interpolating the zero probabilities.

A. LSB Replacement

For each MAP decoder (first-order, second-order best-20 pruning, and second-order optimal) and for each LSB replacement stego image, we obtain four cover estimates from the four directions: →, ←, ↓, ↑. All four estimates are used to compute the residuals. In Fig. 6, we plot the histograms of the mean residuals using 1000 stego images. For comparison, we also plot the histogram of the mean residuals using the four-neighbor average estimator [3]. For the four-neighbor average estimator, we begin to see the formation of the two peaks: one at 0 and the other at 0.5.
The number of correctly identified payload pixels (true positives) is 112886 (86.13%) and the number of non-payload pixels identified as payload pixels (false positives) is 22892 (17.47%), where we have used a threshold r_{i·} > 0.25 to identify load-carrying pixels due to the symmetry of the residuals. In contrast, the separation

Fig. 7. Heat map of the mean residuals in the top-left 20 × 20 region for the first-order decoder, with payload locations marked by white dots, using N = 1000 stego images and payload q = 0.5 bpp. Locations with large mean residuals correspond to the load-carrying pixels.

TABLE I
CORRECTLY IDENTIFIED PAYLOAD PIXELS FOR LSB REPLACEMENT

  N     First-Order         SO Pruning          SO Optimal
  1     65734  (50.15%)     65851  (50.24%)     65862  (50.25%)
  10    92229  (70.37%)     101333 (77.31%)     99864  (76.19%)
  100   119140 (90.90%)     121524 (92.72%)     122006 (93.08%)
  200   123568 (94.27%)     124342 (94.87%)     125365 (95.65%)
  300   126506 (96.52%)     125496 (95.75%)     126919 (96.83%)
  400   127982 (97.64%)     126959 (96.86%)     128233 (97.83%)
  500   128944 (98.38%)     127897 (97.58%)     129166 (98.55%)
  1000  130464 (99.54%)     129466 (98.77%)     130269 (99.39%)

of the two peaks is apparent for all three MAP decoders. The locations of the first and second peaks are not at 0 and 0.5, respectively. This is because our estimators are not unbiased; they are model and image dependent. Assuming the payload q is known, the payload pixels correspond to the nq locations with the largest mean residuals. In Fig. 7, we show the heat map of the mean residuals in the top-left 20 × 20 region for the first-order decoder. Payload locations are indicated with a white dot. This confirms that the locations with the largest mean residuals are indeed the load-carrying pixels. The accuracy of payload location is sensitive to the number of available stego images. We show the number of correctly identified load-carrying pixels as a function of the number of stego images for the first-order, second-order pruning, and second-order optimal decoders in columns 2, 3, and 4 of Table I, respectively. The results for all three decoders are quite similar. It is apparent that as more stego images become available, better accuracy is obtained: 90% payload-location accuracy is achieved with 100 stego images, and 99% accuracy with 1000 stego images. This is clearly better than using the four-neighbor average estimator, but is far from the ideal number of 19.
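Given the mean residuals and a known payload size, the selection step described above is a one-liner (a sketch with made-up numbers; the function name is ours):

```python
import numpy as np

def locate_payload(mean_residuals, n_bits):
    """Return the indices of the n_bits largest mean residuals;
    these are taken as the load-carrying pixels (Section IV-A)."""
    return np.argpartition(mean_residuals, -n_bits)[-n_bits:]

mr = np.array([0.02, 0.31, 0.05, 0.47, 0.29, 0.01])  # toy mean residuals
loc = locate_payload(mr, 2)
```

When the payload size is unknown, a fixed threshold such as r_{i·} > 0.25 can be used instead, as above.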



Fig. 6. Histograms of the mean residuals with payload q = 0.5 bpp for N = 1000 stego images using (a) four-neighbor average estimator, (b) first-order decoder, (c) second-order best-20 pruning decoder, (d) second-order optimal decoder.

B. LSB Matching

As discussed earlier, with the likelihood probabilities in (12), the same decoders can be used to obtain cover estimates for LSB matching stego images. To this end, we perform the same experiment using LSB matching stego images. For comparison, we also use the Wavelet Absolute Moments (WAM) payload locator [5], [9], which is the state-of-the-art payload locator for LSB matching steganography. Briefly, the WAM payload locator works as follows. A first-stage wavelet decomposition is performed on the stego image using the 8-tap Daubechies filter, resulting in four subbands L, H, V, and D corresponding to the low-frequency, horizontal, vertical, and diagonal subbands, respectively. The WAM filter is applied to subbands H, V, and D. The low-frequency subband L is set to zero. The residuals are obtained by applying the inverse wavelet transform to the subbands, and their absolute values are used to locate payload. To avoid near-edge problems, the WAM locator reflects pixels at the borders of the stego image to achieve the best accuracy. We also use border reflection in our experiment. In Table II, we show the number of correctly identified load-carrying pixels as a function of the number of stego images for the first-order decoder, the second-order optimal decoder, and the WAM payload locator. The WAM payload locator can locate only 72% of the payload even with 1000 stego images. In contrast, both the first- and second-order decoders achieve an accuracy of more than 90%. Comparing the results

TABLE II C ORRECTLY IDENTIFIED PAYLOAD PIXELS FOR LSB MATCHING N 1 10 100 200 300 400 500 1000

First-Order 65566 77536 107682 113920 113883 114578 115540 121911

(50.02%) (59.16%) (82.15%) (86.91%) (86.89%) (87.42%) (88.15%) (93.01%)

SO Optimal 65557 93687 113009 114702 115089 117602 118825 121031

(50.02%) (71.48%) (86.22%) (87.51%) (87.81%) (89.72%) (90.66%) (92.34%)

WAM 65956 90725 93714 92666 92221 93705 93376 93949

(50.32%) (69.22%) (71.50%) (70.70%) (70.36%) (71.49%) (71.24%) (71.68%)

to those in Table I for LSB replacement shows that it is more difficult to locate payload embedded by LSB matching than LSB replacement. This coincides with the fact that LSB matching is known to be more difficult to detect than LSB replacement. Perhaps the difficulty lies in the symmetry of LSB matching that is not present in LSB replacement leading to more uncertainty in the decoding process. C. Effect of Smoothing The dimension of the probability space for the first-order model is 2562 while the dimension of the probability space for the second-order model is 2563 . This is a significant increase in volume going from the first to the second-order model. As such, we expect to have many more zero probabilities in


TABLE III
CORRECTLY IDENTIFIED PAYLOAD PIXELS WITHOUT INTERPOLATION

   N    First-Order        SO Pruning         SO Optimal
   1     65734 (50.15%)     65851 (50.24%)     65863 (50.25%)
  10     92229 (70.37%)    101562 (77.49%)     99911 (76.23%)
 100    119140 (90.90%)    121566 (92.75%)    121908 (93.01%)
 200    123568 (94.27%)    123982 (94.59%)    125089 (95.44%)
 300    126505 (96.52%)    124921 (95.31%)    126599 (96.59%)
 400    127982 (97.64%)    126575 (96.57%)    128127 (97.75%)
 500    128944 (98.38%)    127605 (97.35%)    129045 (98.45%)
1000    130466 (99.54%)    129392 (98.72%)    130251 (99.37%)
the second-order model than in the first-order model. Using 8074 cover images, the first-order model has 1268 conditional probabilities that are zero (less than 2% of the probability space). In contrast, the second-order model has 5110414 conditional probabilities that are zero (30% of the probability space). This implies that we are more likely to encounter unseen sequences during decoding with the second-order model than with the first-order model. As discussed earlier, to minimize the impact of this problem, we use linear interpolation. Here, we present the results without interpolation.

In Table III, we show the payload location accuracy on LSB replacement stego images for the first-order, second-order pruning, and second-order optimal decoders without interpolation. Comparing these results with those in Table I shows that interpolation has virtually no effect on the first-order decoder, which is what we expected since the first-order model has relatively few zero conditional probabilities. Interpolation does slightly increase the accuracy of the second-order decoders, especially the second-order pruning decoder. The gain is slight because interpolation only affects the zero probabilities, which lie at the tail ends of the distribution; changing these zero probabilities to non-zero probabilities has a negligible effect since the probabilities at the tail ends are almost zero anyway. Nonetheless, interpolation should be applied to avoid problems with sequences unseen during training, which are even more likely to occur in higher-order models.

V. CONCLUSION

Many problems in steganalysis require estimating the cover image from the stego image. One such problem is payload location. We proposed several optimal, linear-time algorithms for finding the most likely cover estimate using MAP estimation based on finite-order Markov cover models. Experimental results showed that these cover estimates can be used effectively to locate payload.
The fact that we can locate payload with the first-order model suggests that some statistical properties of cover images are captured even by such a simple model and that the embedding process violates these properties. We should emphasize that the presented approach is not restricted to digital images and may be applied to other media such as digital audio and video. Although we showed that our proposed method can be extended to higher-order cover models, there are practical

challenges that still need to be addressed. Assuming that we use four bytes to store each probability, the second-order model needs 4(256^3) ≈ 67 MB. While this memory requirement is relatively small, a third-order model would need 4(256^4) ≈ 17 GB, which would be a challenge even for high-end workstations. We believe that cover estimation can be further improved using higher-order cover models. For the payload location problem, however, using higher-order models is not as critical: our experiments showed that with 100 or more stego images, both first and second-order models performed equally well. This is an inherent consequence of the method of residuals. In the limit, as N → ∞, these methods, including the four-neighbor average, can all locate payload with high accuracy. The advantage of the second-order model is seen when the number of stego images is small: with 10 LSB matching stego images, the second-order model outperformed the first-order model by more than 10%.

Cover estimation can be applied to other problems in steganalysis and data hiding where information about the cover source is useful. A recent state-of-the-art steganalysis detector, for example, relied on features based on residuals computed as averages of neighboring pixels [10]. It may be possible to improve the accuracy of this detector by using residuals computed from MAP estimates.

Finally, our work is an important step toward a generative model of steganography, in which stego images are constructed from the statistical properties of cover images. In contrast, existing steganography schemes modify a cover image to produce the stego image. A recent work on practical embedding methods based on Syndrome-Trellis Coding suggested that embedding capacity has reached its limit [11]. With a generative model, we may be able to achieve a higher capacity. We leave this problem to future work.
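The storage figures above follow directly from the size of a dense conditional-probability table; a quick sanity check (assuming 4 bytes per probability, as in the text):

```python
def model_bytes(order, alphabet=256, bytes_per_prob=4):
    """Bytes needed for a dense table p(c_i | c_{i-1}, ..., c_{i-order})
    over pixel values 0..alphabet-1."""
    return bytes_per_prob * alphabet ** (order + 1)

print(model_bytes(2))  # second-order: 67108864 bytes, about 67 MB
print(model_bytes(3))  # third-order: 17179869184 bytes, about 17 GB
```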
APPENDIX
EXAMPLE PROBABILITIES

TABLE IV
EXAMPLE PROBABILITIES

ci   p(ci)
0    1/4
1    1/4
2    1/4
3    1/4

ci-1  ci   p(ci|ci-1)
0     0    2/5
0     1    3/10
0     2    1/5
0     3    1/10
1     0    3/10
1     1    2/5
1     2    1/5
1     3    1/10
2     0    1/10
2     1    1/10
2     2    3/5
2     3    1/5
3     0    1/16
3     1    1/16
3     2    3/4
3     3    1/8

ci-2  ci-1  ci   p(ci|ci-1, ci-2)
0     2     2    7/8
0     2     3    1/8
0     3     2    1/2
0     3     3    1/2
1     2     2    4/5
1     2     3    1/5
1     3     2    9/10
1     3     3    1/10
2     2     2    9/10
2     2     3    1/10
2     3     2    2/5
2     3     3    3/5
3     2     2    7/10
3     2     3    3/10
3     3     2    1/4
3     3     3    3/4

REFERENCES

[1] J. Fridrich and M. Goljan, "On estimation of secret message length in LSB steganography in spatial domain," in Security, Steganography, and Watermarking of Multimedia Contents VI, vol. 5306. SPIE, 2004, pp. 23–34.
[2] A. D. Ker and R. Böhme, "Revisiting weighted stego-image steganalysis," in Security, Forensics, Steganography and Watermarking of Multimedia Contents X, vol. 6819. SPIE, 2008.
[3] A. D. Ker, "Locating steganographic payload via WS residuals," in 10th Multimedia and Security Workshop. ACM, 2008, pp. 27–31.
[4] J. Fridrich, M. Goljan, and D. Soukal, "Searching for the stego-key," in Security, Steganography, and Watermarking of Multimedia Contents VI, vol. 5306. SPIE, 2004, pp. 70–82.
[5] A. D. Ker and I. Lubenko, "Feature reduction and payload location with WAM steganalysis," in Media Forensics and Security XI, vol. 7254. SPIE, 2009, pp. 0A01–0A13.
[6] T.-T. Quach, "On locating steganographic payload using residuals," in Media Watermarking, Security, and Forensics XIII, vol. 7880. SPIE, 2011.
[7] T. Filler, T. Pevný, and P. Bas, "Break our steganography system," July 2010, http://boss.gipsa-lab.grenoble-inp.fr.
[8] T.-T. Quach, "Optimal cover estimation," 2011, http://ece.unm.edu/~tuthach/decoder.html.
[9] M. Goljan, J. Fridrich, and T. Holotyak, "New blind steganalysis and its implications," in Security, Steganography, and Watermarking of Multimedia Contents VIII, vol. 6072. SPIE, 2006.
[10] J. Kodovský and J. Fridrich, "Steganalysis in high dimensions: fusing classifiers built on random subspaces," in Media Watermarking, Security, and Forensics XIII, vol. 7880. SPIE, 2011.
[11] T. Filler, J. Judas, and J. Fridrich, "Minimizing embedding impact in steganography using trellis-coded quantization," in Media Forensics and Security XII, vol. 7541. SPIE, 2010.
