Abstract: Almost all existing multiple description coding (MDC) schemes are designed for media streaming over the Internet. In this work, a wavelet-based video MDC technique is introduced that fits the requirements of media streaming over peer-to-peer networks. The proposed method assigns descriptions to the senders according to their characteristics (i.e., bandwidth and availability). In contrast to traditional MDC, different descriptions in the proposed method have different importance in reconstructing the original media. Our simulation results show a considerable improvement in video quality at the receiver (up to 10 dB) compared to the state of the art.

Index Terms: Hierarchical Layered Encoding, Multiple Description Coding, Peer-to-Peer Networks, Wavelet Transform.

I. INTRODUCTION

Due to the growing popularity of peer-to-peer (P2P) networks, real-time transmission of multimedia over these networks has attracted many researchers [1,2]. The P2P idea was first widely deployed in Napster [3], followed by many other systems such as Gnutella [4] and Chord [5]. Unlike the traditional approach of downloading the whole video before watching it [6], real-time streaming aims to let the user start watching the media as soon as she connects to the server [7,8]. Because of the inherent limitations of P2P networks (e.g., limited upload bandwidth and sudden failure of peers), a reliable, high-quality real-time streaming scheme over such networks must use several senders [9,10]. This way, the effect of the failure of one or two senders on the receiver's satisfaction can be reduced considerably.

Manuscript received January 31, 2007. This work was supported in part by the Iran Telecommunication Research Center (ITRC) and Advanced Communications Research Institute (ACRI). M. H. Firooz is with the School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran (corresponding author; phone: +98-913-116-1340; e-mail: mhfirooz@yahoo.com). K. Ronasi is with the School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran (e-mail: [email protected], [email protected]). M. R. Pakravan is with the Electrical Engineering Department, Sharif University of Technology, Tehran, Iran (e-mail: [email protected]). A. N. Avanaki is with the School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran (e-mail: [email protected]).

In a multi-sender approach, each sender should send a part of the media. A number of algorithms determine which part of the media is sent by which sender. For example, Itaya et al. [11,12] divided the output stream of a video compression method among the senders and added enough redundancy to each part to make it recoverable at the receiver. Although their algorithm is intuitively simple, it is hard to implement and requires a considerable increase in the output bit-rate. In some previous work [13,14], hierarchical layered coding (HLC) is used for this purpose. In an HLC approach (a.k.a. scalable video coding), a video is divided into multiple layers. At the receiver, decoding each layer (and thereby improving the quality of the received media) requires the previous layer. The base (lowest) layer provides the basic quality and is necessary for decoding the stream [15]. Therefore, the base layer should be transmitted through a reliable and error-free channel. Unfortunately, there is no such channel on a P2P network: no node can be guaranteed to always remain on the network to bear the responsibility of base layer transmission.

For media streaming over P2P networks, multiple description coding (MDC) in a multi-sender scenario has proved to be an appropriate choice [16,17]. MDC methods that are designed for streaming over the Internet (e.g., over a CDN [18]) are, however, not suitable for P2P. This is because the senders in the two settings have different characteristics: the senders on a P2P network, in contrast to the servers on the Internet, are ordinary nodes with limited computational power and (shared) bandwidth that may leave the network at any time with considerable probability [19]. Since the traditional MDC approaches are designed for reliable senders (members of a CDN, for example), by design they can only mitigate path congestion and packet loss [20,21]. That is why MDC is not commonly applied in P2P networks.
MDC-based video streaming in CDN and P2P networks is compared in [22], which claims that P2P-based streaming performs better when the fraction of nodes holding the desired video is relatively high (40% in their simulation). In such a case, the chance of having a sender close to the receiver increases dramatically; hence, packet loss and end-to-end delay are considerably reduced. However, 40% content availability is an unrealistic assumption.

In this paper, we introduce an MDC method that is designed to be used in P2P networks. We show that the proposed method provides an acceptable video quality at the receiver. We also compare our proposed method to IF-MDVC [23] and WBMDVC [24], two MDC methods designed for video transmission over IP, and show a 12 dB SNR improvement of the received video. In contrast to the existing MDC algorithms, the descriptions are not equally important in our method, so the more important descriptions can be sent by more reliable senders. Therefore, we call the proposed method Un-balanced Un-equivalent MDC (UUMDC).

Wavelet coefficients of an image carry different amounts of information for the reconstruction of that image. Chang and Lu [25] proposed a more flexible HLC that uses wavelet-based compression instead of MPEG-2 with only two levels of hierarchy. Our work differs from [25] in that it is designed to operate over multiple nodes (senders). Moreover, our method provides multiple descriptions of variable importance rather than hierarchical layers, which are not suitable for P2P networks. In addition to providing high-quality video at the receiver, UUMDC is transparent to video compression and can be easily implemented. Because of its wide usage, we have used MPEG-2 as our video compression engine, tweaked slightly to be compatible with wavelet coefficients rather than natural footage.

The rest of the paper is organized as follows. In Section II, the proposed method is introduced. Section III explains the modifications to MPEG for our purpose. In Section IV, simulation results are presented and the proposed method is compared to the existing methods. The paper is concluded in Section V.

II. PROPOSED METHOD

The use of a multi-sender algorithm for media streaming over P2P networks is inevitable, as the senders are unreliable. The part sent by each sender should be self-contained in the sense that it can be decoded independently of the other parts (sent by other senders). That is why we choose MDC among the various classes of video coding methods.
The basic idea of MDC is to generate multiple descriptions of the source such that each describes the source with a certain fidelity. When more than one description is available, they can be combined to enhance the quality of the received media. The availability probability of senders on P2P networks varies considerably. Since the availability probability of a peer reflects the extent of its participation in the network, it indicates the peer's reliability. Intuitively, to deliver as much information about the desired media as possible to the receiver through such a network, the more important information should be sent through the more reliable senders (those with higher availability probability and/or committed bandwidth) and the less important information through the less reliable senders.

The existing MDC methods, however, generate equally important descriptions. Therefore, we construct a new method from MDC and HLC that combines the desirable properties of the two and is well suited to our application. That is, our method, like MDC, divides the media into parts that can be decoded independently and, like HLC, lets the importance of each part in media reconstruction vary.

In the wavelet transform of a natural image, the lower-frequency coefficients have higher importance, because natural images are low-pass. On the other hand, the high-frequency coefficients (closely related to the edges) alone can produce a silhouette of the image (Figure 1). Furthermore, there is a close correlation between the motion in one sub-band and the motion in the other sub-bands, which can be exploited to reconstruct lost sub-bands.

Figure 1: Only the high-frequency coefficients of the right image (Barbara) are used to make the left one (wavelet transform: 2-level db2; LL coefficients are set to zero).

Popular video compression techniques such as MPEG-x or H.26x do not use the wavelet transform. Thus, we used the structure shown in Figure 2. Some operations, such as scaling and partitioning into descriptions, should be performed in the pre-processing block, whose output must also be transmitted to the receiver.

Figure 2: The structure of the proposed algorithm (blocks: pre-processing, compressor, decompressor, post-processing).

The wavelet coefficients are calculated by convolving the input image with a set of filters (see [24,26] for details); dbx is an abbreviation for the Daubechies wavelet of order x. The coefficients of a wavelet transform are not in the range [0, 255]. To make the blocks compatible with video compression, the coefficients must be scaled. The jth sub-band in the mth level of the wavelet transform should be scaled as follows:

$$B = \mathrm{fix}\left( (B - \mathrm{MIN}_j^m) \times 255 / \mathrm{MAX}_j^m \right) \tag{1}$$

where $\mathrm{MIN}_j^m$ and $\mathrm{MAX}_j^m$ are the minimum and maximum values of that sub-band, respectively (see Part C of this section). For the compressor block we used MPEG because of its popularity. Simulation results show that direct application of MPEG to the wavelet blocks decreases the video quality only slightly at a constant output bit-rate. This distortion can be tolerated to avoid extra system complexity.

A. Block Allocation According to Peers’ Bit-Rates

First, the receiver should determine the number of levels of the wavelet transform as follows:

$$n' = \frac{R_r}{\min_{p \in P_{active}} (R_p)}, \qquad n = \log_4 n' \tag{2}$$

After that, all sub-bands are divided into blocks of the size of the LL sub-band. Figure 3 shows such a partition for n = 2.
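As a rough illustration of equation (2), the following Python sketch computes n' and the number of wavelet levels. Several points are assumptions rather than statements from the text: R_r is read as the bit-rate requested by the receiver, the rates in the example are made-up values, the function name `wavelet_levels` is hypothetical, and the result is rounded up to an integer because the text does not say how non-integer values of log_4 n' are handled.

```python
import math


def wavelet_levels(receiver_rate, active_peer_rates):
    """Return (n_prime, levels, block_count) in the spirit of equation (2).

    receiver_rate     -- R_r, assumed to be the rate requested by the receiver
    active_peer_rates -- R_p for every active peer p
    """
    r_min = min(active_peer_rates)
    n_prime = receiver_rate / r_min
    # log base 4 written as log2(x) / 2, which is exact for powers of 4.
    exact_levels = math.log2(n_prime) / 2
    # Assumption: round up to a whole number of levels, with at least one.
    levels = max(1, math.ceil(exact_levels))
    # Splitting every sub-band into LL-sized blocks yields 4**levels blocks.
    return n_prime, levels, 4 ** levels


# Illustrative (made-up) rates: n' = 4800 / 300 = 16, so n = 2 and 16 blocks.
print(wavelet_levels(4800, [300, 450, 600]))
```

With these example rates the sketch gives n = 2, the case drawn in Figure 3.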

B. Prediction of Lost Blocks

If a description is lost, parts of a sub-band (or a number of sub-bands) are lost. The lost parts can be reconstructed as follows. First, the residual matrix of the lost parts is set to zero. Then, the motion vectors of the blocks of the lost sub-band are predicted from the other sub-bands. Suppose the lost sub-band belongs to level K (K = 1, 2, ..., n) and its index in this level is m (m = 1, 2, 3 and, if K = n, 4). If K = n (recall that n is the number of levels of the wavelet transform), there are 4 sub-bands in the level. Let $a_1$, $a_2$ and $a_3$ be the motion vectors of the healthy sub-bands. The lower the index, the higher the importance of the sub-band (Figure 5). The predicted motion vector of the lost sub-band is calculated using the formula below:

$$a_{lost} = \begin{bmatrix} \frac{1}{7} & \frac{2}{7} & \frac{4}{7} \end{bmatrix} \cdot \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \tag{4}$$

If K < n, there are only 3 sub-bands and the motion vector of the lost sub-band is calculated as:

$$a_{lost} = \begin{bmatrix} \frac{2}{8} & \frac{3}{8} \end{bmatrix} \cdot \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \tag{5}$$

Note that only sub-bands of the same level are used for prediction. This is because the sub-bands in a level have the same size. Using the motion vectors of the other levels increases the system complexity (because down-sampling or up-sampling is required) and, in our experience, has a very limited effect on the video quality.

Figure 3: All blocks are divided into parts with the size of the LL block.
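The weighted prediction of equations (4) and (5) can be sketched in a few lines of Python. This is only an illustration: the function name, the representation of a motion vector as a (dx, dy) tuple, and the example vectors are assumptions, and the residual handling of the actual codec is not modelled.

```python
from fractions import Fraction

# Weights taken from equations (4) and (5). In the K < n case the remaining
# 3/8 belongs to the zero motion vector, so it contributes nothing to the sum.
WEIGHTS_COARSEST = (Fraction(1, 7), Fraction(2, 7), Fraction(4, 7))
WEIGHTS_OTHER = (Fraction(2, 8), Fraction(3, 8))


def predict_lost_motion_vector(healthy_vectors, is_coarsest_level):
    """Predict the (dx, dy) motion vector of a block in a lost sub-band.

    healthy_vectors   -- motion vectors a1, a2 (and a3) of the co-located
                         blocks in the surviving sub-bands of the same level
    is_coarsest_level -- True when K = n (three healthy sub-bands remain)
    """
    weights = WEIGHTS_COARSEST if is_coarsest_level else WEIGHTS_OTHER
    if len(healthy_vectors) != len(weights):
        raise ValueError("expected %d healthy motion vectors" % len(weights))
    dx = sum(w * v[0] for w, v in zip(weights, healthy_vectors))
    dy = sum(w * v[1] for w, v in zip(weights, healthy_vectors))
    return float(dx), float(dy)


# K = n: three healthy sub-bands, weights 1/7, 2/7, 4/7.
print(predict_lost_motion_vector([(7, 0), (7, 7), (0, 7)], True))
# K < n: two healthy sub-bands, weights 2/8 and 3/8.
print(predict_lost_motion_vector([(8, 0), (8, 8)], False))
```

Exact fractions are used for the weights so that the example values are not blurred by rounding; a real decoder would presumably round the result to the motion-vector precision it supports.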

Node k should transmit $n_k$ of these blocks, computed in equation (3) from $n'$, $\log_4 n'$ and the ratio $R_k / R_{min}$, in which

$$R_{min} = \min_{i} (R_i) \quad \text{over all senders } i$$

1 2 5 6
3 4 7 8
9 10 13 14
11 12 15 16

Figure 4: Order of assignment to high-availability senders.

The receiver allocates sub-blocks to the senders in the order shown in Figure 4. The more important parts (closer to the top-left of the wavelet coefficient square) are transmitted by the more reliable senders using this order of assignment.
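The order of assignment in Figure 4 coincides with a Z-order (Morton) traversal of the block grid, which the Python sketch below reproduces for n = 2. The allocation function is only an assumed illustration: sender names, availability values and per-sender block counts (playing the role of n_k) are made up, and the block counts are taken as inputs rather than derived from equation (3).

```python
def z_order_rank(row, col, levels):
    """1-based rank of block (row, col) in a 2**levels x 2**levels grid."""
    rank = 0
    for bit in range(levels):
        # Column bits fill the even positions, row bits the odd positions.
        rank |= ((col >> bit) & 1) << (2 * bit)
        rank |= ((row >> bit) & 1) << (2 * bit + 1)
    return rank + 1


def allocate_blocks(senders, levels):
    """Map each sender name to the block ranks it should transmit.

    senders -- list of (name, availability, quota) tuples; quota stands in
               for n_k and is taken as given in this sketch
    """
    total = 4 ** levels
    # Most reliable sender first, so it receives the most important blocks.
    ordered = sorted(senders, key=lambda s: s[1], reverse=True)
    allocation, next_rank = {}, 1
    for name, _availability, quota in ordered:
        last = min(next_rank + quota, total + 1)
        allocation[name] = list(range(next_rank, last))
        next_rank = last
    return allocation


grid = [[z_order_rank(r, c, 2) for c in range(4)] for r in range(4)]
for grid_row in grid:
    print(grid_row)

peers = [("p1", 0.35, 6), ("p2", 0.90, 4), ("p3", 0.60, 6)]
print(allocate_blocks(peers, 2))
```

In this example the most available sender (p2) receives blocks 1 to 4, i.e. the top-left quadrant that holds the coarsest-level sub-bands.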

Figure 5: Estimation of the lost part when the lost sub-band is the last sub-band, i.e., the LL block (labels in the figure: a1, a2, a3).

Note that, in equation (5), the sum of the weights is not 1: a weight of 3/8 is assigned to the zero motion vector (i.e., no motion).

C. Scaling Factors

Figure 7 depicts the calculation of the wavelet coefficients at the mth level using the coefficients of the (m-1)th level. Suppose $A_m^{i,j}$ is the matrix of wavelet coefficients in level m. Each row of $A_m^{i,j}$ is passed through a filter. At point A, we have the following signal:

$$\sum_{k=0}^{2I-1} A_m(k-n, i)\, g(k), \qquad i = 1, 2, \ldots, I$$

Since $A_m^{i,j} > 0$, the maximum of the above formula is given by:

$$\sum_{k=0}^{2I-1} A_m(k-n, i)\, g(k)\, [g(k) > 0], \qquad i = 1, 2, \ldots, I$$

where

$$[g(k) > 0] = \begin{cases} 1 & \text{if } g(k) > 0 \\ 0 & \text{else} \end{cases}$$

If $K_m$ denotes the maximum of $A_m^{i,j}$, then the maximum value of the signal at A can be computed as follows:

$$K_m \sum_{k=0}^{2I-1} g(k)\, [g(k) > 0]$$

Similarly, the maximum of the signal at point B is given by:

$$K_m \sum_{k=0}^{2I-1} g(k)\, [g(k) > 0] \cdot \sum_{k=0}^{2J-1} h(k)\, [h(k) > 0]$$

The maximum values of each sub-band at level m+1 can be calculated as follows:

$$LL: \; K_m \sum_{k=0}^{2I-1} h(k)\, [h(k) > 0] \cdot \sum_{k=0}^{2J-1} h(k)\, [h(k) > 0]$$

$$LH, HL: \; K_m \sum_{k=0}^{2I-1} g(k)\, [g(k) > 0] \cdot \sum_{k=0}^{2J-1} h(k)\, [h(k) > 0]$$

$$HH: \; K_m \sum_{k=0}^{2I-1} g(k)\, [g(k) > 0] \cdot \sum_{k=0}^{2J-1} g(k)\, [g(k) > 0]$$

The minimum values can be calculated in each sub-band by the following:

$$LL: \; K_m \sum_{k=0}^{2I-1} h(k)\, [h(k) < 0] \cdot \sum_{k=0}^{2J-1} h(k)\, [h(k) < 0]$$

$$LH, HL: \; K_m \sum_{k=0}^{2I-1} g(k)\, [g(k) < 0] \cdot \sum_{k=0}^{2J-1} h(k)\, [h(k) < 0]$$

$$HH: \; K_m \sum_{k=0}^{2I-1} g(k)\, [g(k) < 0] \cdot \sum_{k=0}^{2J-1} g(k)\, [g(k) < 0]$$

where:

$$[g(k) < 0] = \begin{cases} 1 & \text{if } g(k) < 0 \\ 0 & \text{else} \end{cases}$$

For m = 0 we have $K_m = 255$.

Figure 7: Wavelet coefficients in level m using previous coefficients (the points A and B referred to in the text are marked in the figure).

Figure 6: Quantization matrices for different sub-bands in Intra mode.
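To make the maximum bounds of Part C and the scaling of equation (1) concrete, here is a small Python sketch. It is illustrative only: the two-tap averaging and differencing filters are assumed example taps, not the db2 filters used in the paper; `fix` is read as truncation toward zero; the function names are hypothetical; and only the maximum formulas are covered.

```python
import math


def positive_tap_sum(taps):
    """Sum of the filter taps that are greater than zero."""
    return sum(t for t in taps if t > 0)


def max_bounds(k_m, h, g):
    """Maximum value of each sub-band at the next level, given K_m.

    h and g are the filter taps; h is the one used twice for the LL band.
    """
    pos_h, pos_g = positive_tap_sum(h), positive_tap_sum(g)
    return {
        "LL": k_m * pos_h * pos_h,
        "LH": k_m * pos_g * pos_h,
        "HL": k_m * pos_g * pos_h,
        "HH": k_m * pos_g * pos_g,
    }


def scale_coefficient(b, sub_min, sub_max):
    """Equation (1): fix((B - MIN) * 255 / MAX), with fix as truncation."""
    return math.trunc((b - sub_min) * 255 / sub_max)


# Assumed example taps (averaging / differencing pair), not the db2 filters.
h = [0.5, 0.5]
g = [0.5, -0.5]
bounds = max_bounds(255, h, g)  # K_0 = 255 for the input image
print(bounds)
print(scale_coefficient(63.75, 0, bounds["LH"]))
```

Because the bounds depend only on the filter taps and on K_0 = 255, a receiver that knows the filters could recompute them instead of receiving them, although the text does not state whether the scheme relies on this.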

III. COMPRESSOR

Since MPEG is designed for natural video, some changes are necessary when it is used with wavelet coefficients. In an MPEG encoder, each frame is encoded in one of the following modes:

Intra: an "intra" frame is encoded by itself, using the discrete cosine transform (DCT) and quantization.

Inter: an "inter" frame is encoded with respect to other frames. Inter-frame predicted pictures (P-pictures) are coded with reference to the nearest previously coded I-picture or P-picture, usually incorporating motion compensation to increase coding efficiency.

In P-frames, the prediction residual errors are fed to the DCT block and quantized by a flat quantization matrix (all entries = 16). This is because the residual error has a flat spectrum. Since the wavelet coefficients in all sub-bands (except LL) can be considered as the edges of objects in the original image, the sub-bands have motion similar to that of the original image. Hence, no change is necessary in P-frame encoding.

The standard MPEG quantization matrix for I-frames is designed for natural images. A more suitable quantization matrix for wavelet coefficients, chosen according to their frequency characteristics, is given in [25] (Figure 6). If one uses a quantization matrix other than the MPEG default, the matrix is transmitted at the picture level and no other change is necessary in the MPEG core. Also in [25], it is proposed to change the matrix scanning method to achieve a more efficient output bit-rate. It is shown, however, that such a change to the MPEG core gives negligible improvement over the default MPEG zigzag scanning. Therefore, we only adopted their quantization matrices, a modification that does not require any change in the MPEG core.

IV. SIMULATION RESULTS

We streamed two QCIF video sequences (Table 1) over our simulated P2P network and compared the performance of the proposed method with those of two existing MDC schemes: IF-MDVC [23] and WBMDVC [24]. The upload bandwidth of each node is a sample of U(300, 600). Each node has an availability probability sampled from U(0, 1), and its download bandwidth is a sample of U(300, 800), where U(X, Y) denotes the uniform distribution on [X, Y]. IPROMISE [27,28] is selected as the multi-sender algorithm.

TABLE 1
VIDEO CLIPS STREAMED OVER THE NETWORK

Sequence | Motion characteristic | # of frames
Carphone | High motion | 382 (only the first 300 frames are used)
Suzie | Normal motion | 150

Figure 9 depicts the SNR at the receiver for the Carphone clip with 12 senders. MPEG with GOP of is used for video compression. Replacing a failed sender with a new one takes about 500 ms [27]. It is observed that the proposed method consistently provides better (or equal) quality of service at the receiver. Figure 8 presents the results of the same experiment for the Suzie clip.

Figure 9: SNR of the received Carphone clip in the case of 12 senders (plot of SNR (dB) versus time (s); curves: UUMDC, PDMD, WBMD).

In Figure 10, the average SNR at the receiver is depicted for different numbers of senders in the Carphone transmission. As the number of senders increases, the chance of finding reliable ones increases as well. The peak at 18 senders can be attributed to statistical variation. The MDC schemes designed for CDN, with completely reliable servers, are again outperformed by the proposed method.

Figure 10: Average SNR of the received Carphone clip for different numbers of senders (plot of average SNR (dB) versus number of senders; curves: PDMD, UUMDC, WBMDC).
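The node population of the simulation setup in this section (upload bandwidth from U(300, 600), download bandwidth from U(300, 800), availability from U(0, 1)) could be generated along the following lines. This Python sketch is an assumption about how such a setup might be coded, not the simulator used for the reported results; the field names, the seed and the ranking helper are hypothetical.

```python
import random


def sample_peers(count, seed=0):
    """Draw `count` peers with the uniform distributions given in the text."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    peers = []
    for index in range(count):
        peers.append({
            "id": index,
            "upload": rng.uniform(300, 600),
            "download": rng.uniform(300, 800),
            "availability": rng.uniform(0.0, 1.0),
        })
    return peers


def rank_by_availability(peers):
    """Most available (most reliable) peers first."""
    return sorted(peers, key=lambda p: p["availability"], reverse=True)


senders = rank_by_availability(sample_peers(12))
for peer in senders[:3]:
    print(peer["id"], round(peer["availability"], 2), round(peer["upload"]))
```

Ranking the sampled peers by availability gives the order in which the more important descriptions would be handed out.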

Figure 8: SNR of the received Suzie clip in the case of 12 senders (plot versus time (s); curves: UUMD, PDMD, WBMD).

V. CONCLUSION

We developed a wavelet-based MDC scheme suited for media streaming over peer-to-peer networks. By combining the desirable properties of previous MDC and HLC schemes, the proposed method provides the receiver with a higher quality video. In a wavelet transform of a natural image, the LL block is the most important part. If this part is lost in transmission, the quality is significantly degraded. That is why the proposed method does not perform well when none of the senders has a high availability probability. Overcoming this limitation is the focus of our future research.

ACKNOWLEDGMENT

The authors would like to thank the Advanced Communications Research Institute (ACRI) and the Iran Telecommunication Research Center (ITRC) for their support of this research activity.

REFERENCES

[1] E.K. Lua, J. Crowcroft, M. Pias, et al., “A survey and comparison of Peer-to-Peer overlay network scheme”, IEEE Communication Survey and Tutorial, Vol. 7, No. 2, 2nd Quarter, pp. 72-93, 2005.
[2] S. Androutsellis, D. Spinellis, “A survey of Peer-to-Peer content distributed technologies”, ACM Computing Survey, pp. 335-371, Dec. 2004.
[3] “Napster”, http://www.napster.com, 2002.
[4] “Gnutella Protocol V0.4”, http://www.clip2.com.
[5] I. Stoica, R. Morris, D. Karger, M. Kaashoek, H. Balakrishnan, “Chord: A scalable peer-to-peer lookup service for internet applications”, IEEE/ACM Transaction on Networking, Vol. 11, Issue 1, pp. 17-32, Feb. 2003.
[6] D. Hughes, J. Walkerdine, G. Coulson, S. Gibson, “Peer-to-Peer: is deviant behavior the norm on P2P file-sharing networks”, IEEE Distributed Systems Online, Vol. 7, Issue 2, Feb. 2006.
[7] Y. Guo, K. Suh, J. Kurose, D. Towsley, “A Peer-to-Peer on-demand streaming and its performance evaluation”, Proceeding of International Conference on Multimedia and Expo, Vol. 2, pp. 649-52, Jul. 2003.
[8] Z. Xiang, Q. Zhang, W. Zhu, Z. Zhang, Y.Q. Zhang, “Peer-to-Peer based multimedia distribution service”, IEEE Transaction on Multimedia, Vol. 6, No. 2, Apr. 2004.
[9] X. Jiang, Y. Dong, B. Bhargava, “GNUSTREAM: A P2P media streaming prototype”, Proceedings of ICME’03, Vol. 2, pp. 325-8, Jul. 2003.
[10] S. Kulkami, J. Markham, “Split and merge multicast: live media streaming with application level multicast”, IEEE International Conference on Communication, Vol. 2, pp. 1292-98, May 2005.
[11] S. Itaya, T. Enokido, M. Takizawa, “A scalable multimedia streaming based-on multi-source streaming concept”, Proceedings of 11th International Conference on Parallel and Distributed Systems, Vol. 1, pp. 15-21, Jul. 2005.
[12] S. Itaya, T. Enokido, M. Takizawa, “A high performance multimedia streaming model on multi-source streaming approach in Peer-to-Peer networks”, 19th International Conference on Advanced Information Networking and Applications, Vol. 1, pp. 27-32, Mar. 2005.
[13] Y. Shen, Z. Liu, S.S. Panwar, K.W. Ross, Y. Wang, “Streaming layered encoded video using peers”, IEEE International Conference on Multimedia and Expo, pp. 100-4, Jul. 2005.
[14] M. Zink, “P2P streaming using hierarchically encoded layered video”, Technical report TR-KOM-2003, Darmstadt University of Technology, Jan. 2003.
[15] J. Lee, B.W. Dickinson, “Hierarchical video indexing and retrieval for sub-band coded video”, IEEE Transaction on Circuits and Systems for Video Technology, Vol. 10, Issue 5, pp. 824-9, Aug. 2000.
[16] M. Zink, A. Mauthe, “P2P streaming using multiple description coded video”, Proceedings of 30th Euro-micro Conference, pp. 240-247, 2004.
[17] X. Xu, Y. Wang, S.S. Panwar, K.W. Ross, “A Peer-to-Peer video on demand system using multiple description coding and server diversity”, IEEE International Conference on Image Processing (ICIP), Vol. 3, pp. 1759-1762, Oct. 2004.
[18] J. Apostolopoulos, T. Wong, W. Tan, S. Wee, “On multiple description streaming with content delivery network”, Proceedings of INFOCOM, Vol. 3, pp. 1736-45, Jun. 2002.
[19] R. Bhagwan, S. Savage, G.M. Voelker, “Understanding Availability”, Proceeding of the 2nd International Workshop on Peer-to-Peer Systems, Berkeley, CA, Feb. 2003.
[20] V.K. Goyal, “Multiple description coding: Compression meets the network”, IEEE Signal Processing Magazine, Vol. 18, Issue 5, pp. 74-93, Sep. 2001.
[21] T. Nguyen, A. Zakhor, “Distributed video streaming with forward error correction”, in 11th International Packet Video Workshop (PV2002), Pittsburgh, USA, 2002.
[22] S. Khan, R. Schollmeier, E. Steinbach, “A performance comparison of multiple description video streaming in peer-to-peer and content delivery networks”, Proceedings of IEEE International Conference on Multimedia and Expo, Vol. 1, pp. 503-506, Jun. 27-30, 2004.
[23] N. Franchi, M. Fumagalli, R. Lancini, S. Tubaro, “Multiple description video coding for scalable and robust transmission over IP”, IEEE Transaction on Circuits and Systems for Video Technology, Vol. 15, No. 3, pp. 321-334, Mar. 2005.
[24] J. Kim, R. Mersereau, Y. Altunbasak, “Distributed video streaming using multiple description coding and unequal error protection”, IEEE Transaction on Image Processing, Vol. 14, No. 7, pp. 849-861, Jul. 2005.
[25] P. Chang, T. Lu, “A scalable video compression technique based on wavelet transform and MPEG coding”, IEEE Transaction on Consumer Electronic, Vol. 45, Issue 3, pp. 788-93, Aug. 1999.
[26] S.G. Mallat, “A theory of multiresolution signal decomposition: The wavelet representation”, IEEE Transaction on Pattern Analysis and Machine Intelligence, Vol. 11, No. 7, pp. 674-93, Jul. 1989.
[27] M.H. Firooz, K. Ronasi, M.R. Pakravan, A.R. Nasiri Avanaki, “IPROMISE: Reliable Multi-Sender Algorithm for Peer-to-Peer Networks”, Proceeding of IEEE 2nd International Conference on Communication System Software and Middleware (COMSWARE07), Bangalore, India, Jan. 7-12, 2007.
[28] M.H. Firooz, K. Ronasi, M.R. Pakravan, A.R. Nasiri-Avanaki, “A Fast and Reliable Multi-Sender Algorithm for Peer-to-Peer Networks”, submitted to IEEE/ACM Transaction on Networking, Feb. 2007.