33rd Annual International Conference of the IEEE EMBS Boston, Massachusetts USA, August 30 - September 3, 2011

Matrix Completion Based ECG Compression

Luisa F. Polania, Rafael E. Carrillo, Manuel Blanco-Velasco and Kenneth E. Barner

Abstract— An innovative electrocardiogram (ECG) compression algorithm is presented in this paper. The proposed method is based on matrix completion, a recent paradigm in signal processing that seeks to recover a low-rank matrix from a small number of observed entries. The low-rank matrix is obtained via period normalization of the ECG records. Using matrix completion, the ECG data matrix is recovered from a small number of entries, thereby yielding high compression ratios comparable to those obtained by existing compression techniques. The proposed scheme offers a low-complexity encoder, good tolerance to quantization noise, and good reconstruction quality.

L. F. Polania, R. E. Carrillo and K. E. Barner are with the Dept. of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716 USA. E-mail: [email protected], [email protected], [email protected]. M. Blanco-Velasco is with the Dept. Teoría de la Señal y Comunicaciones, Universidad de Alcalá, Madrid, Spain. E-mail: [email protected].

I. INTRODUCTION

The ECG is an important physiological signal for cardiac disease diagnostics. With the increasing use of modern electrocardiogram monitoring devices, which generate vast amounts of data requiring large storage capacity, ECG compression becomes mandatory to efficiently store and retrieve these data from medical databases. ECG compression techniques can be classified into three categories [1]: direct methods, transform methods, and other compression methods. In the first category, the ECG samples are processed directly, exploiting the redundancy among them. In the second category, wavelet transform-based methods play an important role due to their easy implementation and efficiency. Recent works in this area are characterized by hierarchical tree structures, such as the embedded zero-tree wavelet (EZW) [2] and set partitioning in hierarchical trees (SPIHT) [3] coders, which make use of the self-similarity of the wavelet transform across scales within a hierarchically decomposed wavelet tree.

An interesting line of research focuses on transforming the original one-dimensional ECG waveforms into two-dimensional information, followed by a processing stage using image processing tools. The idea of these methods is to exploit both intra-beat and inter-beat correlations. For example, Lee and Buckley [4] applied the DCT to an ECG data matrix composed of regular heartbeats, and Bilgin et al. [5] applied JPEG2000 compression to a similarly constructed matrix. Recently, we proposed in [6] a compressed sensing (CS) based ECG compression framework that utilizes distributed compressed sensing (see [7] and references therein) to exploit the inter-beat correlation structure. Following this line of thought, we propose here a two-dimensional ECG compression scheme based on matrix completion that exploits both intra-beat and inter-beat correlations.


Matrix completion refers to the problem of reconstructing a low-rank matrix from a small set of observed entries, possibly corrupted by noise. Directly solving the rank minimization problem is NP-hard. However, Candès and Recht [8] extended the CS ideas (recovery of sparse signals) to the completion of low-rank matrices, proposing a convex relaxation of the NP-hard problem that replaces the rank function with the nuclear norm of the matrix. Recently, many authors have proposed efficient algorithms for solving the low-rank matrix completion problem, such as the Singular Value Thresholding [9] and Fixed Point Continuation (FPC) [10] algorithms.

The main assumption of the matrix completion problem is that, to be recovered, the matrix has to be low-rank. To meet this requirement, the proposed method starts by forming a low-rank matrix in which each column corresponds to a length-normalized ECG cycle. In the subsequent step, the sampling process forms an observed set by drawing entries uniformly at random from the formed matrix. The decoding process is performed using matrix completion followed by period de-normalization. The proposed scheme offers a low-complexity encoder and good tolerance to quantization noise. The performance of the proposed algorithm, in terms of reconstructed signal quality and compression ratio, is evaluated using the MIT-BIH Arrhythmia Database.

II. REVIEW OF MATRIX COMPLETION

Matrix completion refers to the problem of reconstructing a low-rank matrix M from a small set of observed entries, possibly corrupted by noise. A full-rank matrix of dimensions n × n has n^2 degrees of freedom and, therefore, it is not possible to estimate all of its values from a small subset of its entries. However, when the matrix has low rank r, the number of degrees of freedom is r(2n − r), which is much smaller than n^2. Then, in the absence of noise, it is possible to recover a low-rank matrix by solving the following optimization problem:

    min rank(X)   s.t.   X_ij = M_ij  for (i, j) ∈ Ω,        (1)

where X ∈ R^{m×n} is the variable matrix, rank(X) is the rank of X, and Ω is the set of indices of the sampled entries. However, this optimization problem is NP-hard and cannot, in general, be solved in polynomial time [8].
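To make the degrees-of-freedom argument and the observation model in (1) concrete, the following Python sketch (our own illustration, not part of the paper) builds a rank-r matrix from two thin factors, so that it is described by r(2n − r) parameters rather than n^2, and reveals only the entries indexed by a random set Ω.

```python
import numpy as np

# Illustrative sketch (not from the paper): form a rank-r matrix and
# observe a random subset of its entries, as in problem (1).
rng = np.random.default_rng(0)
n, r = 100, 5

# A rank-r n-by-n matrix M = L @ R is described by r*(2n - r) parameters,
# far fewer than the n**2 entries it contains.
L = rng.standard_normal((n, r))
R = rng.standard_normal((r, n))
M = L @ R
assert np.linalg.matrix_rank(M) == r

# Observation set Omega: reveal about 30% of the entries uniformly at random.
p = int(0.3 * n * n)
flat_idx = rng.choice(n * n, size=p, replace=False)
Omega = np.unravel_index(flat_idx, (n, n))   # (row indices, column indices)
observed = M[Omega]                          # the known entries M_ij, (i, j) in Omega

print(f"rank {r}, degrees of freedom {r * (2 * n - r)}, "
      f"observed {p} of {n * n} entries")
```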


Fig. 1. Block diagram of the proposed method. (a) Encoder. (b) Decoder.

Motivated by compressed sensing [11], where minimizing the ℓ1 norm is the tightest convex relaxation of minimizing the ℓ0 norm, the following optimization problem is proposed instead:

    min ‖X‖_*   s.t.   X_ij = M_ij  for (i, j) ∈ Ω,        (2)

where ‖X‖_* denotes the nuclear norm, the sum of the singular values of X. Adopting the compressed sensing analogy, the rank of a matrix corresponds to the ℓ0 norm of a vector, and the nuclear norm to the ℓ1 norm [8]. The nuclear norm is the best convex approximation of the rank function over the unit ball of matrices. Problem (2) is a special case of the nuclear norm minimization problem

    min ‖X‖_*   s.t.   A(X) = b,        (3)

where A : R^{p×q} → R^m is a linear map and b ∈ R^m. In the presence of noise, the constraint A(X) = b must be relaxed, resulting in either the problem

    min ‖X‖_*   s.t.   ‖A(X) − b‖_2^2 ≤ θ,        (4)

or its Lagrangian form

    min λ‖X‖_* + (1/2)‖A(X) − b‖_2^2,        (5)

where θ and λ are algorithmic parameters [10].
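For small problem sizes, the relaxation (2) can be handed directly to a general-purpose convex solver. The sketch below is our own illustration using the CVXPY package; it is not the solver employed in the paper, which relies on the fixed point continuation algorithm of [10] (see the decoder design in Section III).

```python
import cvxpy as cp
import numpy as np

# Hedged sketch: solve problem (2) with a generic convex solver (CVXPY).
# The paper itself uses the FPC algorithm of [10]; this is only a
# small-scale illustration of the nuclear norm relaxation.
rng = np.random.default_rng(1)
n, r = 30, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r matrix
mask = (rng.random((n, n)) < 0.5).astype(float)  # Omega: ~50% observed entries

X = cp.Variable((n, n))
objective = cp.Minimize(cp.normNuc(X))            # nuclear norm ||X||_*
constraints = [cp.multiply(mask, X - M) == 0]     # X_ij = M_ij on Omega
cp.Problem(objective, constraints).solve()

print("relative recovery error:",
      np.linalg.norm(X.value - M) / np.linalg.norm(M))
```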

III. COMPRESSION SCHEME

In this section, we present the proposed compression scheme. The block diagram of the proposed approach is presented in Fig. 1, where Fig. 1(a) depicts the encoder and Fig. 1(b) the decoder. Implementation details of the encoder and decoder are described below.

A. Encoder design

1) Period Normalization: The main assumption of the matrix completion problem is that the matrix to be recovered is low-rank. It is natural to expect that a matrix whose columns or rows are highly correlated has low rank. Motivated by this, we introduce a period normalization step in order to exploit the quasi-periodic nature of the ECG signal, reflected in the high correlation between samples of adjacent beats. The peaks of the QRS waves are detected first to identify each heartbeat. Since each ECG period can have a different duration, we normalize all periods to the same length using cubic spline interpolation. Let x = [x(0) x(1) ... x(N0 − 1)] denote an ECG cycle. The normalized cycle, denoted as xn = [xn(0) xn(1) ... xn(Nn − 1)], is computed as

    xn(m) = x̂(t*),        (6)

where x̂(t*) is an interpolated version of the samples and

    t* = (m · N0) / Nn,        (7)

where N0 and Nn are the original and normalized period lengths, respectively. Once the heartbeats have been normalized, they are organized column-wise in a matrix. Apart from period normalization, there are alternative length-equalization methods, such as zero-padding, which pads the short segments with zeros, or periodic padding, which pads the short segments with a periodic extension. The largest 70% of the singular values of the matrices constructed with these three length-equalization techniques are shown in Fig. 2 on a semi-logarithmic scale. Although all three methods offer approximately low-rank representations, the singular values fall off more rapidly in the case of period normalization. These experiments were carried out on a single-lead ECG extracted from record 117 of the MIT-BIH Arrhythmia Database. When reconstructing the original signal, the decoder requires the original periods as side information; therefore, these periods are added to the header of the compressed file.

Fig. 2. The largest 70% of the singular values (magnitude versus index, semi-logarithmic scale) of the matrices obtained with (a) period normalization, (b) zero-padding, and (c) periodic extension.

2) Sampling and Optimal Quantization: The ECG data are organized column-wise in an Nn × Q matrix, denoted as X, where Q is the number of adjacent heartbeats encoded and Nn is the normalized length of each heartbeat. To reduce the dimensionality of the ECG data, we take p samples of the formed ECG matrix uniformly at random. We denote the sample vector as

    b = A(X),        (8)

where A represents the random sampling operator from R^{Nn×Q} to R^p.
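To make the encoder steps concrete, the following sketch (our own illustrative code; the helper names and synthetic beats are ours, and QRS detection and quantization are omitted) resamples each detected beat to length Nn with a cubic spline as in (6)–(7), stacks the normalized beats column-wise into X, and draws p entries uniformly at random as in (8).

```python
import numpy as np
from scipy.interpolate import CubicSpline

def normalize_beat(beat, Nn):
    """Resample one ECG cycle of length N0 to length Nn, following (6)-(7)."""
    N0 = len(beat)
    spline = CubicSpline(np.arange(N0), beat)   # x_hat(t), interpolated beat
    t_star = np.arange(Nn) * N0 / Nn            # t* = m * N0 / Nn
    return spline(t_star)

def encode(beats, Nn, sampling_ratio=0.3, rng=None):
    """Form the Nn x Q matrix X and draw p random entries, as in (8).

    `beats` is a list of 1-D arrays, one per detected heartbeat; the
    original period lengths are kept as side information for the decoder.
    """
    if rng is None:
        rng = np.random.default_rng()
    periods = np.array([len(b) for b in beats])
    X = np.column_stack([normalize_beat(b, Nn) for b in beats])

    p = int(sampling_ratio * X.size)
    flat_idx = rng.choice(X.size, size=p, replace=False)
    Omega = np.unravel_index(flat_idx, X.shape)
    b = X[Omega]                                 # sample vector b = A(X)
    return b, Omega, periods

# Example with synthetic "beats" of varying length:
beats = [np.sin(np.linspace(0, 2 * np.pi, n)) for n in (280, 300, 310, 295)]
b, Omega, periods = encode(beats, Nn=300, rng=np.random.default_rng(0))
print(b.shape, periods)
```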


Although the length of the QRS complex is small compared to the length of the entire heartbeat, its recovery at the decoder is important for diagnostic purposes. Therefore, we force the algorithm to use 30% of the total number of samples for sampling the QRS regions. To encode each sample, we utilize an optimal scalar quantizer designed with the Lloyd-Max algorithm with Bs bits per sample. Thus, the total number of bits used by the proposed method is p·Bs + Q·Bp, where Bp is the number of bits used to encode the period of each heartbeat.
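The Lloyd-Max codebook can be fitted to the empirical distribution of the measurements by the classical alternation between threshold and centroid updates. The sketch below is our own minimal implementation, not code from the paper.

```python
import numpy as np

def lloyd_max(samples, bits, iters=100):
    """Fit a 2**bits-level scalar quantizer to `samples` (Lloyd's algorithm)."""
    levels = 2 ** bits
    # Start from uniformly spaced reconstruction levels over the data range.
    codebook = np.linspace(samples.min(), samples.max(), levels)
    for _ in range(iters):
        # Decision thresholds are the midpoints between adjacent levels.
        thresholds = 0.5 * (codebook[1:] + codebook[:-1])
        cell = np.digitize(samples, thresholds)
        # Each level moves to the centroid (mean) of the samples it represents.
        for k in range(levels):
            members = samples[cell == k]
            if members.size:
                codebook[k] = members.mean()
    return codebook

# Example: quantize a measurement vector with Bs = 8 bits.
rng = np.random.default_rng(0)
b = rng.standard_normal(5000)                   # stand-in for the sampled entries
codebook = lloyd_max(b, bits=8)
cell = np.digitize(b, 0.5 * (codebook[1:] + codebook[:-1]))
b_quantized = codebook[cell]
print("quantization SNR: %.1f dB"
      % (10 * np.log10(np.sum(b ** 2) / np.sum((b - b_quantized) ** 2))))
```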

B. Decoder design

Although the nuclear norm minimization problem can be cast as a semidefinite program, it can be computationally expensive to solve when the matrices are large. In this paper, we use the algorithm proposed by Ma et al. in [10], which applies a continuation (homotopy) technique to accelerate the convergence of the fixed point iteration

    Y^k = X^k − τ g(X^k),
    X^{k+1} = S_{τμ}(Y^k),

where g(X^k) = A*(A(X^k) − b), A* being the adjoint operator of A, and S_ν(Y) is a shrinkage operator defined as S_ν(Y) = U Diag(σ̄) V^T, where σ̄ = max{σ − ν, 0} and U Diag(σ) V^T is the singular value decomposition of Y. The fixed point continuation iterative scheme offers robustness and good recoverability; the entire algorithm is detailed in [10]. The original heartbeat periods can be recovered by using the same transformation as in (6).
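A bare-bones version of this iteration is sketched below under our own simplifying assumptions (fixed τ and μ, no continuation schedule or stopping rule, both of which [10] does include): the gradient step acts only on the observed entries and the shrinkage step soft-thresholds the singular values.

```python
import numpy as np

def shrink(Y, nu):
    """Singular value shrinkage S_nu(Y) = U Diag(max(sigma - nu, 0)) V^T."""
    U, sigma, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(sigma - nu, 0.0)) @ Vt

def fpc_decode(b, Omega, shape, mu=0.01, tau=1.0, iters=300):
    """Simplified fixed point iteration X <- S_{tau*mu}(X - tau * g(X)).

    Omega is a tuple of (row, column) index arrays; the continuation
    strategy and stopping criteria of [10] are omitted for brevity.
    """
    X = np.zeros(shape)
    for _ in range(iters):
        G = np.zeros(shape)
        G[Omega] = X[Omega] - b        # g(X) = A*(A(X) - b)
        X = shrink(X - tau * G, tau * mu)
    return X

# Example: recover a synthetic low-rank matrix from 30% of its entries.
rng = np.random.default_rng(0)
n, r = 100, 5
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
flat_idx = rng.choice(n * n, size=int(0.3 * n * n), replace=False)
Omega = np.unravel_index(flat_idx, (n, n))
X_hat = fpc_decode(M[Omega], Omega, (n, n))
print("relative error:", np.linalg.norm(X_hat - M) / np.linalg.norm(M))
```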

IV. EXPERIMENTAL RESULTS

This section describes experiments that evaluate the performance of the proposed scheme. The ECG data used in the experiments are from the MIT-BIH Arrhythmia Database, sampled at 360 Hz with a resolution of 11 bits/sample. Let x and x̂ be the N-dimensional vectors representing the original and reconstructed signals, respectively. The quality of the reconstructed signals is evaluated via the percent root-mean-square difference (PRD), defined as

    PRD = (‖x − x̂‖ / ‖x‖) · 100,

where ‖·‖ denotes the Euclidean norm. The compression performance is evaluated using the compression ratio (CR), defined as

    CR = (11 · N) / (p · Bs + Q · Bp),

where N is the length of the input signal, 11 is the number of bits used to encode each original sample, Bs is the number of bits representing each measurement, p is the number of measurements, Q is the number of ECG cycles, and Bp is the number of bits representing each period.
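Both figures of merit follow directly from their definitions; the helper below is our own restatement of the formulas (variable names are ours).

```python
import numpy as np

def prd(x, x_hat):
    """Percent root-mean-square difference between original and reconstruction."""
    return 100.0 * np.linalg.norm(x - x_hat) / np.linalg.norm(x)

def compression_ratio(N, p, Q, Bs=8, Bp=8, bits_per_sample=11):
    """CR = 11*N / (p*Bs + Q*Bp) for 11-bit MIT-BIH samples."""
    return bits_per_sample * N / (p * Bs + Q * Bp)

# Example: 10 s of ECG at 360 Hz, 10 beats, 30% of the matrix entries kept.
N, Q, Nn = 3600, 10, 300
p = int(0.3 * Nn * Q)
print("CR = %.1f" % compression_ratio(N, p, Q))

rng = np.random.default_rng(0)
x = rng.standard_normal(N)
print("PRD of a slightly perturbed copy: %.2f"
      % prd(x, x + 0.01 * rng.standard_normal(N)))
```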

In the first experiment, we compare the performance of the proposed algorithm with conventional SPIHT [3] for records 100, 117 and 119. For each experiment, 30 repetitions are averaged. We use eight bits for the quantization of the measurements, Bs = 8, and eight bits for the heartbeat periods, Bp = 8. The dimensions of the resulting ECG matrices depend on the number of detected periods and their average length. Figure 3 shows the results of the first experiment. An overall evaluation of the records reveals that if the target CR is high, the proposed method is preferred, since in this case it significantly outperforms SPIHT. The heartbeats in records 117 and 100 are very regular and, therefore, a good approximation of the low-rank matrix is obtained after the period normalization step. The algorithm also shows good performance for signals with widely varying periods, such as record 119.

The proposed algorithm is also compared to other ECG coders through their reported performance in the literature. In Table I, we show PRD comparisons of different coding algorithms. As the results show, the proposed scheme exhibits better performance than well-known methods, such as those based on wavelet transforms [12], wavelet packets [2], [13], or the DCT transform [4]. For high compression ratios, the introduced method also outperforms algorithms such as SPIHT, according to our first experiment. When compared to JPEG2000 [5], the matrix completion based ECG compression method exhibits lower compression performance; conversely, it offers a less complex and less computationally demanding encoder. The complexity of the encoder is dominated by the cost of the period normalization stage, assuming that fast sampling operators are available. If L denotes the cost of the period normalization for a single heartbeat, the computational cost of the encoder is O(QL).

A visual study of the error signal is also considered in this work. The waveforms of record 117 produced by the proposed compression scheme are shown in Fig. 4. The reconstructed signal remains close to the original signal and the error is evenly distributed along the horizontal axis. This implies that the proposed method performs well locally and does not incorporate outliers in the reconstruction.


Fig. 3. Comparison of the proposed method with SPIHT (CR versus PRD). Left: record 117; middle: record 100; right: record 119.

TABLE I
COMPARISON OF DIFFERENT ECG COMPRESSION ALGORITHMS

Record  Algorithm            CR        PRD
117     Lu et al. [3]        8:1       1.18
117     Hilton [2]           8:1       2.6
117     Djohan et al. [12]   8:1       3.9
117     Bilgin et al. [5]    8:1       0.86
117     Proposed method      8:1       2.18
117     Bilgin et al. [5]    10:1      1.03
117     Proposed method      10:1      2.5
100     Bradie [13]          23.61:1   9.92
100     Proposed method      23.61:1   8.4
119     Bradie [13]          19.65:1   6.15
119     Proposed method      19.65:1   6.01

Fig. 4. Compression waveforms of record 117 for CR = 10, PRD = 2.51 (amplitude versus time in seconds). (a) Original signal. (b) Reconstructed signal. (c) Reconstruction error.

Like most ECG compression algorithms that operate on 2-D arrays, the proposed scheme relies on the assumption that the periods are gathered and organized in a matrix. This causes some delay, at least one 2-D array collection time. It is clear that this behavior compromises strict real-time data transmission. However, the proposed scheme is useful in cases where one is only concerned with data storage, or where minor delays are tolerable.

V. CONCLUSION

We applied matrix completion to ECG compression by constructing an approximately low-rank matrix from the ECG data. The results indicate that the presented algorithm compares favorably to other methods in the literature. The proposed scheme offers the low-complexity encoder needed by Holters and other ECG acquisition devices. Future research directions include the incorporation of an entropy coding stage and the design of an adaptive sampling method that allows further reductions in the number of samples.

REFERENCES

[1] M. Blanco-Velasco, F. Cruz-Roldán, F. López, Á. Bravo, and D. Martínez, "A low computational complexity algorithm for ECG signal compression," Medical Eng. Phys., vol. 26, no. 7, pp. 553–568, Sept. 2004.
[2] M. L. Hilton, "Wavelet and wavelet packet compression of electrocardiograms," IEEE T-BME, vol. 44, no. 5, pp. 394–402, May 1997.
[3] Z. Lu, D. Y. Kim, and W. A. Pearlman, "Wavelet compression of ECG signals by the set partitioning in hierarchical trees algorithm," IEEE T-BME, vol. 47, no. 7, pp. 849–856, July 2000.

[4] H. Lee and K. M. Buckley, "ECG data compression using cut and align beats approach and 2-D transforms," IEEE T-BME, vol. 46, no. 5, pp. 556–564, May 1999.
[5] A. Bilgin, M. W. Marcellin, and M. I. Altbach, "Compression of electrocardiogram signals using JPEG2000," IEEE Transactions on Consumer Electronics, vol. 49, no. 4, pp. 833–840, 2003.
[6] L. F. Polania, R. E. Carrillo, M. Blanco-Velasco, and K. E. Barner, "Compressed sensing based method for ECG compression," in Proc. IEEE ICASSP, Prague, Czech Republic, May 2011.
[7] D. Baron, M. Duarte, M. Wakin, S. Sarvotham, and R. Baraniuk, "Distributed compressive sensing," Preprint, Jan. 2009.
[8] E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, vol. 9, no. 6, pp. 717–772, 2009.
[9] J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM J. on Optimization, vol. 20, pp. 1956–1982, March 2010.
[10] S. Ma, D. Goldfarb, and L. Chen, "Fixed point and Bregman iterative methods for matrix rank minimization," Tech. Rep., Department of IEOR, Columbia University, 2008.
[11] E. J. Candès and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21–30, Mar. 2008.
[12] A. Djohan, T. Q. Nguyen, and W. J. Tompkins, "ECG compression using discrete symmetric wavelet transform," in Proc. IEEE 17th Annual Conference of the Engineering in Medicine and Biology Society, Sept. 1995, vol. 1, pp. 167–168.
[13] B. Bradie, "Wavelet packet-based compression of single lead ECG," IEEE T-BME, vol. 43, no. 5, pp. 493–501, May 1996.

