AUTOMATIC FISH SEGMENTATION VIA DOUBLE LOCAL THRESHOLDING FOR TRAWL-BASED UNDERWATER CAMERA SYSTEMS

Meng-Che Chuang, Jenq-Neng Hwang
Department of Electrical Engineering, University of Washington
Box 352500, Seattle, WA 98195, USA
{mengche, hwang}@uw.edu

Kresimir Williams, Richard Towler
Midwater Assessment & Conservation Engineering, Alaska Fisheries Science Center, NOAA
7600 Sand Point Way NE, Seattle, WA 98115, USA
{kresimir.williams, rick.towler}@noaa.gov

ABSTRACT

This paper describes an automatic segmentation algorithm for fish sampled by a trawl-based underwater camera system. To overcome the very low brightness contrast between fish and an underwater background whose luminance changes dynamically, the proposed algorithm applies an innovative histogram backprojection procedure to double local-thresholded images to ensure reliable segmentation of fish shape boundaries. The thresholded results are further validated by area and variance criteria to remove unwanted objects. Finally, a post-processing step is applied to refine the segmentation. Promising results, validated against expert-generated ground truth data, were obtained with the proposed algorithm.

Index Terms: fish segmentation, double local thresholding, histogram backprojection, midwater trawl.

1. INTRODUCTION

The conservation and management of fish stocks require fish abundance estimates, which often call for the use of bottom and midwater trawls. To address these needs, we developed the Cam-trawl [1], a self-contained stereo-camera system fitted to the aft end of a trawl in place of the codend (i.e., capture bag) to capture image sequences. The absence of the codend allows fish to return unharmed to the environment after being sampled (image captured). The captured image data provide much of the information that is typically collected from fish retained by traditional trawl methods. Image-based sampling for fish abundance estimates generates vast amounts of data, which present challenges for data analysis. These challenges can be reduced by using image processing algorithms for automated detection, segmentation, tracking, length, area and size measurement, and classification. Successful development of these algorithms will greatly ease one of the most onerous steps in image-based sampling.

Conventional image segmentation algorithms can be roughly classified into two major approaches according to their primary segmentation criteria [2]. One approach uses change detection as its primary criterion when images over time are available. The other constructs background information (from prior frames of the image sequence or through modeling) and obtains results by comparing incoming frames with the background. When the background is spatially homogeneous, such as an underwater background illuminated with LED light sources, and the foreground objects have distinctive gray-level intensities, thresholding can be an effective tool. For this case, Wang et al. [3] introduced multiple thresholds and reinforced the result with a boundary refinement technique. However, their method requires intensive computation and fails to detect small holes.

In underwater images, the fast attenuation and non-uniformity of LED illumination leave many foreground fish with relatively low contrast against the underwater background; moreover, fish at similar distances from the cameras can be lit very differently (see Fig. 5(a)). These factors make thresholding difficult. Furthermore, bubbles and organic debris create noise that can easily be mistaken for real fish. Walther et al. [4] proposed an object detection approach based on saliency maps. However, the boundary of a target cannot be tracked very accurately, and the thresholded object may not correspond exactly to the true shape of the target.

In this paper, an innovative algorithm using a histogram backprojection procedure is proposed. The procedure, originally developed for solving location problems via color indexing [5] and later adapted to segmentation refinement [6], is applied to double local-thresholded images to ensure reliable segmentation of fish shape boundaries. The thresholded results are further validated by area and variance criteria to remove unwanted objects. Finally, a post-processing step is applied to further refine the segmentation.

The paper is organized as follows: Section 2 briefly describes the configuration and functional modules of our Cam-trawl system for image capture. Section 3

[Fig. 2 flow chart blocks: Input Image; Gray-Level Morphological Gradient, Bounding Region Determination and Double Otsu Thresholding (grouped as Double Local Thresholding, producing M_low and M_high); Histogram Backprojection; Thresholding by Area and Variance]

Fig. 1. The Cam-trawl underwater fish imaging system.

introduces our fish object segmentation algorithm. Section 4 shows some typical simulation results and performance statistics, followed by the conclusion in Section 5.

2. CAM-TRAWL SYSTEM

The Cam-trawl represents a new class of midwater imaging sampler for studying the marine environment. With ongoing development, the Cam-trawl is poised to become a standard marine surveying tool that provides a more holistic view of the marine environment and improves the management of our marine resources. As shown in Fig. 1, the stereo-camera system consists of two high-resolution machine vision cameras, a series of LED strobes, a computer, a microcontroller, sensors, and a battery power supply. The cameras and battery pack are housed in separate 4-inch diameter titanium pressure housings, and the computer, microcontroller and sensors are placed in a single 6-inch diameter aluminum housing. The high-resolution, high-sensitivity cameras are capable of capturing 4-megapixel images at up to 15 frames per second (fps). The cameras are connected via gigabit Ethernet to a Core 2 Duo PC with software that controls the cameras' operation and stores the image data on a solid-state disk drive. A full-featured software development kit (SDK) supports the core acquisition and control routines. The PC runs a customized Linux operating system, which allows precise control over which software and services are started depending on how the system is being used.

3. FISH OBJECT SEGMENTATION

The proposed algorithm is divided into four steps, as shown in Fig. 2. The first step is double local

[Fig. 2 flow chart, final blocks: Post-Processing; Object Mask]

Fig. 2. Flow chart of the proposed algorithm.

thresholding. Next, the binarization results from the two thresholds are integrated using a technique based on histogram backprojection. After that, thresholding by area and variance removes noise and unwanted objects. Finally, a post-processing step is applied to refine the segmented object boundaries.

3.1. Double Local Thresholding

Double local thresholding first requires the rough position and size of each fish. A gray-level morphological gradient operation with a 5 × 5 structuring element [7] is applied to the input image, and the result is then processed with an adaptive thresholding technique using Otsu's method [8], generating an initial binary mask. Note that this is only a preliminary segmentation of the fish body, which needs to be carefully refined by the subsequent double local thresholding.

Next, the local region around each detected object has to be determined. The classic connected components algorithm [9] is applied first to mark the isolated local regions in the object mask. Then, each isolated region is characterized by an oriented elliptic bounding box. Finally, the local region is determined by enlarging the oriented ellipse by a factor of 1.5 along both the major and minor axes.

With these elliptic local regions in place, double local thresholding can be performed. For each region, an adaptive threshold is selected using a variant of Otsu's method. To better preserve dim fish, whose intensities are close to the background, the threshold is adjusted to

$$\tau_x = \tau - p\,\mu_L, \tag{1}$$

Fig. 3. The basic concept of histogram backprojection.

where $\tau$ is the Otsu threshold and $\mu_L$ is the mean of the lower class, i.e., the part of the histogram that the Otsu threshold $\tau$ separates off on the low side. Using Eq. (1), two thresholds are generated: the high threshold $\tau_{high}$ is obtained by setting $p = 0.7$ and the low threshold $\tau_{low}$ by setting $p = 1$. These two thresholds yield two corresponding object masks, $M_{low}$ and $M_{high}$. A 3 × 3 median filter is then applied to each of the two object masks to reduce noise before the histogram backprojection step.
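As an illustration of this thresholding step, the following Python sketch computes an Otsu threshold over the gray levels of one local region and derives the two adjusted thresholds. It is not the authors' implementation. It assumes that Eq. (1) has the form tau_x = tau - p * mu_L, uses plain Otsu rather than the unspecified variant, takes the region as a flat list of 8-bit values, and omits the 3 × 3 median filter. All function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return (tau, mu_low): the Otsu threshold and the mean of the class <= tau."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_tau, best_score, best_mu_low = 0, -1.0, 0.0
    w0 = 0    # pixel count of the lower class
    sum0 = 0  # intensity sum of the lower class
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_tau, best_score, best_mu_low = t, score, mu0
    return best_tau, best_mu_low


def double_thresholds(pixels, p_high=0.7, p_low=1.0):
    """Return (tau_high, tau_low) under the assumed form tau_x = tau - p * mu_L."""
    tau, mu_low = otsu_threshold(pixels)
    return tau - p_high * mu_low, tau - p_low * mu_low


def binarize(pixels, threshold):
    """Mark pixels brighter than the threshold as foreground (1)."""
    return [1 if v > threshold else 0 for v in pixels]


# Toy region: dark background values plus a small bright object.
region = [10] * 50 + [20] * 30 + [200] * 20
tau_high, tau_low = double_thresholds(region)
mask_high = binarize(region, tau_high)
mask_low = binarize(region, tau_low)
```

Because tau_low never exceeds tau_high in this form, every pixel in mask_high is also in mask_low, which is the relation the backprojection step relies on.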

Fig. 4. Oriented bounding boxes for length measurement.

TABLE I
PRECISION AND RECALL OF SEGMENTATION IN 74 FRAMES

Number of Targets    Precision (%)    Recall (%)
514                  74.62            78.40

TABLE II
MEAN OF ABSOLUTE ERROR OF SIGNIFICANT TARGET LENGTH

# Targets    MAE of Length
189          10.69 %

3.2. Histogram Backprojection

The two object masks are merged using the model update approach in [6]. Histogram backprojection is used to decide whether a pixel $I(x,y)$ within the enlarged oriented elliptic bounding region of an object candidate belongs to the foreground fish or to the background. First, the object masks $M_{low}$ and $M_{high}$ are used to derive two 16-bin gray-level histograms, $H_{low}(i)$ and $H_{high}(i)$, respectively. The ratio histogram for gray-level bin $i$ is defined as

$$H_R(i) = \min\left( \frac{H_{high}(i)}{H_{low}(i)},\ 1 \right). \tag{2}$$

A thresholding process is then applied to the backprojection of the ratio histogram $H_R(i)$ to obtain the final binary segmentation mask $B(x,y)$:

$$B(x,y) = \begin{cases} 1 & \text{if } H_R(I(x,y)) > \theta, \\ 0 & \text{otherwise,} \end{cases} \tag{3}$$

where $I(x,y)$ denotes the pixel value at position $(x,y)$ and $\theta$ denotes a threshold between 0 and 1. The fundamental assumption is that a foreground fish body pixel has only slightly higher histogram values in $H_{low}(i)$ than in $H_{high}(i)$, whereas an ocean background pixel has significantly higher histogram values in $H_{low}(i)$ than in $H_{high}(i)$, as illustrated in Fig. 3.

3.3. Thresholding by Area and Variance

In addition to using pixel values to refine the segmentation masks, the proposed algorithm also takes into account the area of an object and the variance of the pixel values within it. The connected components algorithm is applied to mark each isolated region in the object mask, and the area of each region is determined. Objects whose area is greater than an upper threshold or less than a lower threshold are rejected. Object candidates are also examined by calculating the variance of the pixel values within each segmented object. Since foreground objects (fish) tend to be more textured than the background or unwanted objects, the variance within a segmented fish is likely to be larger.
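The ratio-histogram rule of Section 3.2 and the area and variance check of Section 3.3 can be sketched together in Python as follows. This is an illustration under stated assumptions, not the authors' code: the region is a flat list of 8-bit values with matching 0/1 masks, empty bins of the low histogram are given a ratio of 0, the comparison with the backprojection threshold is taken as strict, and the default theta=0.5 as well as the area and variance limits are hypothetical values that the paper does not specify. All function names are hypothetical.

```python
def ratio_histogram(pixels, mask_low, mask_high, bins=16, levels=256):
    """Ratio histogram: min(H_high(i) / H_low(i), 1) for each gray-level bin i."""
    width = levels // bins
    h_low = [0] * bins
    h_high = [0] * bins
    for v, in_low, in_high in zip(pixels, mask_low, mask_high):
        b = v // width
        h_low[b] += in_low
        h_high[b] += in_high
    # Assumption: bins with no pixels in the low mask get ratio 0.
    return [min(hh / hl, 1.0) if hl > 0 else 0.0
            for hh, hl in zip(h_high, h_low)]


def backproject(pixels, ratio, theta=0.5, bins=16, levels=256):
    """Binary mask: 1 where the ratio of the pixel's bin exceeds theta."""
    width = levels // bins
    return [1 if ratio[v // width] > theta else 0 for v in pixels]


def keep_object(values, min_area, max_area, min_variance):
    """Accept a candidate by pixel count and gray-level variance."""
    area = len(values)
    if area < min_area or area > max_area:
        return False
    mean = sum(values) / area
    variance = sum((v - mean) ** 2 for v in values) / area
    # Assumption: low-variance candidates are treated as unwanted objects.
    return variance >= min_variance


# Toy region: four dark pixels, two mid-gray pixels, two bright pixels.
region = [10, 10, 10, 10, 40, 40, 200, 200]
mask_low = [1, 1, 1, 1, 1, 1, 1, 1]
mask_high = [0, 0, 0, 1, 1, 1, 1, 1]
ratio = ratio_histogram(region, mask_low, mask_high)
final_mask = backproject(region, ratio)
```

In the toy region the dark bin appears four times in the low mask but only once in the high mask, so its ratio is small and those pixels are dropped, while the brighter bins keep a ratio of 1.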

3.4. Post-Processing

Some errors, such as gulfs or peninsulas, may still exist at the boundaries of the objects refined by histogram backprojection, so cascaded morphological operations are adopted to further refine the boundaries. More specifically, a closing and an opening morphological operation with a 7 × 7 structuring element are applied to the object mask. In this way, the object boundaries are smoothed without affecting the details of the shape information, and small noise regions are removed.
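A minimal Python sketch of this cascade is given below, assuming a square structuring element, a binary mask stored as a list of rows, and windows that are clipped at the image border. The paper states only the 7 × 7 size, so the shape and border handling are assumptions, and the function names are hypothetical. A pure-Python version like this is only practical for small masks.

```python
def _morph(mask, k, op):
    """Apply op (min for erosion, max for dilation) over a k x k window."""
    h, w, r = len(mask), len(mask[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The window is clipped at the image border.
            window = [mask[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = op(window)
    return out


def erode(mask, k):
    return _morph(mask, k, min)


def dilate(mask, k):
    return _morph(mask, k, max)


def closing(mask, k):
    """Dilation followed by erosion: fills small holes and gulfs."""
    return erode(dilate(mask, k), k)


def opening(mask, k):
    """Erosion followed by dilation: removes small specks and peninsulas."""
    return dilate(erode(mask, k), k)


def post_process(mask, k=7):
    """Closing, then opening, with the same k x k square element."""
    return opening(closing(mask, k), k)


# Toy mask: a 5 x 5 object with a one-pixel hole, plus an isolated noise pixel.
toy = [[0] * 13 for _ in range(9)]
for y in range(2, 7):
    for x in range(2, 7):
        toy[y][x] = 1
toy[4][4] = 0   # hole inside the object
toy[4][10] = 1  # isolated noise pixel
cleaned = post_process(toy, k=3)
```

With the small 3 × 3 element used for the toy mask, the closing fills the hole and the opening removes the isolated pixel, leaving the 5 × 5 object intact.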

Fig. 5. (a) Cam-trawl captured images and (b) segmentation results.

4. SIMULATION RESULTS

To evaluate the proposed algorithm, three sample image sequences comprising 74 image frames were tested. The sample images are gray-level images with a resolution of 1024 × 1024. According to the hand-labeled ground truth, there are 514 targets in total to be segmented. All software was developed in Visual C++ with the OpenCV 2.1 library.

The performance of the proposed algorithm is measured in terms of precision and recall as well as the mean of absolute error (MAE) of the measured length of significant targets. The measured length of a segmented fish is defined as the Euclidean distance between head and tail. Only targets with a length greater than 100 pixels are considered (189 of the 514 targets), since larger objects have more reliable ground truth. In our simulations, the length of an object is obtained by finding the bounding box of its contour with a routine in the OpenCV library. An oriented rectangle of the OpenCV data type CvBox2D is returned for each contour, as shown in Fig. 4, and the larger of its width and height is used as the measured length.

As shown in Table I, the proposed algorithm achieves 74.62% precision and 78.40% recall on very low-contrast underwater images. The MAE of the measured length over the 189 significant targets is 10.69%, as shown in Table II. Fig. 5 shows some typical original Cam-trawl captured images and the corresponding segmentation results.
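The evaluation measures can be sketched in Python as follows. The paper does not state how targets are matched to the ground truth or how the length error is normalized, so this sketch assumes that counts of true positives, false positives and false negatives are already available and that the MAE is the mean relative length error in percent. The function names and the sample numbers are hypothetical and are not taken from the experiments.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


def measured_length(box_width, box_height):
    """Length of a target: the longer side of its oriented bounding box."""
    return max(box_width, box_height)


def length_mae_percent(measured, truth):
    """Mean absolute length error, relative to the true length, in percent."""
    errors = [abs(m - t) / t for m, t in zip(measured, truth)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical counts and lengths, for illustration only.
precision, recall = precision_recall(8, 2, 2)
lengths = [measured_length(110, 35), measured_length(40, 180)]
mae = length_mae_percent(lengths, [100, 200])
```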

5. CONCLUSION

This paper proposes an innovative algorithm using histogram backprojection, based on a double local thresholding scheme, to ensure reliable segmentation of fish shape boundaries. It achieves a 78% recall against the ground truth for the segmentation of fish in very low-contrast underwater images. The measured length of the segmented fish body has an error of about 10%, which could be greatly reduced once stereo image pairs are jointly considered in the future.

6. REFERENCES

[1] K. Williams, R. Towler and C. Wilson, “Cam-trawl: a combination trawl and stereo-camera system,” Sea Technol., Vol. 51, No. 12, Dec. 2010.
[2] D. Zhang and G. Lu, “Segmentation of moving objects in image sequence: A review,” Cir., Sys., and Sig. Proc., Vol. 20, pp. 143, 2001.
[3] L. Wang and N. H.C. Yung, “Extraction of moving objects from their background based on multiple adaptive thresholds and boundary evaluation,” Intelli. Transport. Sys., IEEE Trans. on, Vol. 11, No. 1, pp. 40–51, Mar. 2010.
[4] D. Walther, D.R. Edgington and C. Koch, “Detection and tracking of objects in underwater video,” CVPR, Jun. 2004.
[5] M. J. Swain and D. H. Ballard, “Color indexing,” Int’l Journal of Computer Vision, 7:1, pp. 11–32, 1991.
[6] C. Kim and J.-N. Hwang, “Video object extraction for object-oriented applications,” The Journal of VLSI Signal Processing, Vol. 29, No. 1–2, pp. 7–21, Aug.–Sep. 2001.
[7] G. Bradski and A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library, O'Reilly, 2008, pp. 121–123.
[8] N. Otsu, “A threshold selection method from gray-level histograms,” Sys., Man, Cyber., IEEE Trans. on, Vol. SMC-9, No. 1, pp. 62–66, Jan. 1979.
[9] R. M. Haralick and L. G. Shapiro, Computer and Robot Vision. Reading, MA: Addison-Wesley, 1992, pp. 28–48.
